	1	\input texinfo @c -*-texinfo-*-
	2	@comment %**start of header
	3	@setfilename bison.info
	4	@include version.texi
	5	@settitle Bison @value{VERSION}
	6	@setchapternewpage odd
	7
	8	@finalout
	9
	10	@c SMALL BOOK version
	11	@c This edition has been formatted so that you can format and print it in
	12	@c the smallbook format.
	13	@c @smallbook
	14
	15	@c Set following if you want to document %default-prec and %no-default-prec.
	16	@c This feature is experimental and may change in future Bison versions.
	17	@c @set defaultprec
	18
	19	@ifnotinfo
	20	@syncodeindex fn cp
	21	@syncodeindex vr cp
	22	@syncodeindex tp cp
	23	@end ifnotinfo
	24	@ifinfo
	25	@synindex fn cp
	26	@synindex vr cp
	27	@synindex tp cp
	28	@end ifinfo
	29	@comment %**end of header
	30
	31	@copying
	32
	33	This manual (@value{UPDATED}) is for GNU Bison (version
	34	@value{VERSION}), the GNU parser generator.
	35
	36	Copyright @copyright{} 1988-1993, 1995, 1998-2012 Free Software
	37	Foundation, Inc.
	38
	39	@quotation
	40	Permission is granted to copy, distribute and/or modify this document
	41	under the terms of the GNU Free Documentation License,
	42	Version 1.3 or any later version published by the Free Software
	43	Foundation; with no Invariant Sections, with the Front-Cover texts
	44	being ``A GNU Manual,'' and with the Back-Cover Texts as in
	45	(a) below. A copy of the license is included in the section entitled
	46	``GNU Free Documentation License.''
	47
	48	(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
	49	modify this GNU manual. Buying copies from the FSF
	50	supports it in developing GNU and promoting software
	51	freedom.''
	52	@end quotation
	53	@end copying
	54
	55	@dircategory Software development
	56	@direntry
	57	* bison: (bison). GNU parser generator (Yacc replacement).
	58	@end direntry
	59
	60	@titlepage
	61	@title Bison
	62	@subtitle The Yacc-compatible Parser Generator
	63	@subtitle @value{UPDATED}, Bison Version @value{VERSION}
	64
	65	@author by Charles Donnelly and Richard Stallman
	66
	67	@page
	68	@vskip 0pt plus 1filll
	69	@insertcopying
	70	@sp 2
	71	Published by the Free Software Foundation @*
	72	51 Franklin Street, Fifth Floor @*
	73	Boston, MA 02110-1301 USA @*
	74	Printed copies are available from the Free Software Foundation.@*
	75	ISBN 1-882114-44-2
	76	@sp 2
	77	Cover art by Etienne Suvasa.
	78	@end titlepage
	79
	80	@contents
	81
	82	@ifnottex
	83	@node Top
	84	@top Bison
	85	@insertcopying
	86	@end ifnottex
	87
	88	@menu
	89	* Introduction::
	90	* Conditions::
	91	* Copying:: The GNU General Public License says
	92	how you can copy and share Bison.
	93
	94	Tutorial sections:
	95	* Concepts:: Basic concepts for understanding Bison.
	96	* Examples:: Several simple explained examples of using Bison.
	97
	98	Reference sections:
	99	* Grammar File:: Writing Bison declarations and rules.
	100	* Interface:: C-language interface to the parser function @code{yyparse}.
	101	* Algorithm:: How the Bison parser works at run-time.
	102	* Error Recovery:: Writing rules for error recovery.
	103	* Context Dependency:: What to do if your language syntax is too
	104	messy for Bison to handle straightforwardly.
	105	* Debugging:: Understanding or debugging Bison parsers.
	106	* Invocation:: How to run Bison (to produce the parser implementation).
	107	* Other Languages:: Creating C++ and Java parsers.
	108	* FAQ:: Frequently Asked Questions
	109	* Table of Symbols:: All the keywords of the Bison language are explained.
	110	* Glossary:: Basic concepts are explained.
	111	* Copying This Manual:: License for copying this manual.
	112	* Bibliography:: Publications cited in this manual.
	113	* Index:: Cross-references to the text.
	114
	115	@detailmenu
	116	--- The Detailed Node Listing ---
	117
	118	The Concepts of Bison
	119
	120	* Language and Grammar:: Languages and context-free grammars,
	121	as mathematical ideas.
	122	* Grammar in Bison:: How we represent grammars for Bison's sake.
	123	* Semantic Values:: Each token or syntactic grouping can have
	124	a semantic value (the value of an integer,
	125	the name of an identifier, etc.).
	126	* Semantic Actions:: Each rule can have an action containing C code.
	127	* GLR Parsers:: Writing parsers for general context-free languages.
	128	* Locations:: Overview of location tracking.
	129	* Bison Parser:: What are Bison's input and output,
	130	how is the output used?
	131	* Stages:: Stages in writing and running Bison grammars.
	132	* Grammar Layout:: Overall structure of a Bison grammar file.
	133
	134	Writing GLR Parsers
	135
	136	* Simple GLR Parsers:: Using GLR parsers on unambiguous grammars.
	137	* Merging GLR Parses:: Using GLR parsers to resolve ambiguities.
	138	* GLR Semantic Actions:: Deferred semantic actions have special concerns.
	139	* Compiler Requirements:: GLR parsers require a modern C compiler.
	140
	141	Examples
	142
	143	* RPN Calc:: Reverse Polish notation calculator;
	144	a first example with no operator precedence.
	145	* Infix Calc:: Infix (algebraic) notation calculator.
	146	Operator precedence is introduced.
	147	* Simple Error Recovery:: Continuing after syntax errors.
	148	* Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$.
	149	* Multi-function Calc:: Calculator with memory and trig functions.
	150	It uses multiple data-types for semantic values.
	151	* Exercises:: Ideas for improving the multi-function calculator.
	152
	153	Reverse Polish Notation Calculator
	154
	155	* Rpcalc Declarations:: Prologue (declarations) for rpcalc.
	156	* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation.
	157	* Rpcalc Lexer:: The lexical analyzer.
	158	* Rpcalc Main:: The controlling function.
	159	* Rpcalc Error:: The error reporting function.
	160	* Rpcalc Generate:: Running Bison on the grammar file.
	161	* Rpcalc Compile:: Run the C compiler on the output code.
	162
	163	Grammar Rules for @code{rpcalc}
	164
	165	* Rpcalc Input::
	166	* Rpcalc Line::
	167	* Rpcalc Expr::
	168
	169	Location Tracking Calculator: @code{ltcalc}
	170
	171	* Ltcalc Declarations:: Bison and C declarations for ltcalc.
	172	* Ltcalc Rules:: Grammar rules for ltcalc, with explanations.
	173	* Ltcalc Lexer:: The lexical analyzer.
	174
	175	Multi-Function Calculator: @code{mfcalc}
	176
	177	* Mfcalc Declarations:: Bison declarations for multi-function calculator.
	178	* Mfcalc Rules:: Grammar rules for the calculator.
	179	* Mfcalc Symbol Table:: Symbol table management subroutines.
	180
	181	Bison Grammar Files
	182
	183	* Grammar Outline:: Overall layout of the grammar file.
	184	* Symbols:: Terminal and nonterminal symbols.
	185	* Rules:: How to write grammar rules.
	186	* Recursion:: Writing recursive rules.
	187	* Semantics:: Semantic values and actions.
	188	* Tracking Locations:: Locations and actions.
	189	* Named References:: Using named references in actions.
	190	* Declarations:: All kinds of Bison declarations are described here.
	191	* Multiple Parsers:: Putting more than one Bison parser in one program.
	192
	193	Outline of a Bison Grammar
	194
	195	* Prologue:: Syntax and usage of the prologue.
	196	* Prologue Alternatives:: Syntax and usage of alternatives to the prologue.
	197	* Bison Declarations:: Syntax and usage of the Bison declarations section.
	198	* Grammar Rules:: Syntax and usage of the grammar rules section.
	199	* Epilogue:: Syntax and usage of the epilogue.
	200
	201	Defining Language Semantics
	202
	203	* Value Type:: Specifying one data type for all semantic values.
	204	* Multiple Types:: Specifying several alternative data types.
	205	* Actions:: An action is the semantic definition of a grammar rule.
	206	* Action Types:: Specifying data types for actions to operate on.
	207	* Mid-Rule Actions:: Most actions go at the end of a rule.
	208	This says when, why and how to use the exceptional
	209	action in the middle of a rule.
	210
	211	Tracking Locations
	212
	213	* Location Type:: Specifying a data type for locations.
	214	* Actions and Locations:: Using locations in actions.
	215	* Location Default Action:: Defining a general way to compute locations.
	216
	217	Bison Declarations
	218
	219	* Require Decl:: Requiring a Bison version.
	220	* Token Decl:: Declaring terminal symbols.
	221	* Precedence Decl:: Declaring terminals with precedence and associativity.
	222	* Union Decl:: Declaring the set of all semantic value types.
	223	* Type Decl:: Declaring the choice of type for a nonterminal symbol.
	224	* Initial Action Decl:: Code run before parsing starts.
	225	* Destructor Decl:: Declaring how symbols are freed.
	226	* Expect Decl:: Suppressing warnings about parsing conflicts.
	227	* Start Decl:: Specifying the start symbol.
	228	* Pure Decl:: Requesting a reentrant parser.
	229	* Push Decl:: Requesting a push parser.
	230	* Decl Summary:: Table of all Bison declarations.
	231	* %define Summary:: Defining variables to adjust Bison's behavior.
	232	* %code Summary:: Inserting code into the parser source.
	233
	234	Parser C-Language Interface
	235
	236	* Parser Function:: How to call @code{yyparse} and what it returns.
	237	* Push Parser Function:: How to call @code{yypush_parse} and what it returns.
	238	* Pull Parser Function:: How to call @code{yypull_parse} and what it returns.
	239	* Parser Create Function:: How to call @code{yypstate_new} and what it returns.
	240	* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns.
	241	* Lexical:: You must supply a function @code{yylex}
	242	which reads tokens.
	243	* Error Reporting:: You must supply a function @code{yyerror}.
	244	* Action Features:: Special features for use in actions.
	245	* Internationalization:: How to let the parser speak in the user's
	246	native language.
	247
	248	The Lexical Analyzer Function @code{yylex}
	249
	250	* Calling Convention:: How @code{yyparse} calls @code{yylex}.
	251	* Token Values:: How @code{yylex} must return the semantic value
	252	of the token it has read.
	253	* Token Locations:: How @code{yylex} must return the text location
	254	(line number, etc.) of the token, if the
	255	actions want that.
	256	* Pure Calling:: How the calling convention differs in a pure parser
	257	(@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
	258
	259	The Bison Parser Algorithm
	260
	261	* Lookahead:: Parser looks one token ahead when deciding what to do.
	262	* Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
	263	* Precedence:: Operator precedence works by resolving conflicts.
	264	* Contextual Precedence:: When an operator's precedence depends on context.
	265	* Parser States:: The parser is a finite-state-machine with stack.
	266	* Reduce/Reduce:: When two rules are applicable in the same situation.
	267	* Mysterious Conflicts:: Conflicts that look unjustified.
	268	* Tuning LR:: How to tune fundamental aspects of LR-based parsing.
	269	* Generalized LR Parsing:: Parsing arbitrary context-free grammars.
	270	* Memory Management:: What happens when memory is exhausted. How to avoid it.
	271
	272	Operator Precedence
	273
	274	* Why Precedence:: An example showing why precedence is needed.
	275	* Using Precedence:: How to specify precedence in Bison grammars.
	276	* Precedence Examples:: How these features are used in the previous example.
	277	* How Precedence:: How they work.
	278
	279	Tuning LR
	280
	281	* LR Table Construction:: Choose a different construction algorithm.
	282	* Default Reductions:: Disable default reductions.
	283	* LAC:: Correct lookahead sets in the parser states.
	284	* Unreachable States:: Keep unreachable parser states for debugging.
	285
	286	Handling Context Dependencies
	287
	288	* Semantic Tokens:: Token parsing can depend on the semantic context.
	289	* Lexical Tie-ins:: Token parsing can depend on the syntactic context.
	290	* Tie-in Recovery:: Lexical tie-ins have implications for how
	291	error recovery rules must be written.
	292
	293	Debugging Your Parser
	294
	295	* Understanding:: Understanding the structure of your parser.
	296	* Tracing:: Tracing the execution of your parser.
	297
	298	Invoking Bison
	299
	300	* Bison Options:: All the options described in detail,
	301	in alphabetical order by short options.
	302	* Option Cross Key:: Alphabetical list of long options.
	303	* Yacc Library:: Yacc-compatible @code{yylex} and @code{main}.
	304
	305	Parsers Written In Other Languages
	306
	307	* C++ Parsers:: The interface to generate C++ parser classes
	308	* Java Parsers:: The interface to generate Java parser classes
	309
	310	C++ Parsers
	311
	312	* C++ Bison Interface:: Asking for C++ parser generation
	313	* C++ Semantic Values:: %union vs. C++
	314	* C++ Location Values:: The position and location classes
	315	* C++ Parser Interface:: Instantiating and running the parser
	316	* C++ Scanner Interface:: Exchanges between yylex and parse
	317	* A Complete C++ Example:: Demonstrating their use
	318
	319	A Complete C++ Example
	320
	321	* Calc++ --- C++ Calculator:: The specifications
	322	* Calc++ Parsing Driver:: An active parsing context
	323	* Calc++ Parser:: A parser class
	324	* Calc++ Scanner:: A pure C++ Flex scanner
	325	* Calc++ Top Level:: Conducting the band
	326
	327	Java Parsers
	328
	329	* Java Bison Interface:: Asking for Java parser generation
	330	* Java Semantic Values:: %type and %token vs. Java
	331	* Java Location Values:: The position and location classes
	332	* Java Parser Interface:: Instantiating and running the parser
	333	* Java Scanner Interface:: Specifying the scanner for the parser
	334	* Java Action Features:: Special features for use in actions
	335	* Java Differences:: Differences between C/C++ and Java Grammars
	336	* Java Declarations Summary:: List of Bison declarations used with Java
	337
	338	Frequently Asked Questions
	339
	340	* Memory Exhausted:: Breaking the Stack Limits
	341	* How Can I Reset the Parser:: @code{yyparse} Keeps some State
	342	* Strings are Destroyed:: @code{yylval} Loses Track of Strings
	343	* Implementing Gotos/Loops:: Control Flow in the Calculator
	344	* Multiple start-symbols:: Factoring closely related grammars
	345	* Secure? Conform?:: Is Bison POSIX safe?
	346	* I can't build Bison:: Troubleshooting
	347	* Where can I find help?:: Troubleshouting
	348	* Bug Reports:: Troublereporting
	349	* More Languages:: Parsers in C++, Java, and so on
	350	* Beta Testing:: Experimenting with development versions
	351	* Mailing Lists:: Meeting other Bison users
	352
	353	Copying This Manual
	354
	355	* Copying This Manual:: License for copying this manual.
	356
	357	@end detailmenu
	358	@end menu
	359
	360	@node Introduction
	361	@unnumbered Introduction
	362	@cindex introduction
	363
	364	@dfn{Bison} is a general-purpose parser generator that converts an
	365	annotated context-free grammar into a deterministic LR or generalized
	366	LR (GLR) parser employing LALR(1) parser tables. As an experimental
	367	feature, Bison can also generate IELR(1) or canonical LR(1) parser
	368	tables. Once you are proficient with Bison, you can use it to develop
	369	a wide range of language parsers, from those used in simple desk
	370	calculators to complex programming languages.
	371
	372	Bison is upward compatible with Yacc: all properly-written Yacc
	373	grammars ought to work with Bison with no change. Anyone familiar
	374	with Yacc should be able to use Bison with little trouble. You need
	375	to be fluent in C or C++ programming in order to use Bison or to
	376	understand this manual. Java is also supported as an experimental
	377	feature.
	378
	379	We begin with tutorial chapters that explain the basic concepts of
	380	using Bison and show several explained examples, each building on the
	381	last. If you don't know Bison or Yacc, start by reading these
	382	chapters. Reference chapters follow, which describe specific aspects
	383	of Bison in detail.
	384
	385	Bison was written originally by Robert Corbett. Richard Stallman made
	386	it Yacc-compatible. Wilfred Hansen of Carnegie Mellon University
	387	added multi-character string literals and other features. Since then,
	388	Bison has grown more robust and evolved many other new features thanks
	389	to the hard work of a long list of volunteers. For details, see the
	390	@file{THANKS} and @file{ChangeLog} files included in the Bison
	391	distribution.
	392
	393	This edition corresponds to version @value{VERSION} of Bison.
	394
	395	@node Conditions
	396	@unnumbered Conditions for Using Bison
	397
	398	The distribution terms for Bison-generated parsers permit using the
	399	parsers in nonfree programs. Before Bison version 2.2, these extra
	400	permissions applied only when Bison was generating LALR(1)
	401	parsers in C@. And before Bison version 1.24, Bison-generated
	402	parsers could be used only in programs that were free software.
	403
	404	The other GNU programming tools, such as the GNU C
	405	compiler, have never
	406	had such a requirement. They could always be used for nonfree
	407	software. The reason Bison was different was not due to a special
	408	policy decision; it resulted from applying the usual General Public
	409	License to all of the Bison source code.
	410
	411	The main output of the Bison utility---the Bison parser implementation
	412	file---contains a verbatim copy of a sizable piece of Bison, which is
	413	the code for the parser's implementation. (The actions from your
	414	grammar are inserted into this implementation at one point, but most
	415	of the rest of the implementation is not changed.) When we applied
	416	the GPL terms to the skeleton code for the parser's implementation,
	417	the effect was to restrict the use of Bison output to free software.
	418
	419	Sympathy for people who want to make software proprietary is not why we
	420	changed the terms. @strong{Software should be free.} But we
	421	concluded that limiting Bison's use to free software was doing little to
	422	encourage people to make other software free. So we decided to make the
	423	practical conditions for using Bison match the practical conditions for
	424	using the other GNU tools.
	425
	426	This exception applies when Bison is generating code for a parser.
	427	You can tell whether the exception applies to a Bison output file by
	428	inspecting the file for text beginning with ``As a special
	429	exception@dots{}''. The text spells out the exact terms of the
	430	exception.
	431
	432	@node Copying
	433	@unnumbered GNU GENERAL PUBLIC LICENSE
	434	@include gpl-3.0.texi
	435
	436	@node Concepts
	437	@chapter The Concepts of Bison
	438
	439	This chapter introduces many of the basic concepts without which the
	440	details of Bison will not make sense. If you do not already know how to
	441	use Bison or Yacc, we suggest you start by reading this chapter carefully.
	442
	443	@menu
	444	* Language and Grammar:: Languages and context-free grammars,
	445	as mathematical ideas.
	446	* Grammar in Bison:: How we represent grammars for Bison's sake.
	447	* Semantic Values:: Each token or syntactic grouping can have
	448	a semantic value (the value of an integer,
	449	the name of an identifier, etc.).
	450	* Semantic Actions:: Each rule can have an action containing C code.
	451	* GLR Parsers:: Writing parsers for general context-free languages.
	452	* Locations:: Overview of location tracking.
	453	* Bison Parser:: What are Bison's input and output,
	454	how is the output used?
	455	* Stages:: Stages in writing and running Bison grammars.
	456	* Grammar Layout:: Overall structure of a Bison grammar file.
	457	@end menu
	458
	459	@node Language and Grammar
	460	@section Languages and Context-Free Grammars
	461
	462	@cindex context-free grammar
	463	@cindex grammar, context-free
	464	In order for Bison to parse a language, it must be described by a
	465	@dfn{context-free grammar}. This means that you specify one or more
	466	@dfn{syntactic groupings} and give rules for constructing them from their
	467	parts. For example, in the C language, one kind of grouping is called an
	468	`expression'. One rule for making an expression might be, ``An expression
	469	can be made of a minus sign and another expression''. Another would be,
	470	``An expression can be an integer''. As you can see, rules are often
	471	recursive, but there must be at least one rule which leads out of the
	472	recursion.
	473
	474	@cindex BNF
	475	@cindex Backus-Naur form
	476	The most common formal system for presenting such rules for humans to read
	477	is @dfn{Backus-Naur Form} or ``BNF'', which was developed in
	478	order to specify the language Algol 60. Any grammar expressed in
	479	BNF is a context-free grammar. The input to Bison is
	480	essentially machine-readable BNF.
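
As a rough sketch, the two informal rules quoted earlier in this section
could be written in Bison's notation as follows. The names @code{exp} and
@code{INTEGER} are chosen here only for illustration.

@example
exp: INTEGER     /* an expression can be an integer */
   | '-' exp     /* a minus sign and another expression */
   ;
@end example

@noindent
The first alternative is the one that leads out of the recursion.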
	481
	482	@cindex LALR grammars
	483	@cindex IELR grammars
	484	@cindex LR grammars
	485	There are various important subclasses of context-free grammars. Although
	486	it can handle almost all context-free grammars, Bison is optimized for what
	487	are called LR(1) grammars. In brief, in these grammars, it must be possible
	488	to tell how to parse any portion of an input string with just a single token
	489	of lookahead. For historical reasons, Bison by default is limited by the
	490	additional restrictions of LALR(1), which is hard to explain simply.
	491	@xref{Mysterious Conflicts}, for more information on this. As an
	492	experimental feature, you can escape these additional restrictions by
	493	requesting IELR(1) or canonical LR(1) parser tables. @xref{LR Table
	494	Construction}, to learn how.
	495
	496	@cindex GLR parsing
	497	@cindex generalized LR (GLR) parsing
	498	@cindex ambiguous grammars
	499	@cindex nondeterministic parsing
	500
	501	Parsers for LR(1) grammars are @dfn{deterministic}, meaning
	502	roughly that the next grammar rule to apply at any point in the input is
	503	uniquely determined by the preceding input and a fixed, finite portion
	504	(called a @dfn{lookahead}) of the remaining input. A context-free
	505	grammar can be @dfn{ambiguous}, meaning that there are multiple ways to
	506	apply the grammar rules to get the same inputs. Even unambiguous
	507	grammars can be @dfn{nondeterministic}, meaning that no fixed
	508	lookahead always suffices to determine the next grammar rule to apply.
	509	With the proper declarations, Bison is also able to parse these more
	510	general context-free grammars, using a technique known as GLR
	511	parsing (for Generalized LR). Bison's GLR parsers
	512	are able to handle any context-free grammar for which the number of
	513	possible parses of any given string is finite.
	514
	515	@cindex symbols (abstract)
	516	@cindex token
	517	@cindex syntactic grouping
	518	@cindex grouping, syntactic
	519	In the formal grammatical rules for a language, each kind of syntactic
	520	unit or grouping is named by a @dfn{symbol}. Those which are built by
	521	grouping smaller constructs according to grammatical rules are called
	522	@dfn{nonterminal symbols}; those which can't be subdivided are called
	523	@dfn{terminal symbols} or @dfn{token types}. We call a piece of input
	524	corresponding to a single terminal symbol a @dfn{token}, and a piece
	525	corresponding to a single nonterminal symbol a @dfn{grouping}.
	526
	527	We can use the C language as an example of what symbols, terminal and
	528	nonterminal, mean. The tokens of C are identifiers, constants (numeric
	529	and string), and the various keywords, arithmetic operators and
	530	punctuation marks. So the terminal symbols of a grammar for C include
	531	`identifier', `number', `string', plus one symbol for each keyword,
	532	operator or punctuation mark: `if', `return', `const', `static', `int',
	533	`char', `plus-sign', `open-brace', `close-brace', `comma' and many more.
	534	(These tokens can be subdivided into characters, but that is a matter of
	535	lexicography, not grammar.)
	536
	537	Here is a simple C function subdivided into tokens:
	538
	539	@example
	540	int             /* @r{keyword `int'} */
	541	square (int x)  /* @r{identifier, open-paren, keyword `int',}
	542	                   @r{identifier, close-paren} */
	543	@{               /* @r{open-brace} */
	544	  return x * x; /* @r{keyword `return', identifier, asterisk,}
	545	                   @r{identifier, semicolon} */
	546	@}               /* @r{close-brace} */
	547	@end example
	548
	549	The syntactic groupings of C include the expression, the statement, the
	550	declaration, and the function definition. These are represented in the
	551	grammar of C by nonterminal symbols `expression', `statement',
	552	`declaration' and `function definition'. The full grammar uses dozens of
	553	additional language constructs, each with its own nonterminal symbol, in
	554	order to express the meanings of these four. The example above is a
	555	function definition; it contains one declaration, and one statement. In
	556	the statement, each @samp{x} is an expression and so is @samp{x * x}.
	557
	558	Each nonterminal symbol must have grammatical rules showing how it is made
	559	out of simpler constructs. For example, one kind of C statement is the
	560	@code{return} statement; this would be described with a grammar rule which
	561	reads informally as follows:
	562
	563	@quotation
	564	A `statement' can be made of a `return' keyword, an `expression' and a
	565	`semicolon'.
	566	@end quotation
	567
	568	@noindent
	569	There would be many other rules for `statement', one for each kind of
	570	statement in C.
	571
	572	@cindex start symbol
	573	One nonterminal symbol must be distinguished as the special one which
	574	defines a complete utterance in the language. It is called the @dfn{start
	575	symbol}. In a compiler, this means a complete input program. In the C
	576	language, the nonterminal symbol `sequence of definitions and declarations'
	577	plays this role.
	578
	579	For example, @samp{1 + 2} is a valid C expression---a valid part of a C
	580	program---but it is not valid as an @emph{entire} C program. In the
	581	context-free grammar of C, this follows from the fact that `expression' is
	582	not the start symbol.
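
In a Bison grammar file, the start symbol can be named with a
@code{%start} declaration; without one, Bison takes the nonterminal of
the first rule as the start symbol. The fragment below is only a sketch:
the symbol names are hypothetical, and the rules for @code{definition}
and @code{declaration} are omitted.

@example
%start definitions

%%

definitions: definition_or_declaration
           | definitions definition_or_declaration
           ;

definition_or_declaration: definition
                         | declaration
                         ;
@end example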
	583
	584	The Bison parser reads a sequence of tokens as its input, and groups the
	585	tokens using the grammar rules. If the input is valid, the end result is
	586	that the entire token sequence reduces to a single grouping whose symbol is
	587	the grammar's start symbol. If we use a grammar for C, the entire input
	588	must be a `sequence of definitions and declarations'. If not, the parser
	589	reports a syntax error.
	590
	591	@node Grammar in Bison
	592	@section From Formal Rules to Bison Input
	593	@cindex Bison grammar
	594	@cindex grammar, Bison
	595	@cindex formal grammar
	596
	597	A formal grammar is a mathematical construct. To define the language
	598	for Bison, you must write a file expressing the grammar in Bison syntax:
	599	a @dfn{Bison grammar} file. @xref{Grammar File, ,Bison Grammar Files}.
	600
	601	A nonterminal symbol in the formal grammar is represented in Bison input
	602	as an identifier, like an identifier in C@. By convention, it should be
	603	in lower case, such as @code{expr}, @code{stmt} or @code{declaration}.
	604
	605	The Bison representation for a terminal symbol is also called a @dfn{token
	606	type}. Token types can also be represented as C-like identifiers. By
	607	convention, these identifiers should be upper case to distinguish them from
	608	nonterminals: for example, @code{INTEGER}, @code{IDENTIFIER}, @code{IF} or
	609	@code{RETURN}. A terminal symbol that stands for a particular keyword in
	610	the language should be named after that keyword converted to upper case.
	611	The terminal symbol @code{error} is reserved for error recovery.
	612	@xref{Symbols}.
	613
	614	A terminal symbol can also be represented as a character literal, just like
	615	a C character constant. You should do this whenever a token is just a
	616	single character (parenthesis, plus-sign, etc.): use that same character in
	617	a literal as the terminal symbol for that token.
	618
	619	A third way to represent a terminal symbol is with a C string constant
	620	containing several characters. @xref{Symbols}, for more information.
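
The three representations can appear side by side, as in the sketch
below. The token names @code{INTEGER} and @code{LE} and the nonterminals
@code{comparison} and @code{sum} are assumptions made for this example.

@example
%token INTEGER        /* named token type */
%token LE "<="        /* named token type with a string-literal alias */

%%

comparison: sum
          | sum "<=" sum      /* string literal */
          ;

sum: INTEGER
   | sum '+' INTEGER          /* character literal */
   ;
@end example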
	621
	622	The grammar rules are likewise expressed in Bison syntax. For example,
	623	here is the Bison rule for a C @code{return} statement. The semicolon in
	624	quotes is a literal character token, representing part of the C syntax for
	625	the statement; the naked semicolon, and the colon, are Bison punctuation
	626	used in every rule.
	627
	628	@example
	629	stmt: RETURN expr ';' ;
	630	@end example
	631
	632	@noindent
	633	@xref{Rules, ,Syntax of Grammar Rules}.
	634
	635	@node Semantic Values
	636	@section Semantic Values
	637	@cindex semantic value
	638	@cindex value, semantic
	639
	640	A formal grammar selects tokens only by their classifications: for example,
	641	if a rule mentions the terminal symbol `integer constant', it means that
	642	@emph{any} integer constant is grammatically valid in that position. The
	643	precise value of the constant is irrelevant to how to parse the input: if
	644	@samp{x+4} is grammatical then @samp{x+1} or @samp{x+3989} is equally
	645	grammatical.
	646
	647	But the precise value is very important for what the input means once it is
	648	parsed. A compiler is useless if it fails to distinguish between 4, 1 and
	649	3989 as constants in the program! Therefore, each token in a Bison grammar
	650	has both a token type and a @dfn{semantic value}. @xref{Semantics,
	651	,Defining Language Semantics},
	652	for details.
	653
	654	The token type is a terminal symbol defined in the grammar, such as
	655	@code{INTEGER}, @code{IDENTIFIER} or @code{','}. It tells everything
	656	you need to know to decide where the token may validly appear and how to
	657	group it with other tokens. The grammar rules know nothing about tokens
	658	except their types.
	659
	660	The semantic value has all the rest of the information about the
	661	meaning of the token, such as the value of an integer, or the name of an
	662	identifier. (A token such as @code{','} which is just punctuation doesn't
	663	need to have any semantic value.)
	664
	665	For example, an input token might be classified as token type
	666	@code{INTEGER} and have the semantic value 4. Another input token might
	667	have the same token type @code{INTEGER} but value 3989. When a grammar
	668	rule says that @code{INTEGER} is allowed, either of these tokens is
	669	acceptable because each is an @code{INTEGER}. When the parser accepts the
	670	token, it keeps track of the token's semantic value.
	671
	672	Each grouping can have a semantic value in addition to its nonterminal
	673	symbol. For example, in a calculator, an expression typically has a
	674	semantic value that is a number. In a compiler for a programming
	675	language, an expression typically has a semantic value that is a tree
	676	structure describing the meaning of the expression.
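
When tokens and groupings carry values of different types, the grammar
file can declare which type goes with which symbol. Here is a minimal
sketch, assuming a calculator-like language; the member names
@code{ival} and @code{name} are illustrative only.

@example
%union @{
  int ival;        /* value of an integer constant */
  char *name;      /* name of an identifier */
@}

%token <ival> INTEGER
%token <name> IDENTIFIER
%type  <ival> expr
@end example

@noindent
Under these declarations, a lexical analyzer that reads the constant
3989 would store 3989 in @code{yylval.ival} before returning
@code{INTEGER}.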
	677
	678	@node Semantic Actions
	679	@section Semantic Actions
	680	@cindex semantic actions
	681	@cindex actions, semantic
	682
	683	In order to be useful, a program must do more than parse input; it must
	684	also produce some output based on the input. In a Bison grammar, a grammar
	685	rule can have an @dfn{action} made up of C statements. Each time the
	686	parser recognizes a match for that rule, the action is executed.
	687	@xref{Actions}.
	688
	689	Most of the time, the purpose of an action is to compute the semantic value
	690	of the whole construct from the semantic values of its parts. For example,
	691	suppose we have a rule which says an expression can be the sum of two
	692	expressions. When the parser recognizes such a sum, each of the
	693	subexpressions has a semantic value which describes how it was built up.
	694	The action for this rule should create a similar sort of value for the
	695	newly recognized larger expression.
	696
	697	For example, here is a rule that says an expression can be the sum of
	698	two subexpressions:
	699
	700	@example
	701	expr: expr '+' expr @{ $$ = $1 + $3; @} ;
	702	@end example
	703
	704	@noindent
	705	The action says how to produce the semantic value of the sum expression
	706	from the values of the two subexpressions.
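
To make the effect of such an action concrete, here is a rough C model of the
reduction on a stack of semantic values. It is a sketch under simplifying
assumptions, not the code Bison generates: @code{struct value_stack},
@code{push_value} and @code{reduce_sum} are hypothetical names, and values are
plain @code{int}.

```c
#include <stddef.h>

#define STACK_MAX 64

/* A toy value stack; Bison's real parser stack is more elaborate. */
struct value_stack {
    int values[STACK_MAX];
    size_t depth;
};

/* Push V; returns 0 if the stack is full, 1 otherwise. */
static int push_value(struct value_stack *s, int v)
{
    if (s->depth == STACK_MAX)
        return 0;
    s->values[s->depth++] = v;
    return 1;
}

/* Reduce by "expr: expr '+' expr": the three topmost entries are $1, the
   '+' token (whose value is unused), and $3.  They are replaced by $$. */
static int reduce_sum(struct value_stack *s)
{
    int v1, v3;

    if (s->depth < 3)
        return 0;
    v1 = s->values[s->depth - 3];
    v3 = s->values[s->depth - 1];
    s->depth -= 3;
    return push_value(s, v1 + v3);    /* $$ = $1 + $3 */
}
```

In this model the three right-hand-side entries are popped and a single entry
carrying the value of the whole construct takes their place.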
	707
	708	@node GLR Parsers
	709	@section Writing GLR Parsers
	710	@cindex GLR parsing
	711	@cindex generalized LR (GLR) parsing
	712	@findex %glr-parser
	713	@cindex conflicts
	714	@cindex shift/reduce conflicts
	715	@cindex reduce/reduce conflicts
	716
	717	In some grammars, Bison's deterministic
	718	LR(1) parsing algorithm cannot decide whether to apply a
	719	certain grammar rule at a given point. That is, it may not be able to
	720	decide (on the basis of the input read so far) which of two possible
	721	reductions (applications of a grammar rule) applies, or whether to apply
	722	a reduction or read more of the input and apply a reduction later in the
	723	input. These are known respectively as @dfn{reduce/reduce} conflicts
	724	(@pxref{Reduce/Reduce}), and @dfn{shift/reduce} conflicts
	725	(@pxref{Shift/Reduce}).
	726
	727	To use a grammar that is not easily modified to be LR(1), a
	728	more general parsing algorithm is sometimes necessary. If you include
	729	@code{%glr-parser} among the Bison declarations in your file
	730	(@pxref{Grammar Outline}), the result is a Generalized LR
	731	(GLR) parser. These parsers handle Bison grammars that
	732	contain no unresolved conflicts (i.e., after applying precedence
	733	declarations) identically to deterministic parsers. However, when
	734	faced with unresolved shift/reduce and reduce/reduce conflicts,
	735	GLR parsers use the simple expedient of doing both,
	736	effectively cloning the parser to follow both possibilities. Each of
	737	the resulting parsers can again split, so that at any given time, there
	738	can be any number of possible parses being explored. The parsers
	739	proceed in lockstep; that is, all of them consume (shift) a given input
	740	symbol before any of them proceed to the next. Each of the cloned
	741	parsers eventually meets one of two possible fates: either it runs into
	742	a parsing error, in which case it simply vanishes, or it merges with
	743	another parser, because the two of them have reduced the input to an
	744	identical set of symbols.
	745
	746	During the time that there are multiple parsers, semantic actions are
	747	recorded, but not performed. When a parser disappears, its recorded
	748	semantic actions disappear as well, and are never performed. When a
	749	reduction makes two parsers identical, causing them to merge, Bison
	750	records both sets of semantic actions. Whenever the last two parsers
	751	merge, reverting to the single-parser case, Bison resolves all the
	752	outstanding actions either by precedences given to the grammar rules
	753	involved, or by performing both actions, and then calling a designated
	754	user-defined function on the resulting values to produce an arbitrary
	755	merged result.
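
The record-then-perform discipline can be pictured with a toy model in C. This
is only an illustration of the idea, assuming a fixed-size queue per branch; it
is not how Bison implements GLR parsing, it leaves out merging entirely, and
all names (@code{struct branch}, @code{branch_record}, @code{branch_discard},
@code{branch_perform}, @code{add_action}) are hypothetical.

```c
#include <stddef.h>

#define MAX_DEFERRED 16

/* A recorded action: a function plus the argument to call it with. */
typedef void (*action_fn)(int arg, int *result);

struct branch {
    action_fn fns[MAX_DEFERRED];
    int args[MAX_DEFERRED];
    size_t count;
};

/* Record an action instead of performing it.  Returns 0 if the queue
   is full, 1 otherwise. */
static int branch_record(struct branch *b, action_fn fn, int arg)
{
    if (b->count == MAX_DEFERRED)
        return 0;
    b->fns[b->count] = fn;
    b->args[b->count] = arg;
    b->count++;
    return 1;
}

/* The branch hit a syntax error: its recorded actions are never performed. */
static void branch_discard(struct branch *b)
{
    b->count = 0;
}

/* The branch is the sole survivor: perform its actions in recorded order.
   Returns the number of actions performed. */
static size_t branch_perform(struct branch *b, int *result)
{
    size_t i, n = b->count;

    for (i = 0; i < n; i++)
        b->fns[i](b->args[i], result);
    b->count = 0;
    return n;
}

/* Example action: accumulate ARG into *RESULT. */
static void add_action(int arg, int *result)
{
    *result += arg;
}
```

Nothing observable happens while actions are being recorded; side effects
appear only when the surviving branch is performed.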
	756
	757	@menu
	758	* Simple GLR Parsers:: Using GLR parsers on unambiguous grammars.
	759	* Merging GLR Parses:: Using GLR parsers to resolve ambiguities.
	760	* GLR Semantic Actions:: Deferred semantic actions have special concerns.
	761	* Compiler Requirements:: GLR parsers require a modern C compiler.
	762	@end menu
	763
	764	@node Simple GLR Parsers
	765	@subsection Using GLR on Unambiguous Grammars
	766	@cindex GLR parsing, unambiguous grammars
	767	@cindex generalized LR (GLR) parsing, unambiguous grammars
	768	@findex %glr-parser
	769	@findex %expect-rr
	770	@cindex conflicts
	771	@cindex reduce/reduce conflicts
	772	@cindex shift/reduce conflicts
	773
	774	In the simplest cases, you can use the GLR algorithm
	775	to parse grammars that are unambiguous but fail to be LR(1).
	776	Such grammars typically require more than one symbol of lookahead.
	777
	778	Consider a problem that
	779	arises in the declaration of enumerated and subrange types in the
	780	programming language Pascal. Here are some examples:
	781
	782	@example
	783	type subrange = lo .. hi;
	784	type enum = (a, b, c);
	785	@end example
	786
	787	@noindent
	788	The original language standard allows only numeric
	789	literals and constant identifiers for the subrange bounds (@samp{lo}
	790	and @samp{hi}), but Extended Pascal (ISO/IEC
	791	10206) and many other
	792	Pascal implementations allow arbitrary expressions there. This gives
	793	rise to the following situation, containing a superfluous pair of
	794	parentheses:
	795
	796	@example
	797	type subrange = (a) .. b;
	798	@end example
	799
	800	@noindent
	801	Compare this to the following declaration of an enumerated
	802	type with only one value:
	803
	804	@example
	805	type enum = (a);
	806	@end example
	807
	808	@noindent
	809	(These declarations are contrived, but they are syntactically
	810	valid, and more-complicated cases can come up in practical programs.)
	811
	812	These two declarations look identical until the @samp{..} token.
	813	With normal LR(1) one-token lookahead it is not
	814	possible to decide between the two forms when the identifier
	815	@samp{a} is parsed. It is, however, desirable
	816	for a parser to decide this, since in the latter case
	817	@samp{a} must become a new identifier to represent the enumeration
	818	value, while in the former case @samp{a} must be evaluated with its
	819	current meaning, which may be a constant or even a function call.
	820
	821	You could parse @samp{(a)} as an ``unspecified identifier in parentheses'',
	822	to be resolved later, but this typically requires substantial
	823	contortions in both semantic actions and large parts of the
	824	grammar, where the parentheses are nested in the recursive rules for
	825	expressions.
	826
	827	You might think of using the lexer to distinguish between the two
	828	forms by returning different tokens for currently defined and
	829	undefined identifiers. But if these declarations occur in a local
	830	scope, and @samp{a} is defined in an outer scope, then both forms
	831	are possible---either locally redefining @samp{a}, or using the
	832	value of @samp{a} from the outer scope. So this approach cannot
	833	work.
	834
	835	A simple solution to this problem is to declare the parser to
	836	use the GLR algorithm.
	837	When the GLR parser reaches the critical state, it
	838	merely splits into two branches and pursues both syntax rules
	839	simultaneously. Sooner or later, one of them runs into a parsing
	840	error. If there is a @samp{..} token before the next
	841	@samp{;}, the rule for enumerated types fails since it cannot
	842	accept @samp{..} anywhere; otherwise, the subrange type rule
	843	fails since it requires a @samp{..} token. So one of the branches
	844	fails silently, and the other one continues normally, performing
	845	all the intermediate actions that were postponed during the split.
	846
	847	If the input is syntactically incorrect, both branches fail and the parser
	848	reports a syntax error as usual.
	849
	850	The effect of all this is that the parser seems to ``guess'' the
	851	correct branch to take, or in other words, it seems to use more
	852	lookahead than the underlying LR(1) algorithm actually allows
	853	for. In this example, LR(2) would suffice, but some cases that are
	854	not LR(@math{k}) for any @math{k} can also be handled this way.
	855
	856	In general, a GLR parser can take quadratic or cubic worst-case time,
	857	and the current Bison parser even takes exponential time and space
	858	for some grammars. In practice, this rarely happens, and for many
	859	grammars it is possible to prove that it cannot happen.
	860	The present example contains only one conflict between two
	861	rules, and the type-declaration context containing the conflict
	862	cannot be nested. So the number of
	863	branches that can exist at any time is limited by the constant 2,
	864	and the parsing time is still linear.
	865
	866	Here is a Bison grammar corresponding to the example above. It
	867	parses a vastly simplified form of Pascal type declarations.
	868
	869	@example
	870	%token TYPE DOTDOT ID
	871
	872	@group
	873	%left '+' '-'
	874	%left '*' '/'
	875	@end group
	876
	877	%%
	878
	879	@group
	880	type_decl: TYPE ID '=' type ';' ;
	881	@end group
	882
	883	@group
	884	type:
	885	'(' id_list ')'
	886	| expr DOTDOT expr
	887	;
	888	@end group
	889
	890	@group
	891	id_list:
	892	ID
	893	| id_list ',' ID
	894	;
	895	@end group
	896
	897	@group
	898	expr:
	899	'(' expr ')'
	900	| expr '+' expr
	901	| expr '-' expr
	902	| expr '*' expr
	903	| expr '/' expr
	904	| ID
	905	;
	906	@end group
	907	@end example
	908
	909	When this is processed as a normal LR(1) grammar, Bison correctly complains
	910	about one reduce/reduce conflict. In the conflicting situation the
	911	parser chooses one of the alternatives, arbitrarily the one
	912	declared first. Therefore the following correct input is not
	913	recognized:
	914
	915	@example
	916	type t = (a) .. b;
	917	@end example
	918
	919	The parser can be turned into a GLR parser, while also telling Bison
	920	to be silent about the one known reduce/reduce conflict, by adding
	921	these two declarations to the Bison grammar file (before the first
	922	@samp{%%}):
	923
	924	@example
	925	%glr-parser
	926	%expect-rr 1
	927	@end example
	928
	929	@noindent
	930	No change in the grammar itself is required. Now the
	931	parser recognizes all valid declarations, according to the
	932	limited syntax above, transparently. In fact, the user does not even
	933	notice when the parser splits.
	934
	935	So here we have a case where we can use the benefits of GLR,
	936	almost without disadvantages. Even in simple cases like this, however,
	937	there are at least two potential problems to beware of. First, always
	938	analyze the conflicts reported by Bison to make sure that GLR
	939	splitting is only done where it is intended. A GLR parser
	940	splitting inadvertently may cause problems less obvious than an
	941	LR parser statically choosing the wrong alternative in a
	942	conflict. Second, consider interactions with the lexer (@pxref{Semantic
	943	Tokens}) with great care. Since a split parser consumes tokens without
	944	performing any actions during the split, the lexer cannot obtain
	945	information via parser actions. Some cases of lexer interactions can be
	946	eliminated by using GLR to shift the complications from the
	947	lexer to the parser. You must check the remaining cases for
	948	correctness.
	949
	950	In our example, it would be safe for the lexer to return tokens based on
	951	their current meanings in some symbol table, because no new symbols are
	952	defined in the middle of a type declaration. Though it is possible for
	953	a parser to define the enumeration constants as they are parsed, before
	954	the type declaration is completed, it actually makes no difference since
	955	they cannot be used within the same enumerated type declaration.
	956
	957	@node Merging GLR Parses
	958	@subsection Using GLR to Resolve Ambiguities
	959	@cindex GLR parsing, ambiguous grammars
	960	@cindex generalized LR (GLR) parsing, ambiguous grammars
	961	@findex %dprec
	962	@findex %merge
	963	@cindex conflicts
	964	@cindex reduce/reduce conflicts
	965
	966	Let's consider an example, vastly simplified from a C++ grammar.
	967
	968	@example
	969	%@{
	970	#include <stdio.h>
	971	#define YYSTYPE char const *
	972	int yylex (void);
	973	void yyerror (char const *);
	974	%@}
	975
	976	%token TYPENAME ID
	977
	978	%right '='
	979	%left '+'
	980
	981	%glr-parser
	982
	983	%%
	984
	985	prog:
	986	/* Nothing. */
	987	| prog stmt @{ printf ("\n"); @}
	988	;
	989
	990	stmt:
	991	expr ';' %dprec 1
	992	| decl %dprec 2
	993	;
	994
	995	expr:
	996	ID @{ printf ("%s ", $$); @}
	997	| TYPENAME '(' expr ')'
	998	@{ printf ("%s <cast> ", $1); @}
	999	| expr '+' expr @{ printf ("+ "); @}
	1000	| expr '=' expr @{ printf ("= "); @}
	1001	;
	1002
	1003	decl:
	1004	TYPENAME declarator ';'
	1005	@{ printf ("%s <declare> ", $1); @}
	1006	| TYPENAME declarator '=' expr ';'
	1007	@{ printf ("%s <init-declare> ", $1); @}
	1008	;
	1009
	1010	declarator:
	1011	ID @{ printf ("\"%s\" ", $1); @}
	1012	| '(' declarator ')'
	1013	;
	1014	@end example
	1015
	1016	@noindent
	1017	This models a problematic part of the C++ grammar---the ambiguity between
	1018	certain declarations and statements. For example,
	1019
	1020	@example
	1021	T (x) = y+z;
	1022	@end example
	1023
	1024	@noindent
	1025	parses as either an @code{expr} or a @code{decl}
	1026	(assuming that @samp{T} is recognized as a @code{TYPENAME} and
	1027	@samp{x} as an @code{ID}).
	1028	Bison detects this as a reduce/reduce conflict between the rules
	1029	@code{expr : ID} and @code{declarator : ID}, which it cannot resolve at the
	1030	time it encounters @code{x} in the example above. Since this is a
	1031	GLR parser, it therefore splits the problem into two parses, one for
	1032	each choice of resolving the reduce/reduce conflict.
	1033	Unlike the example from the previous section (@pxref{Simple GLR Parsers}),
	1034	however, neither of these parses ``dies,'' because the grammar as it stands is
	1035	ambiguous. One of the parsers eventually reduces @code{stmt : expr ';'} and
	1036	the other reduces @code{stmt : decl}, after which both parsers are in an
	1037	identical state: they've seen @samp{prog stmt} and have the same unprocessed
	1038	input remaining. We say that these parses have @dfn{merged}.
	1039
	1040	At this point, the GLR parser requires a specification in the
	1041	grammar of how to choose between the competing parses.
	1042	In the example above, the two @code{%dprec}
	1043	declarations specify that Bison is to give precedence
	1044	to the parse that interprets the example as a
	1045	@code{decl}, which implies that @code{x} is a declarator.
	1046	The parser therefore prints
	1047
	1048	@example
	1049	"x" y z + T <init-declare>
	1050	@end example
	1051
	1052	The @code{%dprec} declarations only come into play when more than one
	1053	parse survives. Consider a different input string for this parser:
	1054
	1055	@example
	1056	T (x) + y;
	1057	@end example
	1058
	1059	@noindent
	1060	This is another example of using GLR to parse an unambiguous
	1061	construct, as shown in the previous section (@pxref{Simple GLR Parsers}).
	1062	Here, there is no ambiguity (this cannot be parsed as a declaration).
	1063	However, at the time the Bison parser encounters @code{x}, it does not
	1064	have enough information to resolve the reduce/reduce conflict (again,
	1065	between @code{x} as an @code{expr} and as a @code{declarator}). In this
	1066	case, no precedence declaration is used. Again, the parser splits
	1067	into two, one assuming that @code{x} is an @code{expr}, and the other
	1068	assuming @code{x} is a @code{declarator}. The second of these parsers
	1069	then vanishes when it sees @code{+}, and the parser prints
	1070
	1071	@example
	1072	x T <cast> y +
	1073	@end example
	1074
	1075	Suppose that instead of resolving the ambiguity, you wanted to see all
	1076	the possibilities. For this purpose, you must merge the semantic
	1077	actions of the two possible parsers, rather than choosing one over the
	1078	other. To do so, you could change the declaration of @code{stmt} as
	1079	follows:
	1080
	1081	@example
	1082	stmt:
	1083	expr ';' %merge <stmtMerge>
	1084	| decl %merge <stmtMerge>
	1085	;
	1086	@end example
	1087
	1088	@noindent
	1089	and define the @code{stmtMerge} function as:
	1090
	1091	@example
	1092	static YYSTYPE
	1093	stmtMerge (YYSTYPE x0, YYSTYPE x1)
	1094	@{
	1095	printf ("<OR> ");
	1096	return "";
	1097	@}
	1098	@end example
	1099
	1100	@noindent
	1101	with an accompanying forward declaration
	1102	in the C declarations at the beginning of the file:
	1103
	1104	@example
	1105	%@{
	1106	#define YYSTYPE char const *
	1107	static YYSTYPE stmtMerge (YYSTYPE x0, YYSTYPE x1);
	1108	%@}
	1109	@end example
	1110
	1111	@noindent
	1112	With these declarations, the resulting parser parses the first example
	1113	as both an @code{expr} and a @code{decl}, and prints
	1114
	1115	@example
	1116	"x" y z + T <init-declare> x T <cast> y z + = <OR>
	1117	@end example
	1118
	1119	Bison requires that all of the
	1120	productions that participate in any particular merge have identical
	1121	@samp{%merge} clauses. Otherwise, the ambiguity is unresolvable,
	1122	and the parser reports an error during any parse that results in
	1123	the offending merge.
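
The @code{stmtMerge} function above only prints a marker and returns an empty
string. A merge function may instead build a result from both alternatives.
The following sketch assumes, as in the example, that semantic values are
strings; @code{value_t} stands in for @code{YYSTYPE}, and
@code{merge_alternatives} is a hypothetical name. The caller is assumed to own
and eventually free the returned string.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for YYSTYPE as defined in the example grammar. */
typedef char const *value_t;

/* Combine the values of two competing parses into "X0 <OR> X1".
   Returns a newly allocated string, or NULL if allocation fails. */
static value_t merge_alternatives(value_t x0, value_t x1)
{
    /* sizeof counts the separator's six characters plus the final NUL. */
    size_t n = strlen(x0) + strlen(x1) + sizeof " <OR> ";
    char *buf = malloc(n);

    if (buf == NULL)
        return NULL;
    snprintf(buf, n, "%s <OR> %s", x0, x1);
    return buf;
}
```

Returning a combined value rather than printing keeps the merge free of side
effects, which matters because the order in which deferred actions run is not
the order in which the input was read.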
	1124
	1125	@node GLR Semantic Actions
	1126	@subsection GLR Semantic Actions
	1127
	1128	@cindex deferred semantic actions
	1129	By definition, a deferred semantic action is not performed at the same time as
	1130	the associated reduction.
	1131	This raises caveats for several Bison features you might use in a semantic
	1132	action in a GLR parser.
	1133
	1134	@vindex yychar
	1135	@cindex GLR parsers and @code{yychar}
	1136	@vindex yylval
	1137	@cindex GLR parsers and @code{yylval}
	1138	@vindex yylloc
	1139	@cindex GLR parsers and @code{yylloc}
	1140	In any semantic action, you can examine @code{yychar} to determine the type of
	1141	the lookahead token present at the time of the associated reduction.
	1142	After checking that @code{yychar} is not set to @code{YYEMPTY} or @code{YYEOF},
	1143	you can then examine @code{yylval} and @code{yylloc} to determine the
	1144	lookahead token's semantic value and location, if any.
	1145	In a nondeferred semantic action, you can also modify any of these variables to
	1146	influence syntax analysis.
	1147	@xref{Lookahead, ,Lookahead Tokens}.
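
The check described above can be written as a small predicate. This is a
sketch only: @code{SKETCH_EMPTY} and @code{SKETCH_EOF} are hypothetical
stand-ins, with placeholder values, for the @code{YYEMPTY} and @code{YYEOF}
macros that the generated parser provides, and @code{lookahead_has_value} is
not part of Bison's interface. In a real action you would test @code{yychar}
against the real macros directly.

```c
/* Stand-ins for the parser's YYEMPTY and YYEOF; a real action would use
   the macros provided by the generated parser instead. */
enum { SKETCH_EMPTY = -2, SKETCH_EOF = 0 };

/* Return 1 when LOOKAHEAD_KIND denotes an actual token, so that its
   semantic value and location are worth examining; return 0 otherwise. */
static int lookahead_has_value(int lookahead_kind)
{
    return lookahead_kind != SKETCH_EMPTY && lookahead_kind != SKETCH_EOF;
}
```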
	1148
	1149	@findex yyclearin
	1150	@cindex GLR parsers and @code{yyclearin}
	1151	In a deferred semantic action, it's too late to influence syntax analysis.
	1152	In this case, @code{yychar}, @code{yylval}, and @code{yylloc} are set to
	1153	shallow copies of the values they had at the time of the associated reduction.
	1154	For this reason alone, modifying them is dangerous.
	1155	Moreover, the result of modifying them is undefined and subject to change with
	1156	future versions of Bison.
	1157	For example, if a semantic action might be deferred, you should never write it
	1158	to invoke @code{yyclearin} (@pxref{Action Features}) or to attempt to free
	1159	memory referenced by @code{yylval}.
	1160
	1161	@findex YYERROR
	1162	@cindex GLR parsers and @code{YYERROR}
	1163	Another Bison feature requiring special consideration is @code{YYERROR}
	1164	(@pxref{Action Features}), which you can invoke in a semantic action to
	1165	initiate error recovery.
	1166	During deterministic GLR operation, the effect of @code{YYERROR} is
	1167	the same as its effect in a deterministic parser.
	1168	In a deferred semantic action, its effect is undefined.
	1169	@c The effect is probably a syntax error at the split point.
	1170
	1171	Also, see @ref{Location Default Action, ,Default Action for Locations}, which
	1172	describes a special usage of @code{YYLLOC_DEFAULT} in GLR parsers.
	1173
	1174	@node Compiler Requirements
	1175	@subsection Considerations when Compiling GLR Parsers
	1176	@cindex @code{inline}
	1177	@cindex GLR parsers and @code{inline}
	1178
	1179	The GLR parsers require a compiler for ISO C89 or
	1180	later. In addition, they use the @code{inline} keyword, which is not
	1181	C89, but is C99 and is a common extension in pre-C99 compilers. It is
	1182	up to the user of these parsers to handle
	1183	portability issues. For instance, if using Autoconf and the Autoconf
	1184	macro @code{AC_C_INLINE}, a mere
	1185
	1186	@example
	1187	%@{
	1188	#include <config.h>
	1189	%@}
	1190	@end example
	1191
	1192	@noindent
	1193	will suffice. Otherwise, we suggest
	1194
	1195	@example
	1196	%@{
	1197	#if (__STDC_VERSION__ < 199901 && ! defined __GNUC__ \
	1198	&& ! defined inline)
	1199	# define inline
	1200	#endif
	1201	%@}
	1202	@end example
	1203
	1204	@node Locations
	1205	@section Locations
	1206	@cindex location
	1207	@cindex textual location
	1208	@cindex location, textual
	1209
	1210	Many applications, like interpreters or compilers, have to produce verbose
	1211	and useful error messages. To achieve this, one must be able to keep track of
	1212	the @dfn{textual location}, or @dfn{location}, of each syntactic construct.
	1213	Bison provides a mechanism for handling these locations.
	1214
	1215	Each token has a semantic value. In a similar fashion, each token has an
	1216	associated location, but the type of locations is the same for all tokens
	1217	and groupings. Moreover, the output parser is equipped with a default data
	1218	structure for storing locations (@pxref{Tracking Locations}, for more
	1219	details).
	1220
	1221	Like semantic values, locations can be reached in actions using a dedicated
	1222	set of constructs. In the sum rule shown earlier (@pxref{Semantic Actions}),
	1223	the location of the whole grouping is @code{@@$}, while the locations of the
	1224	subexpressions are @code{@@1} and @code{@@3}.
	1225
	1226	When a rule is matched, a default action is used to compute the semantic value
	1227	of its left hand side (@pxref{Actions}). In the same way, another default
	1228	action is used for locations. However, the action for locations is general
	1229	enough for most cases, meaning there is usually no need to describe for each
	1230	rule how @code{@@$} should be formed. When building a new location for a given
	1231	grouping, the default behavior of the output parser is to take the beginning
	1232	of the first symbol, and the end of the last symbol.
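
The default behavior just described amounts to the following computation,
shown here as a plain C function. @code{struct location} mirrors the four
members of Bison's default location structure, but the type name and the
function @code{span} are hypothetical; in a real parser this work is done by
the default location action, not by user code.

```c
/* Same four members as Bison's default location type. */
struct location {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
};

/* Location of a grouping spanning from FIRST to LAST: begin where the
   first symbol begins and end where the last symbol ends. */
static struct location span(struct location first, struct location last)
{
    struct location result;

    result.first_line = first.first_line;
    result.first_column = first.first_column;
    result.last_line = last.last_line;
    result.last_column = last.last_column;
    return result;
}
```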
	1233
	1234	@node Bison Parser
	1235	@section Bison Output: the Parser Implementation File
	1236	@cindex Bison parser
	1237	@cindex Bison utility
	1238	@cindex lexical analyzer, purpose
	1239	@cindex parser
	1240
	1241	When you run Bison, you give it a Bison grammar file as input. The
	1242	most important output is a C source file that implements a parser for
	1243	the language described by the grammar. This parser is called a
	1244	@dfn{Bison parser}, and this file is called a @dfn{Bison parser
	1245	implementation file}. Keep in mind that the Bison utility and the
	1246	Bison parser are two distinct programs: the Bison utility is a program
	1247	whose output is the Bison parser implementation file that becomes part
	1248	of your program.
	1249
	1250	The job of the Bison parser is to group tokens into groupings according to
	1251	the grammar rules---for example, to build identifiers and operators into
	1252	expressions. As it does this, it runs the actions for the grammar rules it
	1253	uses.
	1254
	1255	The tokens come from a function called the @dfn{lexical analyzer} that
	1256	you must supply in some fashion (such as by writing it in C). The Bison
	1257	parser calls the lexical analyzer each time it wants a new token. It
	1258	doesn't know what is ``inside'' the tokens (though their semantic values
	1259	may reflect this). Typically the lexical analyzer makes the tokens by
	1260	parsing characters of text, but Bison does not depend on this.
	1261	@xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}.
	1262
	1263	The Bison parser implementation file is C code which defines a
	1264	function named @code{yyparse} which implements that grammar. This
	1265	function does not make a complete C program: you must supply some
	1266	additional functions. One is the lexical analyzer. Another is an
	1267	error-reporting function which the parser calls to report an error.
	1268	In addition, a complete C program must start with a function called
	1269	@code{main}; you have to provide this, and arrange for it to call
	1270	@code{yyparse} or the parser will never run. @xref{Interface, ,Parser
	1271	C-Language Interface}.
	1272
	1273	Aside from the token type names and the symbols in the actions you
	1274	write, all symbols defined in the Bison parser implementation file
	1275	itself begin with @samp{yy} or @samp{YY}. This includes interface
	1276	functions such as the lexical analyzer function @code{yylex}, the
	1277	error reporting function @code{yyerror} and the parser function
	1278	@code{yyparse} itself. This also includes numerous identifiers used
	1279	for internal purposes. Therefore, you should avoid using C
	1280	identifiers starting with @samp{yy} or @samp{YY} in the Bison grammar
	1281	file except for the ones defined in this manual. Also, you should
	1282	avoid using the C identifiers @samp{malloc} and @samp{free} for
	1283	anything other than their usual meanings.
	1284
	1285	In some cases the Bison parser implementation file includes system
	1286	headers, and in those cases your code should respect the identifiers
	1287	reserved by those headers. On some non-GNU hosts, @code{<alloca.h>},
	1288	@code{<malloc.h>}, @code{<stddef.h>}, and @code{<stdlib.h>} are
	1289	included as needed to declare memory allocators and related types.
	1290	@code{<libintl.h>} is included if message translation is in use
	1291	(@pxref{Internationalization}). Other system headers may be included
	1292	if you define @code{YYDEBUG} to a nonzero value (@pxref{Tracing,
	1293	,Tracing Your Parser}).
	1294
	1295	@node Stages
	1296	@section Stages in Using Bison
	1297	@cindex stages in using Bison
	1298	@cindex using Bison
	1299
	1300	The actual language-design process using Bison, from grammar specification
	1301	to a working compiler or interpreter, has these parts:
	1302
	1303	@enumerate
	1304	@item
	1305	Formally specify the grammar in a form recognized by Bison
	1306	(@pxref{Grammar File, ,Bison Grammar Files}). For each grammatical rule
	1307	in the language, describe the action that is to be taken when an
	1308	instance of that rule is recognized. The action is described by a
	1309	sequence of C statements.
	1310
	1311	@item
	1312	Write a lexical analyzer to process input and pass tokens to the parser.
	1313	The lexical analyzer may be written by hand in C (@pxref{Lexical, ,The
	1314	Lexical Analyzer Function @code{yylex}}). It could also be produced
	1315	using Lex, but the use of Lex is not discussed in this manual.
	1316
	1317	@item
	1318	Write a controlling function that calls the Bison-produced parser.
	1319
	1320	@item
	1321	Write error-reporting routines.
	1322	@end enumerate
	1323
	1324	To turn this source code as written into a runnable program, you
	1325	must follow these steps:
	1326
	1327	@enumerate
	1328	@item
	1329	Run Bison on the grammar to produce the parser.
	1330
	1331	@item
	1332	Compile the code output by Bison, as well as any other source files.
	1333
	1334	@item
	1335	Link the object files to produce the finished product.
	1336	@end enumerate
	1337
	1338	@node Grammar Layout
	1339	@section The Overall Layout of a Bison Grammar
	1340	@cindex grammar file
	1341	@cindex file format
	1342	@cindex format of grammar file
	1343	@cindex layout of Bison grammar
	1344
	1345	The input file for the Bison utility is a @dfn{Bison grammar file}. The
	1346	general form of a Bison grammar file is as follows:
	1347
	1348	@example
	1349	%@{
	1350	@var{Prologue}
	1351	%@}
	1352
	1353	@var{Bison declarations}
	1354
	1355	%%
	1356	@var{Grammar rules}
	1357	%%
	1358	@var{Epilogue}
	1359	@end example
	1360
	1361	@noindent
	1362	The @samp{%%}, @samp{%@{} and @samp{%@}} are punctuation that appears
	1363	in every Bison grammar file to separate the sections.
	1364
	1365	The prologue may define types and variables used in the actions. You can
	1366	also use preprocessor commands to define macros used there, and use
	1367	@code{#include} to include header files that do any of these things.
	1368	You need to declare the lexical analyzer @code{yylex} and the error
	1369	printer @code{yyerror} here, along with any other global identifiers
	1370	used by the actions in the grammar rules.
	1371
	1372	The Bison declarations declare the names of the terminal and nonterminal
	1373	symbols, and may also describe operator precedence and the data types of
	1374	semantic values of various symbols.
	1375
	1376	The grammar rules define how to construct each nonterminal symbol from its
	1377	parts.
	1378
	1379	The epilogue can contain any code you want to use. Often the
	1380	definitions of functions declared in the prologue go here. In a
	1381	simple program, all the rest of the program can go here.
	1382
	1383	@node Examples
	1384	@chapter Examples
	1385	@cindex simple examples
	1386	@cindex examples, simple
	1387
	1388	Now we show and explain several sample programs written using Bison: a
	1389	reverse polish notation calculator, an algebraic (infix) notation
	1390	calculator --- later extended to track ``locations'' ---
	1391	and a multi-function calculator. All
	1392	produce usable, though limited, interactive desk-top calculators.
	1393
	1394	These examples are simple, but Bison grammars for real programming
	1395	languages are written the same way. You can copy these examples into a
	1396	source file to try them.
	1397
	1398	@menu
	1399	* RPN Calc:: Reverse polish notation calculator;
	1400	a first example with no operator precedence.
	1401	* Infix Calc:: Infix (algebraic) notation calculator.
	1402	Operator precedence is introduced.
	1403	* Simple Error Recovery:: Continuing after syntax errors.
	1404	* Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$.
	1405	* Multi-function Calc:: Calculator with memory and trig functions.
	1406	It uses multiple data-types for semantic values.
	1407	* Exercises:: Ideas for improving the multi-function calculator.
	1408	@end menu
	1409
	1410	@node RPN Calc
	1411	@section Reverse Polish Notation Calculator
	1412	@cindex reverse polish notation
	1413	@cindex polish notation calculator
	1414	@cindex @code{rpcalc}
	1415	@cindex calculator, simple
	1416
	1417	The first example is that of a simple double-precision @dfn{reverse polish
	1418	notation} calculator (a calculator using postfix operators). This example
	1419	provides a good starting point, since operator precedence is not an issue.
	1420	The second example will illustrate how operator precedence is handled.
	1421
	1422	The source code for this calculator is named @file{rpcalc.y}. The
	1423	@samp{.y} extension is a convention used for Bison grammar files.
	1424
	1425	@menu
	1426	* Rpcalc Declarations:: Prologue (declarations) for rpcalc.
	1427	* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation.
	1428	* Rpcalc Lexer:: The lexical analyzer.
	1429	* Rpcalc Main:: The controlling function.
	1430	* Rpcalc Error:: The error reporting function.
	1431	* Rpcalc Generate:: Running Bison on the grammar file.
	1432	* Rpcalc Compile:: Run the C compiler on the output code.
	1433	@end menu
	1434
	1435	@node Rpcalc Declarations
	1436	@subsection Declarations for @code{rpcalc}
	1437
	1438	Here are the C and Bison declarations for the reverse polish notation
	1439	calculator. As in C, comments are placed between @samp{/*@dots{}*/}.
	1440
	1441	@example
	1442	/* Reverse polish notation calculator. */
	1443
	1444	%@{
	1445	#define YYSTYPE double
	1446	#include <math.h>
	1447	int yylex (void);
	1448	void yyerror (char const *);
	1449	%@}
	1450
	1451	%token NUM
	1452
	1453	%% /* Grammar rules and actions follow. */
	1454	@end example
	1455
	1456	The declarations section (@pxref{Prologue, , The prologue}) contains two
	1457	preprocessor directives and two forward declarations.
	1458
	1459	The @code{#define} directive defines the macro @code{YYSTYPE}, thus
	1460	specifying the C data type for semantic values of both tokens and
	1461	groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The
	1462	Bison parser will use whatever type @code{YYSTYPE} is defined as; if you
	1463	don't define it, @code{int} is the default. Because we specify
	1464	@code{double}, each token and each expression has an associated value,
	1465	which is a floating point number.
	1466
	1467	The @code{#include} directive is used to declare the exponentiation
	1468	function @code{pow}.
	1469
	1470	The forward declarations for @code{yylex} and @code{yyerror} are
	1471	needed because the C language requires that functions be declared
	1472	before they are used. These functions will be defined in the
	1473	epilogue, but the parser calls them so they must be declared in the
	1474	prologue.
	1475
	1476	The second section, Bison declarations, provides information to Bison
	1477	about the token types (@pxref{Bison Declarations, ,The Bison
	1478	Declarations Section}). Each terminal symbol that is not a
	1479	single-character literal must be declared here. (Single-character
	1480	literals normally don't need to be declared.) In this example, all the
	1481	arithmetic operators are designated by single-character literals, so the
	1482	only terminal symbol that needs to be declared is @code{NUM}, the token
	1483	type for numeric constants.
	1484
	1485	@node Rpcalc Rules
	1486	@subsection Grammar Rules for @code{rpcalc}
	1487
	1488	Here are the grammar rules for the reverse polish notation calculator.
	1489
	1490	@example
	1491	@group
	1492	input:
	1493	/* empty */
	1494	| input line
	1495	;
	1496	@end group
	1497
	1498	@group
	1499	line:
	1500	'\n'
	1501	| exp '\n' @{ printf ("%.10g\n", $1); @}
	1502	;
	1503	@end group
	1504
	1505	@group
	1506	exp:
	1507	NUM @{ $$ = $1; @}
	1508	| exp exp '+' @{ $$ = $1 + $2; @}
	1509	| exp exp '-' @{ $$ = $1 - $2; @}
	1510	| exp exp '*' @{ $$ = $1 * $2; @}
	1511	| exp exp '/' @{ $$ = $1 / $2; @}
	1512	| exp exp '^' @{ $$ = pow ($1, $2); @} /* Exponentiation */
	1513	| exp 'n' @{ $$ = -$1; @} /* Unary minus */
	1514	;
	1515	@end group
	1516	%%
	1517	@end example
	1518
	1519	The groupings of the rpcalc ``language'' defined here are the expression
	1520	(given the name @code{exp}), the line of input (@code{line}), and the
	1521	complete input transcript (@code{input}). Each of these nonterminal
	1522	symbols has several alternate rules, joined by the vertical bar @samp{|}
	1523	which is read as ``or''. The following sections explain what these rules
	1524	mean.
	1525
	1526	The semantics of the language is determined by the actions taken when a
	1527	grouping is recognized. The actions are the C code that appears inside
	1528	braces. @xref{Actions}.
	1529
	1530	You must specify these actions in C, but Bison provides the means for
	1531	passing semantic values between the rules. In each action, the
	1532	pseudo-variable @code{$$} stands for the semantic value for the grouping
	1533	that the rule is going to construct. Assigning a value to @code{$$} is the
	1534	main job of most actions. The semantic values of the components of the
	1535	rule are referred to as @code{$1}, @code{$2}, and so on.
	1536
	1537	@menu
	1538	* Rpcalc Input::
	1539	* Rpcalc Line::
	1540	* Rpcalc Expr::
	1541	@end menu
	1542
	1543	@node Rpcalc Input
	1544	@subsubsection Explanation of @code{input}
	1545
	1546	Consider the definition of @code{input}:
	1547
	1548	@example
	1549	input:
	1550	/* empty */
	1551	| input line
	1552	;
	1553	@end example
	1554
	1555	This definition reads as follows: ``A complete input is either an empty
	1556	string, or a complete input followed by an input line''. Notice that
	1557	``complete input'' is defined in terms of itself. This definition is said
	1558	to be @dfn{left recursive} since @code{input} always appears as the
	1559	leftmost symbol in the sequence. @xref{Recursion, ,Recursive Rules}.
	1560
	1561	The first alternative is empty because there are no symbols between the
	1562	colon and the first @samp{|}; this means that @code{input} can match an
	1563	empty string of input (no tokens). We write the rules this way because it
	1564	is legitimate to type @kbd{Ctrl-d} right after you start the calculator.
	1565	It's conventional to put an empty alternative first and write the comment
	1566	@samp{/* empty */} in it.
	1567
	1568	The second alternate rule (@code{input line}) handles all nontrivial input.
	1569	It means, ``After reading any number of lines, read one more line if
	1570	possible.'' The left recursion makes this rule into a loop. Since the
	1571	first alternative matches empty input, the loop can be executed zero or
	1572	more times.
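
The loop that this left-recursive rule describes can be pictured in
plain C. The following is only a sketch of the idea, not code that
Bison generates; the function name @code{count_lines} is made up for
this illustration. It handles zero or more newline-terminated lines
held in a string, one per iteration, much as @code{input} absorbs one
more @code{line} each time the second alternative applies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of rpcalc: count the
   newline-terminated lines in TEXT, one line per loop iteration.  */
size_t
count_lines (char const *text)
{
  size_t lines = 0;             /* Empty input: zero iterations.  */
  char const *end;

  while ((end = strchr (text, '\n')) != NULL)
    {
      ++lines;                  /* One more `line' has been read.  */
      text = end + 1;           /* Continue after the newline.  */
    }
  return lines;
}
```

An empty string yields zero, which corresponds to the empty first
alternative of @code{input}.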
	1573
	1574	The parser function @code{yyparse} continues to process input until a
	1575	grammatical error is seen or the lexical analyzer says there are no more
	1576	input tokens; we will arrange for the latter to happen at end-of-input.
	1577
	1578	@node Rpcalc Line
	1579	@subsubsection Explanation of @code{line}
	1580
	1581	Now consider the definition of @code{line}:
	1582
	1583	@example
	1584	line:
	1585	'\n'
	1586	| exp '\n' @{ printf ("%.10g\n", $1); @}
	1587	;
	1588	@end example
	1589
	1590	The first alternative is a token which is a newline character; this means
	1591	that rpcalc accepts a blank line (and ignores it, since there is no
	1592	action). The second alternative is an expression followed by a newline.
	1593	This is the alternative that makes rpcalc useful. The semantic value of
	1594	the @code{exp} grouping is the value of @code{$1} because the @code{exp} in
	1595	question is the first symbol in the alternative. The action prints this
	1596	value, which is the result of the computation the user asked for.
	1597
	1598	This action is unusual because it does not assign a value to @code{$$}. As
	1599	a consequence, the semantic value associated with the @code{line} is
	1600	uninitialized (its value will be unpredictable). This would be a bug if
	1601	that value were ever used, but we don't use it: once rpcalc has printed the
	1602	value of the user's input line, that value is no longer needed.
	1603
	1604	@node Rpcalc Expr
	1605	@subsubsection Explanation of @code{exp}
	1606
	1607	The @code{exp} grouping has several rules, one for each kind of expression.
	1608	The first rule handles the simplest expressions: those that are just numbers.
	1609	The second handles an addition-expression, which looks like two expressions
	1610	followed by a plus-sign. The third handles subtraction, and so on.
	1611
	1612	@example
	1613	exp:
	1614	NUM
	1615	| exp exp '+' @{ $$ = $1 + $2; @}
	1616	| exp exp '-' @{ $$ = $1 - $2; @}
	1617	@dots{}
	1618	;
	1619	@end example
	1620
	1621	We have used @samp{|} to join all the rules for @code{exp}, but we could
	1622	equally well have written them separately:
	1623
	1624	@example
	1625	exp: NUM ;
	1626	exp: exp exp '+' @{ $$ = $1 + $2; @};
	1627	exp: exp exp '-' @{ $$ = $1 - $2; @};
	1628	@dots{}
	1629	@end example
	1630
	1631	Most of the rules have actions that compute the value of the expression in
	1632	terms of the value of its parts. For example, in the rule for addition,
	1633	@code{$1} refers to the first component @code{exp} and @code{$2} refers to
	1634	the second one. The third component, @code{'+'}, has no meaningful
	1635	associated semantic value, but if it had one you could refer to it as
	1636	@code{$3}. When @code{yyparse} recognizes a sum expression using this
	1637	rule, the sum of the two subexpressions' values is produced as the value of
	1638	the entire expression. @xref{Actions}.
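
To see what these actions compute, it can help to compare them with a
hand-written evaluator. The sketch below is not how the Bison parser
works internally; it is a plain C function, with the made-up name
@code{rpn_eval} and an arbitrary stack limit, that evaluates one line
of RPN using an explicit value stack. Popping two values and pushing
their sum plays the role of the action for the addition rule.
Exponentiation is left out so that the sketch does not depend on the
math library.

```c
#include <ctype.h>
#include <stdlib.h>

#define RPN_STACK_MAX 64        /* Arbitrary limit for this sketch.  */

/* Hypothetical helper: evaluate one line of RPN held in LINE.
   Store the value in *RESULT and return 0, or return -1 on error.  */
int
rpn_eval (char const *line, double *result)
{
  double stack[RPN_STACK_MAX];
  int depth = 0;

  for (;;)
    {
      while (*line == ' ' || *line == '\t')
        ++line;
      if (*line == '\0' || *line == '\n')
        break;

      if (*line == '.' || isdigit ((unsigned char) *line))
        {
          /* Like the rule `exp: NUM': push the number's value.  */
          char *end;
          double value = strtod (line, &end);
          if (end == line || depth == RPN_STACK_MAX)
            return -1;
          stack[depth++] = value;
          line = end;
        }
      else if (*line == 'n')
        {
          /* Unary minus takes a single operand.  */
          if (depth < 1)
            return -1;
          stack[depth - 1] = -stack[depth - 1];
          ++line;
        }
      else
        {
          /* Binary operator: A corresponds to $1 and B to $2.  */
          double a, b;
          if (depth < 2)
            return -1;
          b = stack[--depth];
          a = stack[--depth];
          switch (*line++)
            {
            case '+': a += b; break;
            case '-': a -= b; break;
            case '*': a *= b; break;
            case '/': a /= b; break;
            default: return -1;
            }
          stack[depth++] = a;
        }
    }

  /* A complete expression leaves exactly one value on the stack.  */
  if (depth != 1)
    return -1;
  *result = stack[0];
  return 0;
}
```

Unlike the grammar, this function has to check the stack depth itself;
in the Bison version, the rules guarantee that each operator finds its
operands.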
	1639
	1640	You don't have to give an action for every rule. When a rule has no
	1641	action, Bison by default copies the value of @code{$1} into @code{$$}.
	1642	This is what happens in the first rule (the one that uses @code{NUM}).
	1643
	1644	The formatting shown here is the recommended convention, but Bison does
	1645	not require it. You can add or change white space as much as you wish.
	1646	For example, this:
	1647
	1648	@example
	1649	exp: NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{} ;
	1650	@end example
	1651
	1652	@noindent
	1653	means the same thing as this:
	1654
	1655	@example
	1656	exp:
	1657	NUM
	1658	| exp exp '+' @{ $$ = $1 + $2; @}
	1659	| @dots{}
	1660	;
	1661	@end example
	1662
	1663	@noindent
	1664	The latter, however, is much more readable.
	1665
	1666	@node Rpcalc Lexer
	1667	@subsection The @code{rpcalc} Lexical Analyzer
	1668	@cindex writing a lexical analyzer
	1669	@cindex lexical analyzer, writing
	1670
	1671	The lexical analyzer's job is low-level parsing: converting characters
	1672	or sequences of characters into tokens. The Bison parser gets its
	1673	tokens by calling the lexical analyzer. @xref{Lexical, ,The Lexical
	1674	Analyzer Function @code{yylex}}.
	1675
	1676	Only a simple lexical analyzer is needed for the RPN
	1677	calculator. This
	1678	lexical analyzer skips blanks and tabs, then reads in numbers as
	1679	@code{double} and returns them as @code{NUM} tokens. Any other character
	1680	that isn't part of a number is a separate token. Note that the token-code
	1681	for such a single-character token is the character itself.
	1682
	1683	The return value of the lexical analyzer function is a numeric code which
	1684	represents a token type. The same text used in Bison rules to stand for
	1685	this token type is also a C expression for the numeric code for the type.
	1686	This works in two ways. If the token type is a character literal, then its
	1687	numeric code is that of the character; you can use the same
	1688	character literal in the lexical analyzer to express the number. If the
	1689	token type is an identifier, that identifier is defined by Bison as a C
	1690	macro whose definition is the appropriate number. In this example,
	1691	therefore, @code{NUM} becomes a macro for @code{yylex} to use.
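
The two kinds of token codes can be illustrated without running Bison.
In the sketch below, the value 258 for @code{NUM} is an assumption made
for the illustration; Bison chooses the actual number and writes the
definition into the parser implementation file. The point is only that
a named token type gets a code that cannot collide with a character
code, while a character literal is its own code. The function name
@code{classify} is likewise made up.

```c
#include <limits.h>

/* Assumed value, for illustration only; Bison picks the real one.  */
enum { NUM = 258 };

/* Hypothetical helper: describe a token code such as yylex returns.
   The result is 'E' for end-of-input, 'N' for NUM, 'C' for a
   character literal token, and '?' for anything else.  */
char
classify (int token)
{
  if (token <= 0)
    return 'E';
  if (token == NUM)
    return 'N';
  if (token <= UCHAR_MAX)
    return 'C';
  return '?';
}
```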
	1692
	1693	The semantic value of the token (if it has one) is stored into the
	1694	global variable @code{yylval}, which is where the Bison parser will look
	1695	for it. (The C data type of @code{yylval} is @code{YYSTYPE}, which was
	1696	defined at the beginning of the grammar; @pxref{Rpcalc Declarations,
	1697	,Declarations for @code{rpcalc}}.)
	1698
	1699	A token type code of zero is returned if the end-of-input is encountered.
	1700	(Bison recognizes any nonpositive value as indicating end-of-input.)
	1701
	1702	Here is the code for the lexical analyzer:
	1703
	1704	@example
	1705	@group
	1706	/* The lexical analyzer returns a double floating point
	1707	number on the stack and the token NUM, or the numeric code
	1708	of the character read if not a number. It skips all blanks
	1709	and tabs, and returns 0 for end-of-input. */
	1710
	1711	#include <ctype.h>
	1712	@end group
	1713
	1714	@group
	1715	int
	1716	yylex (void)
	1717	@{
	1718	int c;
	1719
	1720	/* Skip white space. */
	1721	while ((c = getchar ()) == ' ' || c == '\t')
	1722	continue;
	1723	@end group
	1724	@group
	1725	/* Process numbers. */
	1726	if (c == '.' || isdigit (c))
	1727	@{
	1728	ungetc (c, stdin);
	1729	scanf ("%lf", &yylval);
	1730	return NUM;
	1731	@}
	1732	@end group
	1733	@group
	1734	/* Return end-of-input. */
	1735	if (c == EOF)
	1736	return 0;
	1737	/* Return a single char. */
	1738	return c;
	1739	@}
	1740	@end group
	1741	@end example
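
For experimenting with this logic outside of a parser, the same steps
can be written against a string instead of @code{stdin}. This is a
sketch under assumptions, not part of rpcalc: the names
@code{lex_from} and @code{NUM_TOKEN} (with an arbitrary value standing
in for the Bison-defined @code{NUM}) and the explicit @code{value}
parameter (standing in for @code{yylval}) are all invented here. It
uses @code{strtod} in place of @code{scanf}.

```c
#include <ctype.h>
#include <stdlib.h>

/* Stand-in for the Bison-defined NUM; the value is an assumption.  */
enum { NUM_TOKEN = 258 };

/* Hypothetical string-based variant of yylex.  Read one token from
   *INPUT, advance *INPUT past it, and return its code.  For a number,
   store its value in *VALUE (the role yylval plays in rpcalc).  */
int
lex_from (char const **input, double *value)
{
  char const *p = *input;
  int c;

  /* Skip white space.  */
  while (*p == ' ' || *p == '\t')
    ++p;

  /* Process numbers.  */
  if (*p == '.' || isdigit ((unsigned char) *p))
    {
      char *end;
      *value = strtod (p, &end);
      if (end != p)
        {
          *input = end;
          return NUM_TOKEN;
        }
    }

  /* Return end-of-input.  */
  if (*p == '\0')
    {
      *input = p;
      return 0;
    }

  /* Return a single char.  */
  c = (unsigned char) *p;
  *input = p + 1;
  return c;
}
```

Passing the input position and the value explicitly makes the function
easy to call repeatedly on a fixed string, at the cost of no longer
matching the interface that @code{yyparse} expects.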
	1742
	1743	@node Rpcalc Main
	1744	@subsection The Controlling Function
	1745	@cindex controlling function
	1746	@cindex main function in simple example
	1747
	1748	In keeping with the spirit of this example, the controlling function is
	1749	kept to the bare minimum. The only requirement is that it call
	1750	@code{yyparse} to start the process of parsing.
	1751
	1752	@example
	1753	@group
	1754	int
	1755	main (void)
	1756	@{
	1757	return yyparse ();
	1758	@}
	1759	@end group
	1760	@end example
	1761
	1762	@node Rpcalc Error
	1763	@subsection The Error Reporting Routine
	1764	@cindex error reporting routine
	1765
	1766	When @code{yyparse} detects a syntax error, it calls the error reporting
	1767	function @code{yyerror} to print an error message (usually but not
	1768	always @code{"syntax error"}). It is up to the programmer to supply
	1769	@code{yyerror} (@pxref{Interface, ,Parser C-Language Interface}), so
	1770	here is the definition we will use:
	1771
	1772	@example
	1773	@group
	1774	#include <stdio.h>
	1775	@end group
	1776
	1777	@group
	1778	/* Called by yyparse on error. */
	1779	void
	1780	yyerror (char const *s)
	1781	@{
	1782	fprintf (stderr, "%s\n", s);
	1783	@}
	1784	@end group
	1785	@end example
	1786
	1787	After @code{yyerror} returns, the Bison parser may recover from the error
	1788	and continue parsing if the grammar contains a suitable error rule
	1789	(@pxref{Error Recovery}). Otherwise, @code{yyparse} returns nonzero. We
	1790	have not written any error rules in this example, so any invalid input will
	1791	cause the calculator program to exit. This is not clean behavior for a
	1792	real calculator, but it is adequate for the first example.
	1793
	1794	@node Rpcalc Generate
	1795	@subsection Running Bison to Make the Parser
	1796	@cindex running Bison (introduction)
	1797
	1798	Before running Bison to produce a parser, we need to decide how to
	1799	arrange all the source code in one or more source files. For such a
	1800	simple example, the easiest thing is to put everything in one file,
	1801	the grammar file. The definitions of @code{yylex}, @code{yyerror} and
	1802	@code{main} go at the end, in the epilogue of the grammar file
	1803	(@pxref{Grammar Layout, ,The Overall Layout of a Bison Grammar}).
	1804
	1805	For a large project, you would probably have several source files, and use
	1806	@code{make} to arrange to recompile them.
	1807
	1808	With all the source in the grammar file, you use the following command
	1809	to convert it into a parser implementation file:
	1810
	1811	@example
	1812	bison @var{file}.y
	1813	@end example
	1814
	1815	@noindent
	1816	In this example, the grammar file is called @file{rpcalc.y} (for
	1817	``Reverse Polish @sc{calc}ulator''). Bison produces a parser
	1818	implementation file named @file{@var{file}.tab.c}, removing the
	1819	@samp{.y} from the grammar file name. The parser implementation file
	1820	contains the source code for @code{yyparse}. The additional functions
	1821	in the grammar file (@code{yylex}, @code{yyerror} and @code{main}) are
	1822	copied verbatim to the parser implementation file.
	1823
	1824	@node Rpcalc Compile
	1825	@subsection Compiling the Parser Implementation File
	1826	@cindex compiling the parser
	1827
	1828	Here is how to compile and run the parser implementation file:
	1829
	1830	@example
	1831	@group
	1832	# @r{List files in current directory.}
	1833	$ @kbd{ls}
	1834	rpcalc.tab.c rpcalc.y
	1835	@end group
	1836
	1837	@group
	1838	# @r{Compile the Bison parser.}
	1839	# @r{@samp{-lm} tells compiler to search math library for @code{pow}.}
	1840	$ @kbd{cc -o rpcalc rpcalc.tab.c -lm}
	1841	@end group
	1842
	1843	@group
	1844	# @r{List files again.}
	1845	$ @kbd{ls}
	1846	rpcalc rpcalc.tab.c rpcalc.y
	1847	@end group
	1848	@end example
	1849
	1850	The file @file{rpcalc} now contains the executable code. Here is an
	1851	example session using @code{rpcalc}.
	1852
	1853	@example
	1854	$ @kbd{rpcalc}
	1855	@kbd{4 9 +}
	1856	13
	1857	@kbd{3 7 + 3 4 5 *+-}
	1858	-13
	1859	@kbd{3 7 + 3 4 5 * + - n} @r{Note the unary minus, @samp{n}}
	1860	13
	1861	@kbd{5 6 / 4 n +}
	1862	-3.166666667
	1863	@kbd{3 4 ^} @r{Exponentiation}
	1864	81
	1865	@kbd{^D} @r{End-of-file indicator}
	1866	$
	1867	@end example
	1868
	1869	@node Infix Calc
	1870	@section Infix Notation Calculator: @code{calc}
	1871	@cindex infix notation calculator
	1872	@cindex @code{calc}
	1873	@cindex calculator, infix notation
	1874
	1875	We now modify rpcalc to handle infix operators instead of postfix. Infix
	1876	notation involves the concept of operator precedence and the need for
	1877	parentheses nested to arbitrary depth. Here is the Bison code for
	1878	@file{calc.y}, an infix desk-top calculator.
	1879
	1880	@example
	1881	/* Infix notation calculator. */
	1882
	1883	@group
	1884	%@{
	1885	#define YYSTYPE double
	1886	#include <math.h>
	1887	#include <stdio.h>
	1888	int yylex (void);
	1889	void yyerror (char const *);
	1890	%@}
	1891	@end group
	1892
	1893	@group
	1894	/* Bison declarations. */
	1895	%token NUM
	1896	%left '-' '+'
	1897	%left '*' '/'
	1898	%left NEG /* negation--unary minus */
	1899	%right '^' /* exponentiation */
	1900	@end group
	1901
	1902	%% /* The grammar follows. */
	1903	@group
	1904	input:
	1905	/* empty */
	1906	| input line
	1907	;
	1908	@end group
	1909
	1910	@group
	1911	line:
	1912	'\n'
	1913	| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1914	;
	1915	@end group
	1916
	1917	@group
	1918	exp:
	1919	NUM @{ $$ = $1; @}
	1920	| exp '+' exp @{ $$ = $1 + $3; @}
	1921	| exp '-' exp @{ $$ = $1 - $3; @}
	1922	| exp '*' exp @{ $$ = $1 * $3; @}
	1923	| exp '/' exp @{ $$ = $1 / $3; @}
	1924	| '-' exp %prec NEG @{ $$ = -$2; @}
	1925	| exp '^' exp @{ $$ = pow ($1, $3); @}
	1926	| '(' exp ')' @{ $$ = $2; @}
	1927	;
	1928	@end group
	1929	%%
	1930	@end example
	1931
	1932	@noindent
	1933	The functions @code{yylex}, @code{yyerror} and @code{main} can be the
	1934	same as before.
	1935
	1936	There are two important new features shown in this code.
	1937
	1938	In the second section (Bison declarations), @code{%left} declares token
	1939	types and says they are left-associative operators. The declarations
	1940	@code{%left} and @code{%right} (right associativity) take the place of
	1941	@code{%token} which is used to declare a token type name without
	1942	associativity. (These tokens are single-character literals, which
	1943	ordinarily don't need to be declared. We declare them here to specify
	1944	the associativity.)
	1945
	1946	Operator precedence is determined by the line ordering of the
	1947	declarations; the higher the line number of the declaration (lower on
	1948	the page or screen), the higher the precedence. Hence, exponentiation
	1949	has the highest precedence, unary minus (@code{NEG}) is next, followed
	1950	by @samp{*} and @samp{/}, and so on. @xref{Precedence, ,Operator
	1951	Precedence}.
	1952
	1953	The other important new feature is the @code{%prec} in the grammar
	1954	section for the unary minus operator. The @code{%prec} simply instructs
	1955	Bison that the rule @samp{| '-' exp} has the same precedence as
	1956	@code{NEG}---in this case the next-to-highest. @xref{Contextual
	1957	Precedence, ,Context-Dependent Precedence}.
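
The effect of these precedence and associativity declarations can be
tried out with a small hand-written evaluator. The sketch below uses a
different technique from Bison (precedence climbing in a recursive
descent parser, not a generated table-driven parser) and only mirrors
the precedence levels declared above. All names in it are made up, it
assumes well-formed input and reports no errors, and it replaces
@code{pow} with an integer-exponent helper so that it needs no math
library.

```c
#include <stdlib.h>

/* Precedence levels, lowest first, in the order of the declarations.  */
enum { PREC_ADD = 1, PREC_MUL, PREC_NEG, PREC_POW };

static char const *cursor;      /* Next unread character.  */
static double parse_exp (int min_prec);

/* Skip blanks and return the next character without consuming it.  */
static int
peek (void)
{
  while (*cursor == ' ' || *cursor == '\t')
    ++cursor;
  return (unsigned char) *cursor;
}

/* Stand-in for pow, limited to integer exponents.  */
static double
int_pow (double base, long n)
{
  double r = 1;
  long i;
  for (i = 0; i < (n < 0 ? -n : n); ++i)
    r *= base;
  return n < 0 ? 1 / r : r;
}

/* Return the level of binary operator OP, or 0 if it is not one.  */
static int
precedence (int op)
{
  return (op == '+' || op == '-' ? PREC_ADD
          : op == '*' || op == '/' ? PREC_MUL
          : op == '^' ? PREC_POW
          : 0);
}

/* Parse NUM, '(' exp ')', or '-' exp with the precedence of NEG.  */
static double
parse_operand (void)
{
  double v;
  char *end;
  if (peek () == '-')
    {
      ++cursor;
      return -parse_exp (PREC_NEG);
    }
  if (peek () == '(')
    {
      ++cursor;
      v = parse_exp (PREC_ADD);
      if (peek () == ')')
        ++cursor;
      return v;
    }
  v = strtod (cursor, &end);
  cursor = end;
  return v;
}

/* Parse an expression whose operators have precedence >= MIN_PREC.  */
static double
parse_exp (int min_prec)
{
  double lhs = parse_operand ();
  for (;;)
    {
      int op = peek ();
      int prec = precedence (op);
      double rhs;
      if (prec < min_prec)
        return lhs;
      ++cursor;
      /* A left-associative operator needs a tighter-binding right
         operand; the right-associative '^' accepts its own level.  */
      rhs = parse_exp (op == '^' ? prec : prec + 1);
      switch (op)
        {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;
        default: lhs = int_pow (lhs, (long) rhs); break;
        }
    }
}

/* Hypothetical entry point: evaluate the well-formed expression TEXT.  */
double
calc_eval (char const *text)
{
  cursor = text;
  return parse_exp (PREC_ADD);
}
```

With these levels, @samp{-2 ^ 2} evaluates to @minus{}4 because
exponentiation binds more tightly than unary minus, and
@samp{2 ^ 3 ^ 2} evaluates to 512 because @samp{^} is
right-associative.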
	1958
	1959	Here is a sample run of @file{calc.y}:
	1960
	1961	@need 500
	1962	@example
	1963	$ @kbd{calc}
	1964	@kbd{4 + 4.5 - (34/(8*3+-3))}
	1965	6.880952381
	1966	@kbd{-56 + 2}
	1967	-54
	1968	@kbd{3 ^ 2}
	1969	9
	1970	@end example
	1971
	1972	@node Simple Error Recovery
	1973	@section Simple Error Recovery
	1974	@cindex error recovery, simple
	1975
	1976	Up to this point, this manual has not addressed the issue of @dfn{error
	1977	recovery}---how to continue parsing after the parser detects a syntax
	1978	error. All we have handled is error reporting with @code{yyerror}.
	1979	Recall that by default @code{yyparse} returns after calling
	1980	@code{yyerror}. This means that an erroneous input line causes the
	1981	calculator program to exit. Now we show how to rectify this deficiency.
	1982
	1983	The Bison language itself includes the reserved word @code{error}, which
	1984	may be included in the grammar rules. In the example below it has
	1985	been added to one of the alternatives for @code{line}:
	1986
	1987	@example
	1988	@group
	1989	line:
	1990	'\n'
	1991	| exp '\n' @{ printf ("\t%.10g\n", $1); @}
	1992	| error '\n' @{ yyerrok; @}
	1993	;
	1994	@end group
	1995	@end example
	1996
	1997	This addition to the grammar allows for simple error recovery in the
	1998	event of a syntax error. If an expression that cannot be evaluated is
	1999	read, the error will be recognized by the third rule for @code{line},
	2000	and parsing will continue. (The @code{yyerror} function is still called
	2001	upon to print its message as well.) The action executes the statement
	2002	@code{yyerrok}, a macro defined automatically by Bison; its meaning is
	2003	that error recovery is complete (@pxref{Error Recovery}). Note the
	2004	difference between @code{yyerrok} and @code{yyerror}; neither one is a
	2005	misprint.
	2006
	2007	This form of error recovery deals with syntax errors. There are other
	2008	kinds of errors; for example, division by zero, which raises an exception
	2009	signal that is normally fatal. A real calculator program must handle this
	2010	signal and use @code{longjmp} to return to @code{main} and resume parsing
	2011	input lines; it would also have to discard the rest of the current line of
	2012	input. We won't discuss this issue further because it is not specific to
	2013	Bison programs.
	2014
	2015	@node Location Tracking Calc
	2016	@section Location Tracking Calculator: @code{ltcalc}
	2017	@cindex location tracking calculator
	2018	@cindex @code{ltcalc}
	2019	@cindex calculator, location tracking
	2020
	2021	This example extends the infix notation calculator with location
	2022	tracking. This feature will be used to improve the error messages. For
	2023	the sake of clarity, this example is a simple integer calculator, since
	2024	most of the work needed to use locations will be done in the lexical
	2025	analyzer.
	2026
	2027	@menu
	2028	* Ltcalc Declarations:: Bison and C declarations for ltcalc.
	2029	* Ltcalc Rules:: Grammar rules for ltcalc, with explanations.
	2030	* Ltcalc Lexer:: The lexical analyzer.
	2031	@end menu
	2032
	2033	@node Ltcalc Declarations
	2034	@subsection Declarations for @code{ltcalc}
	2035
	2036	The C and Bison declarations for the location tracking calculator are
	2037	nearly the same as those for the infix notation calculator; the main difference is that @code{YYSTYPE} is now @code{int}.
	2038
	2039	@example
	2040	/* Location tracking calculator. */
	2041
	2042	%@{
	2043	#define YYSTYPE int
	2044	#include <math.h>
	2045	int yylex (void);
	2046	void yyerror (char const *);
	2047	%@}
	2048
	2049	/* Bison declarations. */
	2050	%token NUM
	2051
	2052	%left '-' '+'
	2053	%left '*' '/'
	2054	%left NEG
	2055	%right '^'
	2056
	2057	%% /* The grammar follows. */
	2058	@end example
	2059
	2060	@noindent
	2061	Note there are no declarations specific to locations. Defining a data
	2062	type for storing locations is not needed: we will use the type provided
	2063	by default (@pxref{Location Type, ,Data Types of Locations}), which is a
	2064	four member structure with the following integer fields:
	2065	@code{first_line}, @code{first_column}, @code{last_line} and
	2066	@code{last_column}. By convention, and in accordance with the GNU
	2067	Coding Standards and common practice, the line and column count both
	2068	start at 1.
	2069
	2070	@node Ltcalc Rules
	2071	@subsection Grammar Rules for @code{ltcalc}
	2072
	2073	Whether or not you handle locations has no effect on the syntax of your
	2074	language. Therefore, grammar rules for this example will be very close
	2075	to those of the previous example: we will only modify them to benefit
	2076	from the new information.
	2077
	2078	Here, we will use locations to report divisions by zero and to locate the
	2079	offending expressions or subexpressions.
	2080
	2081	@example
	2082	@group
	2083	input:
	2084	/* empty */
	2085	| input line
	2086	;
	2087	@end group
	2088
	2089	@group
	2090	line:
	2091	'\n'
	2092	| exp '\n' @{ printf ("%d\n", $1); @}
	2093	;
	2094	@end group
	2095
	2096	@group
	2097	exp:
	2098	NUM @{ $$ = $1; @}
	2099	| exp '+' exp @{ $$ = $1 + $3; @}
	2100	| exp '-' exp @{ $$ = $1 - $3; @}
	2101	| exp '*' exp @{ $$ = $1 * $3; @}
	2102	@end group
	2103	@group
	2104	| exp '/' exp
	2105	@{
	2106	if ($3)
	2107	$$ = $1 / $3;
	2108	else
	2109	@{
	2110	$$ = 1;
	2111	fprintf (stderr, "%d.%d-%d.%d: division by zero\n",
	2112	@@3.first_line, @@3.first_column,
	2113	@@3.last_line, @@3.last_column);
	2114	@}
	2115	@}
	2116	@end group
	2117	@group
	2118	| '-' exp %prec NEG @{ $$ = -$2; @}
	2119	| exp '^' exp @{ $$ = pow ($1, $3); @}
	2120	| '(' exp ')' @{ $$ = $2; @}
	2121	@end group
	2122	@end example
	2123
	2124	This code shows how to reach locations inside of semantic actions, by
	2125	using the pseudo-variables @code{@@@var{n}} for rule components, and the
	2126	pseudo-variable @code{@@$} for groupings.
	2127
	2128	We don't need to assign a value to @code{@@$}: the output parser does it
	2129	automatically. By default, before executing the C code of each action,
	2130	@code{@@$} is set to range from the beginning of @code{@@1} to the end
	2131	of @code{@@@var{n}}, for a rule with @var{n} components. This behavior
	2132	can be redefined (@pxref{Location Default Action, , Default Action for
	2133	Locations}), and for very specific rules, @code{@@$} can be computed by
	2134	hand.
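
The default computation described above amounts to taking the start of
the first component and the end of the last one. The following sketch
spells that out in C. The type name @code{location} and the function
name @code{span} are made up for the illustration; the structure only
mirrors the four fields of the default location type, and the function
is not the code that Bison actually emits.

```c
/* Mirror of the four fields of the default location type; the name
   `location' is made up so as not to clash with YYLTYPE.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} location;

/* Hypothetical helper: the location of a grouping whose first
   component is at FIRST and whose last component is at LAST.  */
location
span (location first, location last)
{
  location result;
  result.first_line = first.first_line;
  result.first_column = first.first_column;
  result.last_line = last.last_line;
  result.last_column = last.last_column;
  return result;
}
```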
	2135
	2136	@node Ltcalc Lexer
	2137	@subsection The @code{ltcalc} Lexical Analyzer
	2138
	2139	Until now, we relied on Bison's defaults to enable location
	2140	tracking. The next step is to rewrite the lexical analyzer, and make it
	2141	able to feed the parser with the token locations, as it already does for
	2142	semantic values.
	2143
	2144	To this end, we must take into account every single character of the
	2145	input text, to keep the computed locations from being fuzzy or wrong:
	2146
	2147	@example
	2148	@group
	2149	int
	2150	yylex (void)
	2151	@{
	2152	int c;
	2153	@end group
	2154
	2155	@group
	2156	/* Skip white space. */
	2157	while ((c = getchar ()) == ' ' || c == '\t')
	2158	++yylloc.last_column;
	2159	@end group
	2160
	2161	@group
	2162	/* Step. */
	2163	yylloc.first_line = yylloc.last_line;
	2164	yylloc.first_column = yylloc.last_column;
	2165	@end group
	2166
	2167	@group
	2168	/* Process numbers. */
	2169	if (isdigit (c))
	2170	@{
	2171	yylval = c - '0';
	2172	++yylloc.last_column;
	2173	while (isdigit (c = getchar ()))
	2174	@{
	2175	++yylloc.last_column;
	2176	yylval = yylval * 10 + c - '0';
	2177	@}
	2178	ungetc (c, stdin);
	2179	return NUM;
	2180	@}
	2181	@end group
	2182
	2183	/* Return end-of-input. */
	2184	if (c == EOF)
	2185	return 0;
	2186
	2187	@group
	2188	/* Return a single char, and update location. */
	2189	if (c == '\n')
	2190	@{
	2191	++yylloc.last_line;
	2192	yylloc.last_column = 0;
	2193	@}
	2194	else
	2195	++yylloc.last_column;
	2196	return c;
	2197	@}
	2198	@end group
	2199	@end example
	2200
	2201	Basically, the lexical analyzer performs the same processing as before:
	2202	it skips blanks and tabs, and reads numbers or single-character tokens.
	2203	In addition, it updates @code{yylloc}, the global variable (of type
	2204	@code{YYLTYPE}) containing the token's location.
	2205
	2206	Now, each time this function returns a token, the parser has its number
	2207	as well as its semantic value, and its location in the text. The last
	2208	needed change is to initialize @code{yylloc}, for example in the
	2209	controlling function:
	2210
	2211	@example
	2212	@group
	2213	int
	2214	main (void)
	2215	@{
	2216	yylloc.first_line = yylloc.last_line = 1;
	2217	yylloc.first_column = yylloc.last_column = 0;
	2218	return yyparse ();
	2219	@}
	2220	@end group
	2221	@end example
	2222
	2223	Remember that computing locations is not a matter of syntax. Every
	2224	character must be associated with a location update, whether it is in
	2225	valid input, in comments, in literal strings, and so on.
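
One way to make sure no character is forgotten is to route every
character through a single update function. The sketch below is an
assumption about how one might organize this, not something the
example requires: the type name @code{location} and the helper names
@code{location_advance} and @code{location_step} are made up, and the
column convention (reset to 0 after a newline) simply copies the lexer
shown above.

```c
/* Mirror of the four fields of the default location type.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} location;

/* Hypothetical helper: account for character C at the end of *LOC,
   following the same convention as the ltcalc lexer.  */
void
location_advance (location *loc, int c)
{
  if (c == '\n')
    {
      ++loc->last_line;
      loc->last_column = 0;
    }
  else
    ++loc->last_column;
}

/* Hypothetical helper: start a new token where the previous one
   ended.  */
void
location_step (location *loc)
{
  loc->first_line = loc->last_line;
  loc->first_column = loc->last_column;
}
```

A lexer that skips a comment or reads a string literal would call
@code{location_advance} for each character it consumes, including the
ones that never reach the parser.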
	2226
	2227	@node Multi-function Calc
	2228	@section Multi-Function Calculator: @code{mfcalc}
	2229	@cindex multi-function calculator
	2230	@cindex @code{mfcalc}
	2231	@cindex calculator, multi-function
	2232
	2233	Now that the basics of Bison have been discussed, it is time to move on to
	2234	a more advanced problem. The above calculators provided only five
	2235	functions, @samp{+}, @samp{-}, @samp{*}, @samp{/} and @samp{^}. It would
	2236	be nice to have a calculator that provides other mathematical functions such
	2237	as @code{sin}, @code{cos}, etc.
	2238
	2239	It is easy to add new operators to the infix calculator as long as they are
	2240	only single-character literals. The lexical analyzer @code{yylex} passes
	2241	back all nonnumeric characters as tokens, so new grammar rules suffice for
	2242	adding a new operator. But we want something more flexible: built-in
	2243	functions whose syntax has this form:
	2244
	2245	@example
	2246	@var{function_name} (@var{argument})
	2247	@end example
	2248
	2249	@noindent
	2250	At the same time, we will add memory to the calculator, by allowing you
	2251	to create named variables, store values in them, and use them later.
	2252	Here is a sample session with the multi-function calculator:
	2253
	2254	@example
	2255	$ @kbd{mfcalc}
	2256	@kbd{pi = 3.141592653589}
	2257	3.1415926536
	2258	@kbd{sin(pi)}
	2259	0.0000000000
	2260	@kbd{alpha = beta1 = 2.3}
	2261	2.3000000000
	2262	@kbd{alpha}
	2263	2.3000000000
	2264	@kbd{ln(alpha)}
	2265	0.8329091229
	2266	@kbd{exp(ln(beta1))}
	2267	2.3000000000
	2268	$
	2269	@end example
	2270
	2271	Note that multiple assignment and nested function calls are permitted.
	2272
	2273	@menu
	2274	* Mfcalc Declarations:: Bison declarations for multi-function calculator.
	2275	* Mfcalc Rules:: Grammar rules for the calculator.
	2276	* Mfcalc Symbol Table:: Symbol table management subroutines.
	2277	@end menu
	2278
	2279	@node Mfcalc Declarations
	2280	@subsection Declarations for @code{mfcalc}
	2281
	2282	Here are the C and Bison declarations for the multi-function calculator.
	2283
	2284	@comment file: mfcalc.y
	2285	@example
	2286	@group
	2287	%@{
	2288	#include <math.h> /* For math functions, cos(), sin(), etc. */
	2289	#include "calc.h" /* Contains definition of `symrec'. */
	2290	int yylex (void);
	2291	void yyerror (char const *);
	2292	%@}
	2293	@end group
	2294	@group
	2295	%union @{
	2296	double val; /* For returning numbers. */
	2297	symrec *tptr; /* For returning symbol-table pointers. */
	2298	@}
	2299	@end group
	2300	%token <val> NUM /* Simple double precision number. */
	2301	%token <tptr> VAR FNCT /* Variable and Function. */
	2302	%type <val> exp
	2303
	2304	@group
	2305	%right '='
	2306	%left '-' '+'
	2307	%left '*' '/'
	2308	%left NEG /* negation--unary minus */
	2309	%right '^' /* exponentiation */
	2310	@end group
	2311	%% /* The grammar follows. */
	2312	@end example
	2313
	2314	The above grammar introduces only two new features of the Bison language.
	2315	These features allow semantic values to have various data types
	2316	(@pxref{Multiple Types, ,More Than One Value Type}).
	2317
	2318	The @code{%union} declaration specifies the entire list of possible types;
	2319	this is instead of defining @code{YYSTYPE}. The allowable types are now
	2320	double-floats (for @code{exp} and @code{NUM}) and pointers to entries in
	2321	the symbol table. @xref{Union Decl, ,The Collection of Value Types}.
	2322
	2323	Since values can now have various types, it is necessary to associate a
	2324	type with each grammar symbol whose semantic value is used. These symbols
	2325	are @code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their
	2326	declarations are augmented with information about their data type (placed
	2327	between angle brackets).
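
In C terms, the type between angle brackets names the union member
that holds the symbol's value. The sketch below illustrates this with
made-up names (@code{symrec_sketch}, @code{value_sketch},
@code{variable_value}); the record is a cut-down stand-in for
@code{symrec}, and the union only approximates the type that Bison
derives from the @code{%union} declaration.

```c
/* Cut-down stand-in for the symrec record of calc.h.  */
typedef struct
{
  char const *name;
  double var;
} symrec_sketch;

/* Roughly the C type that the %union declaration describes.  */
typedef union
{
  double val;                   /* Selected by <val>: NUM and exp.  */
  symrec_sketch *tptr;          /* Selected by <tptr>: VAR and FNCT.  */
} value_sketch;

/* Hypothetical helper: read a variable's value through the member
   that the <tptr> tag selects.  */
double
variable_value (value_sketch v)
{
  return v.tptr->var;
}
```

Because the members share storage, reading a member other than the one
last stored is a bug; the type tags in the declarations let Bison pick
the right member for each symbol.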
	2328
	2329	The Bison construct @code{%type} is used for declaring nonterminal
	2330	symbols, just as @code{%token} is used for declaring token types. We
	2331	have not used @code{%type} before because nonterminal symbols are
	2332	normally declared implicitly by the rules that define them. But
	2333	@code{exp} must be declared explicitly so we can specify its value type.
	2334	@xref{Type Decl, ,Nonterminal Symbols}.
	2335
	2336	@node Mfcalc Rules
	2337	@subsection Grammar Rules for @code{mfcalc}
	2338
	2339	Here are the grammar rules for the multi-function calculator.
	2340	Most of them are copied directly from @code{calc}; three rules,
	2341	those which mention @code{VAR} or @code{FNCT}, are new.
	2342
	2343	@comment file: mfcalc.y
	2344	@example
	2345	@group
	2346	input:
	2347	/* empty */
	2348	| input line
	2349	;
	2350	@end group
	2351
	2352	@group
	2353	line:
	2354	'\n'
	2355	| exp '\n' @{ printf ("%.10g\n", $1); @}
	2356	| error '\n' @{ yyerrok; @}
	2357	;
	2358	@end group
	2359
	2360	@group
	2361	exp:
	2362	NUM @{ $$ = $1; @}
	2363	| VAR @{ $$ = $1->value.var; @}
	2364	| VAR '=' exp @{ $$ = $3; $1->value.var = $3; @}
	2365	| FNCT '(' exp ')' @{ $$ = (*($1->value.fnctptr))($3); @}
	2366	| exp '+' exp @{ $$ = $1 + $3; @}
	2367	| exp '-' exp @{ $$ = $1 - $3; @}
	2368	| exp '*' exp @{ $$ = $1 * $3; @}
	2369	| exp '/' exp @{ $$ = $1 / $3; @}
	2370	| '-' exp %prec NEG @{ $$ = -$2; @}
	2371	| exp '^' exp @{ $$ = pow ($1, $3); @}
	2372	| '(' exp ')' @{ $$ = $2; @}
	2373	;
	2374	@end group
	2375	/* End of grammar. */
	2376	%%
	2377	@end example
	2378
	2379	@node Mfcalc Symbol Table
	2380	@subsection The @code{mfcalc} Symbol Table
	2381	@cindex symbol table example
	2382
	2383	The multi-function calculator requires a symbol table to keep track of the
	2384	names and meanings of variables and functions. This doesn't affect the
	2385	grammar rules (except for the actions) or the Bison declarations, but it
	2386	requires some additional C functions for support.
	2387
	2388	The symbol table itself consists of a linked list of records. Its
	2389	definition, which is kept in the header @file{calc.h}, is as follows. It
	2390	provides for either functions or variables to be placed in the table.
	2391
	2392	@comment file: calc.h
	2393	@example
	2394	@group
	2395	/* Function type. */
	2396	typedef double (*func_t) (double);
	2397	@end group
	2398
	2399	@group
	2400	/* Data type for links in the chain of symbols. */
	2401	struct symrec
	2402	@{
	2403	char *name; /* name of symbol */
	2404	int type; /* type of symbol: either VAR or FNCT */
	2405	union
	2406	@{
	2407	double var; /* value of a VAR */
	2408	func_t fnctptr; /* value of a FNCT */
	2409	@} value;
	2410	struct symrec *next; /* link field */
	2411	@};
	2412	@end group
	2413
	2414	@group
	2415	typedef struct symrec symrec;
	2416
	2417	/* The symbol table: a chain of `struct symrec'. */
	2418	extern symrec *sym_table;
	2419
	2420	symrec *putsym (char const *, int);
	2421	symrec *getsym (char const *);
	2422	@end group
	2423	@end example
	2424
	2425	The new version of @code{main} includes a call to @code{init_table}, a
	2426	function that initializes the symbol table. Here are @code{main} and
	2427	@code{init_table}, along with @code{yyerror} and the list of functions to install:
	2428
	2429	@example
	2430	#include <stdio.h>
	2431
	2432	@group
	2433	/* Called by yyparse on error. */
	2434	void
	2435	yyerror (char const *s)
	2436	@{
	2437	printf ("%s\n", s);
	2438	@}
	2439	@end group
	2440
	2441	@group
	2442	struct init
	2443	@{
	2444	char const *fname;
	2445	double (*fnct) (double);
	2446	@};
	2447	@end group
	2448
	2449	@group
	2450	struct init const arith_fncts[] =
	2451	@{
	2452	"sin", sin,
	2453	"cos", cos,
	2454	"atan", atan,
	2455	"ln", log,
	2456	"exp", exp,
	2457	"sqrt", sqrt,
	2458	0, 0
	2459	@};
	2460	@end group
	2461
	2462	@group
	2463	/* The symbol table: a chain of `struct symrec'. */
	2464	symrec *sym_table;
	2465	@end group
	2466
	2467	@group
	2468	/* Put arithmetic functions in table. */
	2469	void
	2470	init_table (void)
	2471	@{
	2472	int i;
	2473	for (i = 0; arith_fncts[i].fname != 0; i++)
	2474	@{
	2475	symrec *ptr = putsym (arith_fncts[i].fname, FNCT);
	2476	ptr->value.fnctptr = arith_fncts[i].fnct;
	2477	@}
	2478	@}
	2479	@end group
	2480
	2481	@group
	2482	int
	2483	main (void)
	2484	@{
	2485	init_table ();
	2486	return yyparse ();
	2487	@}
	2488	@end group
	2489	@end example
	2490
	2491	By simply editing the initialization list and adding the necessary include
	2492	files, you can add more functions to the calculator.
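
The list is not limited to library functions: any function that takes a
@code{double} and returns a @code{double} fits. The following is a minimal
sketch, not part of @code{mfcalc}. The functions @code{sqr} and @code{half},
the list name @code{extra_fncts}, and the helper @code{find_fnct} are all
made up for illustration. In @code{mfcalc} itself the two new entries would
simply be added to @code{arith_fncts} before the terminating @code{0, 0} pair.

```c
#include <string.h>

/* Function type, as in calc.h. */
typedef double (*func_t) (double);

/* Same shape as the `struct init' used by mfcalc. */
struct init
{
  char const *fname;
  func_t fnct;
};

/* Two user-written functions (hypothetical examples). */
static double sqr (double x) { return x * x; }
static double half (double x) { return x / 2; }

/* Entries in the same name/function form as arith_fncts,
   ending with the same 0, 0 sentinel.  */
static struct init const extra_fncts[] =
{
  { "sqr", sqr },
  { "half", half },
  { 0, 0 }
};

/* Hypothetical helper: walk LIST the way init_table does and
   return the function registered under NAME, or a null pointer.  */
func_t
find_fnct (struct init const *list, char const *name)
{
  int i;
  for (i = 0; list[i].fname != 0; i++)
    if (strcmp (list[i].fname, name) == 0)
      return list[i].fnct;
  return 0;
}
```

Because @code{init_table} stops at the first entry whose name is zero, new
entries must go before the sentinel, never after it.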
	2493
	2494	Two important functions allow look-up and installation of symbols in the
	2495	symbol table. The function @code{putsym} is passed a name and the type
	2496	(@code{VAR} or @code{FNCT}) of the object to be installed. The object is
	2497	linked to the front of the list, and a pointer to the object is returned.
	2498	The function @code{getsym} is passed the name of the symbol to look up. If
	2499	found, a pointer to that symbol is returned; otherwise zero is returned.
	2500
	2501	@comment file: mfcalc.y
	2502	@example
	2503	#include <stdlib.h> /* malloc. */
	2504	#include <string.h> /* strlen. */
	2505
	2506	@group
	2507	symrec *
	2508	putsym (char const *sym_name, int sym_type)
	2509	@{
	2510	symrec *ptr = (symrec *) malloc (sizeof (symrec));
	2511	ptr->name = (char *) malloc (strlen (sym_name) + 1);
	2512	strcpy (ptr->name,sym_name);
	2513	ptr->type = sym_type;
	2514	ptr->value.var = 0; /* Set value to 0 even if fctn. */
	2515	ptr->next = (struct symrec *)sym_table;
	2516	sym_table = ptr;
	2517	return ptr;
	2518	@}
	2519	@end group
	2520
	2521	@group
	2522	symrec *
	2523	getsym (char const *sym_name)
	2524	@{
	2525	symrec *ptr;
	2526	for (ptr = sym_table; ptr != (symrec *) 0;
	2527	ptr = (symrec *)ptr->next)
	2528	if (strcmp (ptr->name,sym_name) == 0)
	2529	return ptr;
	2530	return 0;
	2531	@}
	2532	@end group
	2533	@end example
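
The example above assumes that @code{malloc} always succeeds. The following
is a sketch of a more defensive variant, under stated assumptions: the name
@code{putsym_checked} is hypothetical, and the values given to @code{VAR} and
@code{FNCT} are placeholders for the codes that Bison would normally generate.
It also shows the behavior described above: new records go to the front of
the list, and a failed look-up returns zero.

```c
#include <stdlib.h>
#include <string.h>

/* Placeholder token codes; in mfcalc these come from Bison. */
enum { VAR = 259, FNCT = 260 };

typedef double (*func_t) (double);

typedef struct symrec symrec;
struct symrec
{
  char *name;                 /* name of symbol */
  int type;                   /* either VAR or FNCT */
  union
  {
    double var;               /* value of a VAR */
    func_t fnctptr;           /* value of a FNCT */
  } value;
  symrec *next;               /* link field */
};

/* The symbol table: a chain of `struct symrec'. */
static symrec *sym_table;

/* Like putsym, but return a null pointer and leave the table
   unchanged if memory runs out.  */
symrec *
putsym_checked (char const *sym_name, int sym_type)
{
  symrec *ptr = malloc (sizeof *ptr);
  if (!ptr)
    return 0;
  ptr->name = malloc (strlen (sym_name) + 1);
  if (!ptr->name)
    {
      free (ptr);
      return 0;
    }
  strcpy (ptr->name, sym_name);
  ptr->type = sym_type;
  ptr->value.var = 0;         /* Set value to 0 even if FNCT. */
  ptr->next = sym_table;      /* Link at the front of the list. */
  sym_table = ptr;
  return ptr;
}

/* Return the record named SYM_NAME, or a null pointer. */
symrec *
getsym (char const *sym_name)
{
  symrec *ptr;
  for (ptr = sym_table; ptr != 0; ptr = ptr->next)
    if (strcmp (ptr->name, sym_name) == 0)
      return ptr;
  return 0;
}
```

A caller such as @code{yylex} would then have to decide what to do when
@code{putsym_checked} returns a null pointer, for instance by reporting the
failure through @code{yyerror}.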
	2534
	2535	The function @code{yylex} must now recognize variables, numeric values, and
	2536	the single-character arithmetic operators. Strings of alphanumeric
	2537	characters with a leading letter are recognized as either variables or
	2538	functions depending on what the symbol table says about them.
	2539
	2540	The string is passed to @code{getsym} for look-up in the symbol table. If
	2541	the name appears in the table, a pointer to its record and its type
	2542	(@code{VAR} or @code{FNCT}) are returned to @code{yyparse}. If it is not
	2543	already in the table, it is installed as a @code{VAR} using
	2544	@code{putsym}. Again, a pointer and its type (which must be @code{VAR}) are
	2545	returned to @code{yyparse}.
	2546
	2547	No change is needed in the handling of numeric values and arithmetic
	2548	operators in @code{yylex}.
	2549
	2550	@comment file: mfcalc.y
	2551	@example
	2552	@group
	2553	#include <ctype.h>
	2554	@end group
	2555
	2556	@group
	2557	int
	2558	yylex (void)
	2559	@{
	2560	int c;
	2561
	2562	/* Ignore white space, get first nonwhite character. */
	2563	while ((c = getchar ()) == ' ' || c == '\t')
	2564	continue;
	2565
	2566	if (c == EOF)
	2567	return 0;
	2568	@end group
	2569
	2570	@group
	2571	/* Char starts a number => parse the number. */
	2572	if (c == '.' || isdigit (c))
	2573	@{
	2574	ungetc (c, stdin);
	2575	scanf ("%lf", &yylval.val);
	2576	return NUM;
	2577	@}
	2578	@end group
	2579
	2580	@group
	2581	/* Char starts an identifier => read the name. */
	2582	if (isalpha (c))
	2583	@{
	2584	/* Initially make the buffer long enough
	2585	for a 40-character symbol name. */
	2586	static size_t length = 40;
	2587	static char *symbuf = 0;
	2588	symrec *s;
	2589	int i;
	2590	@end group
	2591
	2592	if (!symbuf)
	2593	symbuf = (char *) malloc (length + 1);
	2594
	2595	i = 0;
	2596	do
	2597	@group
	2598	@{
	2599	/* If buffer is full, make it bigger. */
	2600	if (i == length)
	2601	@{
	2602	length *= 2;
	2603	symbuf = (char *) realloc (symbuf, length + 1);
	2604	@}
	2605	/* Add this character to the buffer. */
	2606	symbuf[i++] = c;
	2607	/* Get another character. */
	2608	c = getchar ();
	2609	@}
	2610	@end group
	2611	@group
	2612	while (isalnum (c));
	2613
	2614	ungetc (c, stdin);
	2615	symbuf[i] = '\0';
	2616	@end group
	2617
	2618	@group
	2619	s = getsym (symbuf);
	2620	if (s == 0)
	2621	s = putsym (symbuf, VAR);
	2622	yylval.tptr = s;
	2623	return s->type;
	2624	@}
	2625
	2626	/* Any other character is a token by itself. */
	2627	return c;
	2628	@}
	2629	@end group
	2630	@end example
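
The buffer-doubling loop in @code{yylex} can be studied in isolation. The
following sketch is an adaptation, not the code used by @code{mfcalc}: it
reads from a string instead of standard input so that it is easy to exercise,
it checks the result of @code{realloc}, and it starts with a deliberately
small buffer. The name @code{scan_identifier} is hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Read an identifier (a letter followed by letters or digits)
   starting at *CURSOR.  On success, return a freshly allocated
   null-terminated copy and advance *CURSOR past the identifier.
   Return a null pointer if *CURSOR does not start with a letter
   or if memory runs out; *CURSOR is then left unchanged.  */
char *
scan_identifier (char const **cursor)
{
  char const *p = *cursor;
  size_t length = 8;          /* Small on purpose, to exercise growth. */
  size_t i = 0;
  char *buf;

  if (!isalpha ((unsigned char) *p))
    return 0;

  buf = malloc (length + 1);
  if (!buf)
    return 0;

  do
    {
      /* If buffer is full, make it bigger. */
      if (i == length)
        {
          char *bigger;
          length *= 2;
          bigger = realloc (buf, length + 1);
          if (!bigger)
            {
              free (buf);
              return 0;
            }
          buf = bigger;
        }
      /* Add this character to the buffer and look at the next. */
      buf[i++] = *p++;
    }
  while (isalnum ((unsigned char) *p));

  buf[i] = '\0';
  *cursor = p;
  return buf;
}
```

The casts to @code{unsigned char} matter here because the characters come
from a @code{char} array. In @code{yylex} they are unnecessary, since
@code{getchar} already returns either @code{EOF} or a value in the range of
@code{unsigned char}.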
	2631
	2632	This program is both powerful and flexible. You may easily add new
	2633	functions, and it is a simple job to modify this code to install
	2634	predefined variables such as @code{pi} or @code{e} as well.
	2635
	2636	@node Exercises
	2637	@section Exercises
	2638	@cindex exercises
	2639
	2640	@enumerate
	2641	@item
	2642	Add some new functions from @file{math.h} to the initialization list.
	2643
	2644	@item
	2645	Add another array that contains constants and their values. Then
	2646	modify @code{init_table} to add these constants to the symbol table.
	2647	It will be easiest to give the constants type @code{VAR}.
	2648
	2649	@item
	2650	Make the program report an error if the user refers to an
	2651	uninitialized variable in any way except to store a value in it.
	2652	@end enumerate
	2653
	2654	@node Grammar File
	2655	@chapter Bison Grammar Files
	2656
	2657	Bison takes as input a context-free grammar specification and produces a
	2658	C-language function that recognizes correct instances of the grammar.
	2659
	2660	The Bison grammar file conventionally has a name ending in @samp{.y}.
	2661	@xref{Invocation, ,Invoking Bison}.
	2662
	2663	@menu
	2664	* Grammar Outline:: Overall layout of the grammar file.
	2665	* Symbols:: Terminal and nonterminal symbols.
	2666	* Rules:: How to write grammar rules.
	2667	* Recursion:: Writing recursive rules.
	2668	* Semantics:: Semantic values and actions.
	2669	* Tracking Locations:: Locations and actions.
	2670	* Named References:: Using named references in actions.
	2671	* Declarations:: All kinds of Bison declarations are described here.
	2672	* Multiple Parsers:: Putting more than one Bison parser in one program.
	2673	@end menu
	2674
	2675	@node Grammar Outline
	2676	@section Outline of a Bison Grammar
	2677
	2678	A Bison grammar file has four main sections, shown here with the
	2679	appropriate delimiters:
	2680
	2681	@example
	2682	%@{
	2683	@var{Prologue}
	2684	%@}
	2685
	2686	@var{Bison declarations}
	2687
	2688	%%
	2689	@var{Grammar rules}
	2690	%%
	2691
	2692	@var{Epilogue}
	2693	@end example
	2694
	2695	Comments enclosed in @samp{/* @dots{} */} may appear in any of the sections.
	2696	As a GNU extension, @samp{//} introduces a comment that
	2697	continues until end of line.
	2698
	2699	@menu
	2700	* Prologue:: Syntax and usage of the prologue.
	2701	* Prologue Alternatives:: Syntax and usage of alternatives to the prologue.
	2702	* Bison Declarations:: Syntax and usage of the Bison declarations section.
	2703	* Grammar Rules:: Syntax and usage of the grammar rules section.
	2704	* Epilogue:: Syntax and usage of the epilogue.
	2705	@end menu
	2706
	2707	@node Prologue
	2708	@subsection The prologue
	2709	@cindex declarations section
	2710	@cindex Prologue
	2711	@cindex declarations
	2712
	2713	The @var{Prologue} section contains macro definitions and declarations
	2714	of functions and variables that are used in the actions in the grammar
	2715	rules. These are copied to the beginning of the parser implementation
	2716	file so that they precede the definition of @code{yyparse}. You can
	2717	use @samp{#include} to get the declarations from a header file. If
	2718	you don't need any C declarations, you may omit the @samp{%@{} and
	2719	@samp{%@}} delimiters that bracket this section.
	2720
	2721	The @var{Prologue} section is terminated by the first occurrence
	2722	of @samp{%@}} that is outside a comment, a string literal, or a
	2723	character constant.
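
To make the termination rule concrete, here is a simplified sketch of a
scanner that locates the closing delimiter. It is an illustration only and is
not how Bison itself is implemented. The name @code{find_prologue_end} is
hypothetical, and the sketch ignores trigraphs and backslash-newline.

```c
#include <stddef.h>

/* Return the offset in S of the first "%}" that lies outside a
   comment, a string literal, and a character constant, or -1 if
   there is none.  S holds the text that follows the opening "%{".  */
long
find_prologue_end (char const *s)
{
  enum { CODE, BLOCK_COMMENT, LINE_COMMENT, STRING, CHARACTER } state = CODE;
  size_t i;

  for (i = 0; s[i] != '\0'; i++)
    {
      char c = s[i];
      char next = s[i + 1];

      switch (state)
        {
        case CODE:
          if (c == '%' && next == '}')
            return (long) i;
          if (c == '/' && next == '*')
            {
              state = BLOCK_COMMENT;
              i++;                      /* Skip the `*'. */
            }
          else if (c == '/' && next == '/')
            {
              state = LINE_COMMENT;
              i++;                      /* Skip the second `/'. */
            }
          else if (c == '"')
            state = STRING;
          else if (c == '\'')
            state = CHARACTER;
          break;

        case BLOCK_COMMENT:
          if (c == '*' && next == '/')
            {
              state = CODE;
              i++;                      /* Skip the `/'. */
            }
          break;

        case LINE_COMMENT:
          if (c == '\n')
            state = CODE;
          break;

        case STRING:
        case CHARACTER:
          if (c == '\\' && next != '\0')
            i++;                        /* Skip the escaped character. */
          else if (c == (state == STRING ? '"' : '\''))
            state = CODE;
          break;
        }
    }
  return -1;
}
```

For instance, in the text @samp{/* %@} */ a %@}} the first @samp{%@}} is
inside a comment, so only the second one would end the section.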
	2724
	2725	You may have more than one @var{Prologue} section, intermixed with the
	2726	@var{Bison declarations}. This allows you to have C and Bison
	2727	declarations that refer to each other. For example, the @code{%union}
	2728	declaration may use types defined in a header file, and you may wish to
	2729	prototype functions that take arguments of type @code{YYSTYPE}. This
	2730	can be done with two @var{Prologue} blocks, one before and one after the
	2731	@code{%union} declaration.
	2732
	2733	@example
	2734	%@{
	2735	#define _GNU_SOURCE
	2736	#include <stdio.h>
	2737	#include "ptypes.h"
	2738	%@}
	2739
	2740	%union @{
	2741	long int n;
	2742	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	2743	@}
	2744
	2745	%@{
	2746	static void print_token_value (FILE *, int, YYSTYPE);
	2747	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	2748	%@}
	2749
	2750	@dots{}
	2751	@end example
	2752
	2753	When in doubt, it is usually safer to put prologue code before all
	2754	Bison declarations, rather than after. For example, any definitions
	2755	of feature test macros like @code{_GNU_SOURCE} or
	2756	@code{_POSIX_C_SOURCE} should appear before all Bison declarations, as
	2757	feature test macros can affect the behavior of Bison-generated
	2758	@code{#include} directives.
	2759
	2760	@node Prologue Alternatives
	2761	@subsection Prologue Alternatives
	2762	@cindex Prologue Alternatives
	2763
	2764	@findex %code
	2765	@findex %code requires
	2766	@findex %code provides
	2767	@findex %code top
	2768
	2769	The functionality of @var{Prologue} sections can often be subtle and
	2770	inflexible. As an alternative, Bison provides a @code{%code}
	2771	directive with an explicit qualifier field, which identifies the
	2772	purpose of the code and thus the location(s) where Bison should
	2773	generate it. For C/C++, the qualifier can be omitted for the default
	2774	location, or it can be one of @code{requires}, @code{provides},
	2775	@code{top}. @xref{%code Summary}.
	2776
	2777	Look again at the example of the previous section:
	2778
	2779	@example
	2780	%@{
	2781	#define _GNU_SOURCE
	2782	#include <stdio.h>
	2783	#include "ptypes.h"
	2784	%@}
	2785
	2786	%union @{
	2787	long int n;
	2788	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	2789	@}
	2790
	2791	%@{
	2792	static void print_token_value (FILE *, int, YYSTYPE);
	2793	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	2794	%@}
	2795
	2796	@dots{}
	2797	@end example
	2798
	2799	@noindent
	2800	Notice that there are two @var{Prologue} sections here, but there's a
	2801	subtle distinction between their functionality. For example, if you
	2802	decide to override Bison's default definition for @code{YYLTYPE}, in
	2803	which @var{Prologue} section should you write your new definition?
	2804	You should write it in the first since Bison will insert that code
	2805	into the parser implementation file @emph{before} the default
	2806	@code{YYLTYPE} definition. In which @var{Prologue} section should you
	2807	prototype an internal function, @code{trace_token}, that accepts
	2808	@code{YYLTYPE} and @code{yytokentype} as arguments? You should
	2809	prototype it in the second since Bison will insert that code
	2810	@emph{after} the @code{YYLTYPE} and @code{yytokentype} definitions.
	2811
	2812	This distinction in functionality between the two @var{Prologue} sections is
	2813	established by the appearance of the @code{%union} between them.
	2814	This behavior raises a few questions.
	2815	First, why should the position of a @code{%union} affect definitions related to
	2816	@code{YYLTYPE} and @code{yytokentype}?
	2817	Second, what if there is no @code{%union}?
	2818	In that case, the second kind of @var{Prologue} section is not available.
	2819	This behavior is not intuitive.
	2820
	2821	To avoid this subtle @code{%union} dependency, rewrite the example using a
	2822	@code{%code top} and an unqualified @code{%code}.
	2823	Let's go ahead and add the new @code{YYLTYPE} definition and the
	2824	@code{trace_token} prototype at the same time:
	2825
	2826	@example
	2827	%code top @{
	2828	#define _GNU_SOURCE
	2829	#include <stdio.h>
	2830
	2831	/* WARNING: The following code really belongs
	2832	* in a `%code requires'; see below. */
	2833
	2834	#include "ptypes.h"
	2835	#define YYLTYPE YYLTYPE
	2836	typedef struct YYLTYPE
	2837	@{
	2838	int first_line;
	2839	int first_column;
	2840	int last_line;
	2841	int last_column;
	2842	char *filename;
	2843	@} YYLTYPE;
	2844	@}
	2845
	2846	%union @{
	2847	long int n;
	2848	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	2849	@}
	2850
	2851	%code @{
	2852	static void print_token_value (FILE *, int, YYSTYPE);
	2853	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	2854	static void trace_token (enum yytokentype token, YYLTYPE loc);
	2855	@}
	2856
	2857	@dots{}
	2858	@end example
	2859
	2860	@noindent
	2861	In this way, @code{%code top} and the unqualified @code{%code} achieve the same
	2862	functionality as the two kinds of @var{Prologue} sections, but it's always
	2863	explicit which kind you intend.
	2864	Moreover, both kinds are always available even in the absence of @code{%union}.
	2865
	2866	The @code{%code top} block above logically contains two parts. The
	2867	first two lines before the warning need to appear near the top of the
	2868	parser implementation file. The first line after the warning is
	2869	required by @code{YYSTYPE} and thus also needs to appear in the parser
	2870	implementation file. However, if you've instructed Bison to generate
	2871	a parser header file (@pxref{Decl Summary, ,%defines}), you probably
	2872	want that line to appear before the @code{YYSTYPE} definition in that
	2873	header file as well. The @code{YYLTYPE} definition should also appear
	2874	in the parser header file to override the default @code{YYLTYPE}
	2875	definition there.
	2876
	2877	In other words, in the @code{%code top} block above, all but the first two
	2878	lines are dependency code required by the @code{YYSTYPE} and @code{YYLTYPE}
	2879	definitions.
	2880	Thus, they belong in one or more @code{%code requires}:
	2881
	2882	@example
	2883	@group
	2884	%code top @{
	2885	#define _GNU_SOURCE
	2886	#include <stdio.h>
	2887	@}
	2888	@end group
	2889
	2890	@group
	2891	%code requires @{
	2892	#include "ptypes.h"
	2893	@}
	2894	@end group
	2895	@group
	2896	%union @{
	2897	long int n;
	2898	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	2899	@}
	2900	@end group
	2901
	2902	@group
	2903	%code requires @{
	2904	#define YYLTYPE YYLTYPE
	2905	typedef struct YYLTYPE
	2906	@{
	2907	int first_line;
	2908	int first_column;
	2909	int last_line;
	2910	int last_column;
	2911	char *filename;
	2912	@} YYLTYPE;
	2913	@}
	2914	@end group
	2915
	2916	@group
	2917	%code @{
	2918	static void print_token_value (FILE *, int, YYSTYPE);
	2919	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	2920	static void trace_token (enum yytokentype token, YYLTYPE loc);
	2921	@}
	2922	@end group
	2923
	2924	@dots{}
	2925	@end example
	2926
	2927	@noindent
	2928	Now Bison will insert @code{#include "ptypes.h"} and the new
	2929	@code{YYLTYPE} definition before the Bison-generated @code{YYSTYPE}
	2930	and @code{YYLTYPE} definitions in both the parser implementation file
	2931	and the parser header file. (By the same reasoning, @code{%code
	2932	requires} would also be the appropriate place to write your own
	2933	definition for @code{YYSTYPE}.)
	2934
	2935	When you are writing dependency code for @code{YYSTYPE} and
	2936	@code{YYLTYPE}, you should prefer @code{%code requires} over
	2937	@code{%code top} regardless of whether you instruct Bison to generate
	2938	a parser header file. When you are writing code that you need Bison
	2939	to insert only into the parser implementation file and that has no
	2940	special need to appear at the top of that file, you should prefer the
	2941	unqualified @code{%code} over @code{%code top}. These practices will
	2942	make the purpose of each block of your code explicit to Bison and to
	2943	other developers reading your grammar file. Following these
	2944	practices, we expect the unqualified @code{%code} and @code{%code
	2945	requires} to be the most important of the four @var{Prologue}
	2946	alternatives.
	2947
	2948	At some point while developing your parser, you might decide to
	2949	provide @code{trace_token} to modules that are external to your
	2950	parser. Thus, you might wish for Bison to insert the prototype into
	2951	both the parser header file and the parser implementation file. Since
	2952	this function is not a dependency required by @code{YYSTYPE} or
	2953	@code{YYLTYPE}, it doesn't make sense to move its prototype to a
	2954	@code{%code requires}. More importantly, since it depends upon
	2955	@code{YYLTYPE} and @code{yytokentype}, @code{%code requires} is not
	2956	sufficient. Instead, move its prototype from the unqualified
	2957	@code{%code} to a @code{%code provides}:
	2958
	2959	@example
	2960	@group
	2961	%code top @{
	2962	#define _GNU_SOURCE
	2963	#include <stdio.h>
	2964	@}
	2965	@end group
	2966
	2967	@group
	2968	%code requires @{
	2969	#include "ptypes.h"
	2970	@}
	2971	@end group
	2972	@group
	2973	%union @{
	2974	long int n;
	2975	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	2976	@}
	2977	@end group
	2978
	2979	@group
	2980	%code requires @{
	2981	#define YYLTYPE YYLTYPE
	2982	typedef struct YYLTYPE
	2983	@{
	2984	int first_line;
	2985	int first_column;
	2986	int last_line;
	2987	int last_column;
	2988	char *filename;
	2989	@} YYLTYPE;
	2990	@}
	2991	@end group
	2992
	2993	@group
	2994	%code provides @{
	2995	void trace_token (enum yytokentype token, YYLTYPE loc);
	2996	@}
	2997	@end group
	2998
	2999	@group
	3000	%code @{
	3001	static void print_token_value (FILE *, int, YYSTYPE);
	3002	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	3003	@}
	3004	@end group
	3005
	3006	@dots{}
	3007	@end example
	3008
	3009	@noindent
	3010	Bison will insert the @code{trace_token} prototype into both the
	3011	parser header file and the parser implementation file after the
	3012	definitions for @code{yytokentype}, @code{YYLTYPE}, and
	3013	@code{YYSTYPE}.
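
The definition of @code{trace_token} itself would go elsewhere, for example
in the epilogue. The following is one possible sketch, under stated
assumptions: the @code{enum yytokentype} shown is a stand-in for the one
Bison generates (its members and values are placeholders), and the helper
@code{format_token_trace} and the output format are made up for illustration.

```c
#include <stdio.h>

/* Stand-in for the Bison-generated enum; values are placeholders. */
enum yytokentype { NUM = 258, VAR = 259 };

/* The location type from the example above. */
typedef struct YYLTYPE
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
  char *filename;
} YYLTYPE;

/* Hypothetical helper: format a one-line description of TOKEN at
   LOC into BUF, which holds SIZE bytes.  Return what snprintf
   returns.  */
int
format_token_trace (char *buf, size_t size,
                    enum yytokentype token, YYLTYPE loc)
{
  return snprintf (buf, size, "%s:%d.%d-%d.%d: token %d",
                   loc.filename ? loc.filename : "<unknown>",
                   loc.first_line, loc.first_column,
                   loc.last_line, loc.last_column,
                   (int) token);
}

/* Matches the prototype declared in `%code provides'. */
void
trace_token (enum yytokentype token, YYLTYPE loc)
{
  char buf[256];
  format_token_trace (buf, sizeof buf, token, loc);
  fprintf (stderr, "%s\n", buf);
}
```

Separating the formatting from the printing is a design choice of this
sketch; it makes the output easy to check without capturing @code{stderr}.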
	3014
	3015	The above examples are careful to write directives in an order that
	3016	reflects the layout of the generated parser implementation and header
	3017	files: @code{%code top}, @code{%code requires}, @code{%code provides},
	3018	and then @code{%code}. While your grammar files may generally be
	3019	easier to read if you also follow this order, Bison does not require
	3020	it. Instead, Bison lets you choose an organization that makes sense
	3021	to you.
	3022
	3023	You may declare any of these directives multiple times in the grammar file.
	3024	In that case, Bison concatenates the contained code in declaration order.
	3025	This is the only way in which the position of one of these directives within
	3026	the grammar file affects its functionality.
	3027
	3028	The result of the previous two properties is greater flexibility in how you may
	3029	organize your grammar file.
	3030	For example, you may organize semantic-type-related directives by semantic
	3031	type:
	3032
	3033	@example
	3034	@group
	3035	%code requires @{ #include "type1.h" @}
	3036	%union @{ type1 field1; @}
	3037	%destructor @{ type1_free ($$); @} <field1>
	3038	%printer @{ type1_print ($$); @} <field1>
	3039	@end group
	3040
	3041	@group
	3042	%code requires @{ #include "type2.h" @}
	3043	%union @{ type2 field2; @}
	3044	%destructor @{ type2_free ($$); @} <field2>
	3045	%printer @{ type2_print ($$); @} <field2>
	3046	@end group
	3047	@end example
	3048
	3049	@noindent
	3050	You could even place each of the above directive groups in the rules section of
	3051	the grammar file next to the set of rules that uses the associated semantic
	3052	type.
	3053	(In the rules section, you must terminate each of those directives with a
	3054	semicolon.)
	3055	And you don't have to worry that some directive (like a @code{%union}) in the
	3056	definitions section is going to adversely affect their functionality in some
	3057	counter-intuitive manner just because it comes first.
	3058	Such an organization is not possible using @var{Prologue} sections.
	3059
	3060	This section has been concerned with explaining the advantages of the four
	3061	@var{Prologue} alternatives over the original Yacc @var{Prologue}.
	3062	However, in most cases when using these directives, you shouldn't need to
	3063	think about all the low-level ordering issues discussed here.
	3064	Instead, you should simply use these directives to label each block of your
	3065	code according to its purpose and let Bison handle the ordering.
	3066	@code{%code} is the most generic label.
	3067	Move code to @code{%code requires}, @code{%code provides}, or @code{%code top}
	3068	as needed.
	3069
	3070	@node Bison Declarations
	3071	@subsection The Bison Declarations Section
	3072	@cindex Bison declarations (introduction)
	3073	@cindex declarations, Bison (introduction)
	3074
	3075	The @var{Bison declarations} section contains declarations that define
	3076	terminal and nonterminal symbols, specify precedence, and so on.
	3077	In some simple grammars you may not need any declarations.
	3078	@xref{Declarations, ,Bison Declarations}.
	3079
	3080	@node Grammar Rules
	3081	@subsection The Grammar Rules Section
	3082	@cindex grammar rules section
	3083	@cindex rules section for grammar
	3084
	3085	The @dfn{grammar rules} section contains one or more Bison grammar
	3086	rules, and nothing else. @xref{Rules, ,Syntax of Grammar Rules}.
	3087
	3088	There must always be at least one grammar rule, and the first
	3089	@samp{%%} (which precedes the grammar rules) may never be omitted even
	3090	if it is the first thing in the file.
	3091
	3092	@node Epilogue
	3093	@subsection The epilogue
	3094	@cindex additional C code section
	3095	@cindex epilogue
	3096	@cindex C code, section for additional
	3097
	3098	The @var{Epilogue} is copied verbatim to the end of the parser
	3099	implementation file, just as the @var{Prologue} is copied to the
	3100	beginning. This is the most convenient place to put anything that you
	3101	want to have in the parser implementation file but which need not come
	3102	before the definition of @code{yyparse}. For example, the definitions
	3103	of @code{yylex} and @code{yyerror} often go here. Because C requires
	3104	functions to be declared before being used, you often need to declare
	3105	functions like @code{yylex} and @code{yyerror} in the Prologue, even
	3106	if you define them in the Epilogue. @xref{Interface, ,Parser
	3107	C-Language Interface}.
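
The declare-first, define-later arrangement can be sketched in plain C as
follows. This is an illustration only: @code{parse_sketch} is a hypothetical
stand-in for the generated @code{yyparse}, and @code{next_token} and
@code{error_count} exist only so that the sketch can be exercised.

```c
#include <stdio.h>

/* What the Prologue would hold: prototypes only. */
int yylex (void);
void yyerror (char const *);

/* State for this sketch only (hypothetical names). */
static int next_token;    /* Token the stub yylex hands out. */
static int error_count;   /* Number of yyerror calls so far. */

/* Stand-in for the generated parser.  It calls yylex and yyerror
   before their definitions appear, which is why the prototypes
   above are needed.  It accepts only immediate end-of-input.  */
int
parse_sketch (void)
{
  if (yylex () != 0)
    {
      yyerror ("syntax error");
      return 1;
    }
  return 0;
}

/* What the Epilogue would hold: the definitions. */
int
yylex (void)
{
  return next_token;
}

void
yyerror (char const *s)
{
  error_count++;
  fprintf (stderr, "%s\n", s);
}
```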
	3108
	3109	If the last section is empty, you may omit the @samp{%%} that separates it
	3110	from the grammar rules.
	3111
	3112	The Bison parser itself contains many macros and identifiers whose names
	3113	start with @samp{yy} or @samp{YY}, so it is a good idea to avoid using
	3114	any such names (except those documented in this manual) in the epilogue
	3115	of the grammar file.
	3116
	3117	@node Symbols
	3118	@section Symbols, Terminal and Nonterminal
	3119	@cindex nonterminal symbol
	3120	@cindex terminal symbol
	3121	@cindex token type
	3122	@cindex symbol
	3123
	3124	@dfn{Symbols} in Bison grammars represent the grammatical classifications
	3125	of the language.
	3126
	3127	A @dfn{terminal symbol} (also known as a @dfn{token type}) represents a
	3128	class of syntactically equivalent tokens. You use the symbol in grammar
	3129	rules to mean that a token in that class is allowed. The symbol is
	3130	represented in the Bison parser by a numeric code, and the @code{yylex}
	3131	function returns a token type code to indicate what kind of token has
	3132	been read. You don't need to know what the code value is; you can use
	3133	the symbol to stand for it.
	3134
	3135	A @dfn{nonterminal symbol} stands for a class of syntactically
	3136	equivalent groupings. The symbol name is used in writing grammar rules.
	3137	By convention, it should be all lower case.
	3138
	3139	Symbol names can contain letters, underscores, periods, and non-initial
	3140	digits and dashes. Dashes in symbol names are a GNU extension, incompatible
	3141	with POSIX Yacc. Periods and dashes make symbol names less convenient to
	3142	use with named references, which require brackets around such names
	3143	(@pxref{Named References}). Terminal symbols that contain periods or dashes
	3144	make little sense: since they are not valid symbols (in most programming
	3145	languages) they are not exported as token names.
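
The naming rule can be restated as a small checker. The following sketch is
an illustration of the rule as worded above and is not taken from Bison; the
function names are hypothetical, and letters are recognized with
@code{isalpha}, which in the default C locale accepts only the 52 English
letters.

```c
#include <ctype.h>
#include <string.h>

/* Return nonzero if S is a valid symbol name: letters, underscores,
   and periods anywhere; digits and dashes anywhere but first.  */
int
is_symbol_name (char const *s)
{
  unsigned char c = (unsigned char) *s;

  if (!(isalpha (c) || c == '_' || c == '.'))
    return 0;                 /* Empty, or bad first character. */

  for (s++; *s != '\0'; s++)
    {
      c = (unsigned char) *s;
      if (!(isalnum (c) || c == '_' || c == '.' || c == '-'))
        return 0;
    }
  return 1;
}

/* Return nonzero if S contains a period or a dash, in which case
   a named reference to it must be written in brackets.  */
int
needs_brackets (char const *s)
{
  return strpbrk (s, ".-") != 0;
}
```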
	3146
	3147	There are three ways of writing terminal symbols in the grammar:
	3148
	3149	@itemize @bullet
	3150	@item
	3151	A @dfn{named token type} is written with an identifier, like an
	3152	identifier in C@. By convention, it should be all upper case. Each
	3153	such name must be defined with a Bison declaration such as
	3154	@code{%token}. @xref{Token Decl, ,Token Type Names}.
	3155
	3156	@item
	3157	@cindex character token
	3158	@cindex literal token
	3159	@cindex single-character literal
	3160	A @dfn{character token type} (or @dfn{literal character token}) is
	3161	written in the grammar using the same syntax used in C for character
	3162	constants; for example, @code{'+'} is a character token type. A
	3163	character token type doesn't need to be declared unless you need to
	3164	specify its semantic value data type (@pxref{Value Type, ,Data Types of
	3165	Semantic Values}), associativity, or precedence (@pxref{Precedence,
	3166	,Operator Precedence}).
	3167
	3168	By convention, a character token type is used only to represent a
	3169	token that consists of that particular character. Thus, the token
	3170	type @code{'+'} is used to represent the character @samp{+} as a
	3171	token. Nothing enforces this convention, but if you depart from it,
	3172	your program will confuse other readers.
	3173
	3174	All the usual escape sequences used in character literals in C can be
	3175	used in Bison as well, but you must not use the null character as a
	3176	character literal because its numeric code, zero, signifies
	3177	end-of-input (@pxref{Calling Convention, ,Calling Convention
	3178	for @code{yylex}}). Also, unlike standard C, trigraphs have no
	3179	special meaning in Bison character literals, nor is backslash-newline
	3180	allowed.
	3181
	3182	@item
	3183	@cindex string token
	3184	@cindex literal string token
	3185	@cindex multicharacter literal
	3186	A @dfn{literal string token} is written like a C string constant; for
	3187	example, @code{"<="} is a literal string token. A literal string token
	3188	doesn't need to be declared unless you need to specify its semantic
	3189	value data type (@pxref{Value Type}), associativity, or precedence
	3190	(@pxref{Precedence}).
	3191
	3192	You can associate the literal string token with a symbolic name as an
	3193	alias, using the @code{%token} declaration (@pxref{Token Decl, ,Token
	3194	Declarations}). If you don't do that, the lexical analyzer has to
	3195	retrieve the token number for the literal string token from the
	3196	@code{yytname} table (@pxref{Calling Convention}).
	3197
	3198	@strong{Warning}: literal string tokens do not work in Yacc.
	3199
	3200	By convention, a literal string token is used only to represent a token
	3201	that consists of that particular string. Thus, you should use the token
	3202	type @code{"<="} to represent the string @samp{<=} as a token. Bison
	3203	does not enforce this convention, but if you depart from it, people who
	3204	read your program will be confused.
	3205
	3206	All the escape sequences used in string literals in C can be used in
	3207	Bison as well, except that you must not use a null character within a
	3208	string literal. Also, unlike Standard C, trigraphs have no special
	3209	meaning in Bison string literals, nor is backslash-newline allowed. A
	3210	literal string token must contain two or more characters; for a token
	3211	containing just one character, use a character token (see above).
	3212	@end itemize
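
When a literal string token has no symbolic alias, the lexical analyzer has
to search the table of names for it. The following sketch shows the idea
against a small mock table. The table @code{mock_yytname} and the function
@code{find_string_token} are hypothetical; the real @code{yytname} is
generated by Bison, and its contents depend on the grammar. The sketch
relies on one property of that table: the name of a literal string token is
stored with its surrounding double quotes.

```c
#include <string.h>

/* Mock of a names table, terminated by a null pointer.  Literal
   string tokens appear with their double quotes, character tokens
   with single quotes, named tokens bare.  */
static char const *const mock_yytname[] =
{
  "$end", "error", "$undefined", "NUM", "\"<=\"", "\">=\"", "'+'", 0
};

/* Return the index in mock_yytname of the literal string token
   whose text is TEXT, or -1 if there is none.  */
int
find_string_token (char const *text)
{
  size_t len = strlen (text);
  int i;

  for (i = 0; mock_yytname[i] != 0; i++)
    {
      char const *name = mock_yytname[i];
      if (name[0] == '"'
          && strncmp (name + 1, text, len) == 0
          && name[len + 1] == '"'
          && name[len + 2] == '\0')
        return i;
    }
  return -1;
}
```

The result is an index into the table, not necessarily the code that
@code{yylex} must return; converting one to the other is outside the scope
of this sketch. Declaring an alias with @code{%token} avoids the search
altogether.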
	3213
	3214	How you choose to write a terminal symbol has no effect on its
	3215	grammatical meaning. That depends only on where it appears in rules and
	3216	on when the parser function returns that symbol.
	3217
	3218	The value returned by @code{yylex} is always one of the terminal
	3219	symbols, except that a zero or negative value signifies end-of-input.
	3220	Whichever way you write the token type in the grammar rules, you write
	3221	it the same way in the definition of @code{yylex}. The numeric code
	3222	for a character token type is simply the positive numeric code of the
	3223	character, so @code{yylex} can use the identical value to generate the
	3224	requisite code, though you may need to convert it to @code{unsigned
	3225	char} to avoid sign-extension on hosts where @code{char} is signed.
	3226	Each named token type becomes a C macro in the parser implementation
	3227	file, so @code{yylex} can use the name to stand for the code. (This
	3228	is why periods don't make sense in terminal symbols.) @xref{Calling
	3229	Convention, ,Calling Convention for @code{yylex}}.
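
The conversion to @code{unsigned char} can be sketched as follows. The
function names are hypothetical, and the scanner reads from a string rather
than from standard input so that the effect is easy to observe.

```c
/* Return the token code for the character C.  Going through
   unsigned char keeps the code positive even where plain char is
   signed and C has its high bit set.  */
int
char_token_code (char c)
{
  return (unsigned char) c;
}

/* Minimal scanner over a string: return the code of the next
   character and advance *CURSOR, or return 0 (end-of-input) at the
   terminating null character.  */
int
next_char_token (char const **cursor)
{
  char c = **cursor;
  if (c == '\0')
    return 0;
  ++*cursor;
  return char_token_code (c);
}
```

Without the cast, a character such as @code{'\xE9'} could be returned as a
negative number on hosts where @code{char} is signed, and the parser would
take it for end-of-input.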
	3230
	3231	If @code{yylex} is defined in a separate file, you need to arrange for the
	3232	token-type macro definitions to be available there. Use the @samp{-d}
	3233	option when you run Bison, so that it will write these macro definitions
	3234	into a separate header file @file{@var{name}.tab.h} which you can include
	3235	in the other source files that need it. @xref{Invocation, ,Invoking Bison}.
	3236
	3237	If you want to write a grammar that is portable to any Standard C
	3238	host, you must use only nonnull character tokens taken from the basic
	3239	execution character set of Standard C@. This set consists of the ten
	3240	digits, the 52 lower- and upper-case English letters, and the
	3241	characters in the following C-language string:
	3242
	3243	@example
	3244	"\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_@{|@}~"
	3245	@end example
	3246
	3247	The @code{yylex} function and Bison must use a consistent character set
	3248	and encoding for character tokens. For example, if you run Bison in an
	3249	ASCII environment, but then compile and run the resulting
	3250	program in an environment that uses an incompatible character set like
	3251	EBCDIC, the resulting program may not work because the tables
	3252	generated by Bison will assume ASCII numeric values for
	3253	character tokens. It is standard practice for software distributions to
	3254	contain C source files that were generated by Bison in an
	3255	ASCII environment, so installers on platforms that are
	3256	incompatible with ASCII must rebuild those files before
	3257	compiling them.
	3258
	3259	The symbol @code{error} is a terminal symbol reserved for error recovery
	3260	(@pxref{Error Recovery}); you shouldn't use it for any other purpose.
	3261	In particular, @code{yylex} should never return this value. The default
	3262	value of the error token is 256, unless you explicitly assigned 256 to
	3263	one of your tokens with a @code{%token} declaration.
	3264
	3265	@node Rules
	3266	@section Syntax of Grammar Rules
	3267	@cindex rule syntax
	3268	@cindex grammar rule syntax
	3269	@cindex syntax of grammar rules
	3270
	3271	A Bison grammar rule has the following general form:
	3272
	3273	@example
	3274	@group
	3275	@var{result}: @var{components}@dots{};
	3276	@end group
	3277	@end example
	3278
	3279	@noindent
	3280	where @var{result} is the nonterminal symbol that this rule describes,
	3281	and @var{components} are various terminal and nonterminal symbols that
	3282	are put together by this rule (@pxref{Symbols}).
	3283
	3284	For example,
	3285
	3286	@example
	3287	@group
	3288	exp: exp '+' exp;
	3289	@end group
	3290	@end example
	3291
	3292	@noindent
	3293	says that two groupings of type @code{exp}, with a @samp{+} token in between,
	3294	can be combined into a larger grouping of type @code{exp}.
	3295
	3296	White space in rules is significant only to separate symbols. You can add
	3297	extra white space as you wish.
	3298
	3299	Scattered among the components can be @var{actions} that determine
	3300	the semantics of the rule. An action looks like this:
	3301
	3302	@example
	3303	@{@var{C statements}@}
	3304	@end example
	3305
	3306	@noindent
	3307	@cindex braced code
	3308	This is an example of @dfn{braced code}, that is, C code surrounded by
	3309	braces, much like a compound statement in C@. Braced code can contain
	3310	any sequence of C tokens, so long as its braces are balanced. Bison
	3311	does not check the braced code for correctness directly; it merely
	3312	copies the code to the parser implementation file, where the C
	3313	compiler can check it.
	3314
	3315	Within braced code, the balanced-brace count is not affected by braces
	3316	within comments, string literals, or character constants, but it is
	3317	affected by the C digraphs @samp{<%} and @samp{%>} that represent
	3318	braces. At the top level braced code must be terminated by @samp{@}}
	3319	and not by a digraph. Bison does not look for trigraphs, so if braced
	3320	code uses trigraphs you should ensure that they do not affect the
	3321	nesting of braces or the boundaries of comments, string literals, or
	3322	character constants.
	3323
	3324	Usually there is only one action and it follows the components.
	3325	@xref{Actions}.
	3326
	3327	@findex |
	3328	Multiple rules for the same @var{result} can be written separately or can
	3329	be joined with the vertical-bar character @samp{|} as follows:
	3330
	3331	@example
	3332	@group
	3333	@var{result}:
	3334	@var{rule1-components}@dots{}
	3335	| @var{rule2-components}@dots{}
	3336	@dots{}
	3337	;
	3338	@end group
	3339	@end example
	3340
	3341	@noindent
	3342	They are still considered distinct rules even when joined in this way.
	3343
	3344	If @var{components} in a rule is empty, it means that @var{result} can
	3345	match the empty string. For example, here is how to define a
	3346	comma-separated sequence of zero or more @code{exp} groupings:
	3347
	3348	@example
	3349	@group
	3350	expseq:
	3351	/* empty */
	3352	| expseq1
	3353	;
	3354	@end group
	3355
	3356	@group
	3357	expseq1:
	3358	exp
	3359	| expseq1 ',' exp
	3360	;
	3361	@end group
	3362	@end example
	3363
	3364	@noindent
	3365	It is customary to write a comment @samp{/* empty */} in each rule
	3366	with no components.
	3367
	3368	@node Recursion
	3369	@section Recursive Rules
	3370	@cindex recursive rule
	3371
	3372	A rule is called @dfn{recursive} when its @var{result} nonterminal
	3373	appears also on its right hand side. Nearly all Bison grammars need to
	3374	use recursion, because that is the only way to define a sequence of any
	3375	number of a particular thing. Consider this recursive definition of a
	3376	comma-separated sequence of one or more expressions:
	3377
	3378	@example
	3379	@group
	3380	expseq1:
	3381	exp
	3382	| expseq1 ',' exp
	3383	;
	3384	@end group
	3385	@end example
	3386
	3387	@cindex left recursion
	3388	@cindex right recursion
	3389	@noindent
	3390	Since the recursive use of @code{expseq1} is the leftmost symbol in the
	3391	right hand side, we call this @dfn{left recursion}. By contrast, here
	3392	the same construct is defined using @dfn{right recursion}:
	3393
	3394	@example
	3395	@group
	3396	expseq1:
	3397	exp
	3398	| exp ',' expseq1
	3399	;
	3400	@end group
	3401	@end example
	3402
	3403	@noindent
	3404	Any kind of sequence can be defined using either left recursion or right
	3405	recursion, but you should always use left recursion, because it can
	3406	parse a sequence of any number of elements with bounded stack space.
	3407	Right recursion uses up space on the Bison stack in proportion to the
	3408	number of elements in the sequence, because all the elements must be
	3409	shifted onto the stack before the rule can be applied even once.
	3410	@xref{Algorithm, ,The Bison Parser Algorithm}, for further explanation
	3411	of this.
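
The difference in stack usage can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: it counts grammar symbols on the stack, treats each @code{exp} as a single symbol that is shifted directly, and ignores parser states. The function names are hypothetical.

```c
/* Maximum stack depth while recognizing N elements (N >= 1) with the
   left-recursive rules  expseq1: exp | expseq1 ',' exp.  */
static int
max_depth_left (int n)
{
  int depth = 0, max = 0;

  for (int i = 0; i < n; i++)
    {
      if (i > 0)
        depth++;             /* shift ',' */
      depth++;               /* shift exp */
      if (depth > max)
        max = depth;
      depth = 1;             /* reduce at once to a single expseq1 */
    }
  return max;
}

/* Maximum stack depth while recognizing N elements (N >= 1) with the
   right-recursive rules  expseq1: exp | exp ',' expseq1.  */
static int
max_depth_right (int n)
{
  int depth = 0, max;

  for (int i = 0; i < n; i++)
    {
      if (i > 0)
        depth++;             /* shift ',' */
      depth++;               /* shift exp; no reduction is possible yet */
    }
  max = depth;               /* every element is on the stack here */

  /* Reductions start only now.  The last exp becomes expseq1 (depth
     unchanged); each  exp ',' expseq1  then pops three symbols and
     pushes one.  */
  while (depth > 1)
    depth -= 2;

  return max;
}
```

Under these assumptions, the left-recursive form never holds more than three symbols, whereas the right-recursive form holds 2@var{n} - 1 symbols for a sequence of @var{n} elements.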
	3412
	3413	@cindex mutual recursion
	3414	@dfn{Indirect} or @dfn{mutual} recursion occurs when the result of the
	3415	rule does not appear directly on its right hand side, but does appear
	3416	in rules for other nonterminals which do appear on its right hand
	3417	side.
	3418
	3419	For example:
	3420
	3421	@example
	3422	@group
	3423	expr:
	3424	primary
	3425	| primary '+' primary
	3426	;
	3427	@end group
	3428
	3429	@group
	3430	primary:
	3431	constant
	3432	| '(' expr ')'
	3433	;
	3434	@end group
	3435	@end example
	3436
	3437	@noindent
	3438	defines two mutually-recursive nonterminals, since each refers to the
	3439	other.
	3440
	3441	@node Semantics
	3442	@section Defining Language Semantics
	3443	@cindex defining language semantics
	3444	@cindex language semantics, defining
	3445
	3446	The grammar rules for a language determine only the syntax. The semantics
	3447	are determined by the semantic values associated with various tokens and
	3448	groupings, and by the actions taken when various groupings are recognized.
	3449
	3450	For example, the calculator calculates properly because the value
	3451	associated with each expression is the proper number; it adds properly
	3452	because the action for the grouping @w{@samp{@var{x} + @var{y}}} is to add
	3453	the numbers associated with @var{x} and @var{y}.
	3454
	3455	@menu
	3456	* Value Type:: Specifying one data type for all semantic values.
	3457	* Multiple Types:: Specifying several alternative data types.
	3458	* Actions:: An action is the semantic definition of a grammar rule.
	3459	* Action Types:: Specifying data types for actions to operate on.
	3460	* Mid-Rule Actions:: Most actions go at the end of a rule.
	3461	This says when, why and how to use the exceptional
	3462	action in the middle of a rule.
	3463	@end menu
	3464
	3465	@node Value Type
	3466	@subsection Data Types of Semantic Values
	3467	@cindex semantic value type
	3468	@cindex value type, semantic
	3469	@cindex data types of semantic values
	3470	@cindex default data type
	3471
	3472	In a simple program it may be sufficient to use the same data type for
	3473	the semantic values of all language constructs. This was true in the
	3474	RPN and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish
	3475	Notation Calculator}).
	3476
	3477	Bison normally uses the type @code{int} for semantic values if your
	3478	program uses the same data type for all language constructs. To
	3479	specify some other type, define @code{YYSTYPE} as a macro, like this:
	3480
	3481	@example
	3482	#define YYSTYPE double
	3483	@end example
	3484
	3485	@noindent
	3486	@code{YYSTYPE}'s replacement list should be a type name
	3487	that does not contain parentheses or square brackets.
	3488	This macro definition must go in the prologue of the grammar file
	3489	(@pxref{Grammar Outline, ,Outline of a Bison Grammar}).
	3490
	3491	@node Multiple Types
	3492	@subsection More Than One Value Type
	3493
	3494	In most programs, you will need different data types for different kinds
	3495	of tokens and groupings. For example, a numeric constant may need type
	3496	@code{int} or @code{long int}, while a string constant needs type
	3497	@code{char *}, and an identifier might need a pointer to an entry in the
	3498	symbol table.
	3499
	3500	To use more than one data type for semantic values in one parser, Bison
	3501	requires you to do two things:
	3502
	3503	@itemize @bullet
	3504	@item
	3505	Specify the entire collection of possible data types, either by using the
	3506	@code{%union} Bison declaration (@pxref{Union Decl, ,The Collection of
	3507	Value Types}), or by using a @code{typedef} or a @code{#define} to
	3508	define @code{YYSTYPE} to be a union type whose member names are
	3509	the type tags.
	3510
	3511	@item
	3512	Choose one of those types for each symbol (terminal or nonterminal) for
	3513	which semantic values are used. This is done for tokens with the
	3514	@code{%token} Bison declaration (@pxref{Token Decl, ,Token Type Names})
	3515	and for groupings with the @code{%type} Bison declaration (@pxref{Type
	3516	Decl, ,Nonterminal Symbols}).
	3517	@end itemize
	3518
	3519	@node Actions
	3520	@subsection Actions
	3521	@cindex action
	3522	@vindex $$
	3523	@vindex $@var{n}
	3524	@vindex $@var{name}
	3525	@vindex $[@var{name}]
	3526
	3527	An action accompanies a syntactic rule and contains C code to be executed
	3528	each time an instance of that rule is recognized. The task of most actions
	3529	is to compute a semantic value for the grouping built by the rule from the
	3530	semantic values associated with tokens or smaller groupings.
	3531
	3532	An action consists of braced code containing C statements, and can be
	3533	placed at any position in the rule;
	3534	it is executed at that position. Most rules have just one action at the
	3535	end of the rule, following all the components. Actions in the middle of
	3536	a rule are tricky and used only for special purposes (@pxref{Mid-Rule
	3537	Actions, ,Actions in Mid-Rule}).
	3538
	3539	The C code in an action can refer to the semantic values of the
	3540	components matched by the rule with the construct @code{$@var{n}},
	3541	which stands for the value of the @var{n}th component. The semantic
	3542	value for the grouping being constructed is @code{$$}. In addition,
	3543	the semantic values of symbols can be accessed with the named
	3544	references construct @code{$@var{name}} or @code{$[@var{name}]}.
	3545	Bison translates both of these constructs into expressions of the
	3546	appropriate type when it copies the actions into the parser
	3547	implementation file. @code{$$} (or @code{$@var{name}}, when it stands
	3548	for the current grouping) is translated to a modifiable lvalue, so it
	3549	can be assigned to.
	3550
	3551	Here is a typical example:
	3552
	3553	@example
	3554	@group
	3555	exp:
	3556	@dots{}
	3557	| exp '+' exp @{ $$ = $1 + $3; @}
	3558	@end group
	3559	@end example
	3560
	3561	Or, in terms of named references:
	3562
	3563	@example
	3564	@group
	3565	exp[result]:
	3566	@dots{}
	3567	| exp[left] '+' exp[right] @{ $result = $left + $right; @}
	3568	@end group
	3569	@end example
	3570
	3571	@noindent
	3572	This rule constructs an @code{exp} from two smaller @code{exp} groupings
	3573	connected by a plus-sign token. In the action, @code{$1} and @code{$3}
	3574	(@code{$left} and @code{$right})
	3575	refer to the semantic values of the two component @code{exp} groupings,
	3576	which are the first and third symbols on the right hand side of the rule.
	3577	The sum is stored into @code{$$} (@code{$result}) so that it becomes the
	3578	semantic value of
	3579	the addition-expression just recognized by the rule. If there were a
	3580	useful semantic value associated with the @samp{+} token, it could be
	3581	referred to as @code{$2}.
	3582
	3583	@xref{Named References}, for more information about using the named
	3584	references construct.
	3585
	3586	Note that the vertical-bar character @samp{|} is really a rule
	3587	separator, and actions are attached to a single rule. This differs
	3588	from tools like Flex, for which @samp{|} stands for either
	3589	``or'', or ``the same action as that of the next rule''. In the
	3590	following example, the action is triggered only when @samp{b} is found:
	3591
	3592	@example
	3593	@group
	3594	a-or-b: 'a'|'b' @{ a_or_b_found = 1; @};
	3595	@end group
	3596	@end example
	3597
	3598	@cindex default action
	3599	If you don't specify an action for a rule, Bison supplies a default:
	3600	@w{@code{$$ = $1}.} Thus, the value of the first symbol in the rule
	3601	becomes the value of the whole rule. Of course, the default action is
	3602	valid only if the two data types match. There is no meaningful default
	3603	action for an empty rule; every empty rule must have an explicit action
	3604	unless the rule's value does not matter.
	3605
	3606	@code{$@var{n}} with @var{n} zero or negative is allowed for reference
	3607	to tokens and groupings on the stack @emph{before} those that match the
	3608	current rule. This is a very risky practice, and to use it reliably
	3609	you must be certain of the context in which the rule is applied. Here
	3610	is a case in which you can use this reliably:
	3611
	3612	@example
	3613	@group
	3614	foo:
	3615	expr bar '+' expr @{ @dots{} @}
	3616	| expr bar '-' expr @{ @dots{} @}
	3617	;
	3618	@end group
	3619
	3620	@group
	3621	bar:
	3622	/* empty */ @{ previous_expr = $0; @}
	3623	;
	3624	@end group
	3625	@end example
	3626
	3627	As long as @code{bar} is used only in the fashion shown here, @code{$0}
	3628	always refers to the @code{expr} which precedes @code{bar} in the
	3629	definition of @code{foo}.
	3630
	3631	@vindex yylval
	3632	It is also possible to access the semantic value of the lookahead token, if
	3633	any, from a semantic action.
	3634	This semantic value is stored in @code{yylval}.
	3635	@xref{Action Features, ,Special Features for Use in Actions}.
	3636
	3637	@node Action Types
	3638	@subsection Data Types of Values in Actions
	3639	@cindex action data types
	3640	@cindex data types in actions
	3641
	3642	If you have chosen a single data type for semantic values, the @code{$$}
	3643	and @code{$@var{n}} constructs always have that data type.
	3644
	3645	If you have used @code{%union} to specify a variety of data types, then you
	3646	must declare a choice among these types for each terminal or nonterminal
	3647	symbol that can have a semantic value. Then each time you use @code{$$} or
	3648	@code{$@var{n}}, its data type is determined by which symbol it refers to
	3649	in the rule. In this example,
	3650
	3651	@example
	3652	@group
	3653	exp:
	3654	@dots{}
	3655	| exp '+' exp @{ $$ = $1 + $3; @}
	3656	@end group
	3657	@end example
	3658
	3659	@noindent
	3660	@code{$1} and @code{$3} refer to instances of @code{exp}, so they both
	3661	have the data type declared for the nonterminal symbol @code{exp}. If
	3662	@code{$2} were used, it would have the data type declared for the
	3663	terminal symbol @code{'+'}, whatever that might be.
	3664
	3665	Alternatively, you can specify the data type when you refer to the value,
	3666	by inserting @samp{<@var{type}>} after the @samp{$} at the beginning of the
	3667	reference. For example, if you have defined types as shown here:
	3668
	3669	@example
	3670	@group
	3671	%union @{
	3672	int itype;
	3673	double dtype;
	3674	@}
	3675	@end group
	3676	@end example
	3677
	3678	@noindent
	3679	then you can write @code{$<itype>1} to refer to the first subunit of the
	3680	rule as an integer, or @code{$<dtype>1} to refer to it as a double.
	3681
	3682	@node Mid-Rule Actions
	3683	@subsection Actions in Mid-Rule
	3684	@cindex actions in mid-rule
	3685	@cindex mid-rule actions
	3686
	3687	Occasionally it is useful to put an action in the middle of a rule.
	3688	These actions are written just like usual end-of-rule actions, but they
	3689	are executed before the parser even recognizes the following components.
	3690
	3691	A mid-rule action may refer to the components preceding it using
	3692	@code{$@var{n}}, but it may not refer to subsequent components because
	3693	it is run before they are parsed.
	3694
	3695	The mid-rule action itself counts as one of the components of the rule.
	3696	This makes a difference when there is another action later in the same rule
	3697	(and usually there is another at the end): you have to count the actions
	3698	along with the symbols when working out which number @var{n} to use in
	3699	@code{$@var{n}}.
	3700
	3701	The mid-rule action can also have a semantic value. The action can set
	3702	its value with an assignment to @code{$$}, and actions later in the rule
	3703	can refer to the value using @code{$@var{n}}. Since there is no symbol
	3704	to name the action, there is no way to declare a data type for the value
	3705	in advance, so you must use the @samp{$<@dots{}>@var{n}} construct to
	3706	specify a data type each time you refer to this value.
	3707
	3708	There is no way to set the value of the entire rule with a mid-rule
	3709	action, because assignments to @code{$$} do not have that effect. The
	3710	only way to set the value for the entire rule is with an ordinary action
	3711	at the end of the rule.
	3712
	3713	Here is an example from a hypothetical compiler, handling a @code{let}
	3714	statement that looks like @samp{let (@var{variable}) @var{statement}} and
	3715	serves to create a variable named @var{variable} temporarily for the
	3716	duration of @var{statement}. To parse this construct, we must put
	3717	@var{variable} into the symbol table while @var{statement} is parsed, then
	3718	remove it afterward. Here is how it is done:
	3719
	3720	@example
	3721	@group
	3722	stmt:
	3723	LET '(' var ')'
	3724	@{ $<context>$ = push_context (); declare_variable ($3); @}
	3725	stmt
	3726	@{ $$ = $6; pop_context ($<context>5); @}
	3727	@end group
	3728	@end example
	3729
	3730	@noindent
	3731	As soon as @samp{let (@var{variable})} has been recognized, the first
	3732	action is run. It saves a copy of the current semantic context (the
	3733	list of accessible variables) as its semantic value, using alternative
	3734	@code{context} in the data-type union. Then it calls
	3735	@code{declare_variable} to add the new variable to that list. Once the
	3736	first action is finished, the embedded statement @code{stmt} can be
	3737	parsed. Note that the mid-rule action is component number 5, so the
	3738	@samp{stmt} is component number 6.
	3739
	3740	After the embedded statement is parsed, its semantic value becomes the
	3741	value of the entire @code{let}-statement. Then the semantic value from the
	3742	earlier action is used to restore the prior list of variables. This
	3743	removes the temporary @code{let}-variable from the list so that it won't
	3744	appear to exist while the rest of the program is parsed.
	3745
	3746	@findex %destructor
	3747	@cindex discarded symbols, mid-rule actions
	3748	@cindex error recovery, mid-rule actions
	3749	In the above example, if the parser initiates error recovery (@pxref{Error
	3750	Recovery}) while parsing the tokens in the embedded statement @code{stmt},
	3751	it might discard the previous semantic context @code{$<context>5} without
	3752	restoring it.
	3753	Thus, @code{$<context>5} needs a destructor (@pxref{Destructor Decl, , Freeing
	3754	Discarded Symbols}).
	3755	However, Bison currently provides no means to declare a destructor specific to
	3756	a particular mid-rule action's semantic value.
	3757
	3758	One solution is to bury the mid-rule action inside a nonterminal symbol and to
	3759	declare a destructor for that symbol:
	3760
	3761	@example
	3762	@group
	3763	%type <context> let
	3764	%destructor @{ pop_context ($$); @} let
	3765
	3766	%%
	3767
	3768	stmt:
	3769	let stmt
	3770	@{
	3771	$$ = $2;
	3772	pop_context ($1);
	3773	@};
	3774
	3775	let:
	3776	LET '(' var ')'
	3777	@{
	3778	$$ = push_context ();
	3779	declare_variable ($3);
	3780	@};
	3781
	3782	@end group
	3783	@end example
	3784
	3785	@noindent
	3786	Note that the action is now at the end of its rule.
	3787	Any mid-rule action can be converted to an end-of-rule action in this way, and
	3788	this is what Bison actually does to implement mid-rule actions.
	3789
	3790	Taking action before a rule is completely recognized often leads to
	3791	conflicts since the parser must commit to a parse in order to execute the
	3792	action. For example, the following two rules, without mid-rule actions,
	3793	can coexist in a working parser because the parser can shift the open-brace
	3794	token and look at what follows before deciding whether there is a
	3795	declaration or not:
	3796
	3797	@example
	3798	@group
	3799	compound:
	3800	'@{' declarations statements '@}'
	3801	| '@{' statements '@}'
	3802	;
	3803	@end group
	3804	@end example
	3805
	3806	@noindent
	3807	But when we add a mid-rule action as follows, the rules become nonfunctional:
	3808
	3809	@example
	3810	@group
	3811	compound:
	3812	@{ prepare_for_local_variables (); @}
	3813	'@{' declarations statements '@}'
	3814	@end group
	3815	@group
	3816	| '@{' statements '@}'
	3817	;
	3818	@end group
	3819	@end example
	3820
	3821	@noindent
	3822	Now the parser is forced to decide whether to run the mid-rule action
	3823	when it has read no farther than the open-brace. In other words, it
	3824	must commit to using one rule or the other, without sufficient
	3825	information to do it correctly. (The open-brace token is what is called
	3826	the @dfn{lookahead} token at this time, since the parser is still
	3827	deciding what to do about it. @xref{Lookahead, ,Lookahead Tokens}.)
	3828
	3829	You might think that you could correct the problem by putting identical
	3830	actions into the two rules, like this:
	3831
	3832	@example
	3833	@group
	3834	compound:
	3835	@{ prepare_for_local_variables (); @}
	3836	'@{' declarations statements '@}'
	3837	| @{ prepare_for_local_variables (); @}
	3838	'@{' statements '@}'
	3839	;
	3840	@end group
	3841	@end example
	3842
	3843	@noindent
	3844	But this does not help, because Bison does not realize that the two actions
	3845	are identical. (Bison never tries to understand the C code in an action.)
	3846
	3847	If the grammar is such that a declaration can be distinguished from a
	3848	statement by the first token (which is true in C), then one solution which
	3849	does work is to put the action after the open-brace, like this:
	3850
	3851	@example
	3852	@group
	3853	compound:
	3854	'@{' @{ prepare_for_local_variables (); @}
	3855	declarations statements '@}'
	3856	| '@{' statements '@}'
	3857	;
	3858	@end group
	3859	@end example
	3860
	3861	@noindent
	3862	Now the first token of the following declaration or statement,
	3863	which would in any case tell Bison which rule to use, can still do so.
	3864
	3865	Another solution is to bury the action inside a nonterminal symbol which
	3866	serves as a subroutine:
	3867
	3868	@example
	3869	@group
	3870	subroutine:
	3871	/* empty */ @{ prepare_for_local_variables (); @}
	3872	;
	3873	@end group
	3874
	3875	@group
	3876	compound:
	3877	subroutine '@{' declarations statements '@}'
	3878	| subroutine '@{' statements '@}'
	3879	;
	3880	@end group
	3881	@end example
	3882
	3883	@noindent
	3884	Now Bison can execute the action in the rule for @code{subroutine} without
	3885	deciding which rule for @code{compound} it will eventually use.
	3886
	3887	@node Tracking Locations
	3888	@section Tracking Locations
	3889	@cindex location
	3890	@cindex textual location
	3891	@cindex location, textual
	3892
	3893	Though grammar rules and semantic actions are enough to write a fully
	3894	functional parser, it can be useful to process some additional information,
	3895	especially symbol locations.
	3896
	3897	The way locations are handled is defined by providing a data type, and
	3898	actions to take when rules are matched.
	3899
	3900	@menu
	3901	* Location Type:: Specifying a data type for locations.
	3902	* Actions and Locations:: Using locations in actions.
	3903	* Location Default Action:: Defining a general way to compute locations.
	3904	@end menu
	3905
	3906	@node Location Type
	3907	@subsection Data Type of Locations
	3908	@cindex data type of locations
	3909	@cindex default location type
	3910
	3911	Defining a data type for locations is much simpler than for semantic values,
	3912	since all tokens and groupings always use the same type.
	3913
	3914	You can specify the type of locations by defining a macro called
	3915	@code{YYLTYPE}, just as you can specify the semantic value type by
	3916	defining a @code{YYSTYPE} macro (@pxref{Value Type}).
	3917	When @code{YYLTYPE} is not defined, Bison uses a default structure type with
	3918	four members:
	3919
	3920	@example
	3921	typedef struct YYLTYPE
	3922	@{
	3923	int first_line;
	3924	int first_column;
	3925	int last_line;
	3926	int last_column;
	3927	@} YYLTYPE;
	3928	@end example
	3929
	3930	When @code{YYLTYPE} is not defined, at the beginning of the parsing, Bison
	3931	initializes all these fields to 1 for @code{yylloc}. To initialize
	3932	@code{yylloc} with a custom location type (or to choose a different
	3933	initialization), use the @code{%initial-action} directive. @xref{Initial
	3934	Action Decl, , Performing Actions before Parsing}.
	3935
	3936	@node Actions and Locations
	3937	@subsection Actions and Locations
	3938	@cindex location actions
	3939	@cindex actions, location
	3940	@vindex @@$
	3941	@vindex @@@var{n}
	3942	@vindex @@@var{name}
	3943	@vindex @@[@var{name}]
	3944
	3945	Actions are not only useful for defining language semantics, but also for
	3946	describing the behavior of the output parser with locations.
	3947
	3948	The most obvious way to build locations of syntactic groupings is very
	3949	similar to the way semantic values are computed. In a given rule, several
	3950	constructs can be used to access the locations of the elements being matched.
	3951	The location of the @var{n}th component of the right hand side is
	3952	@code{@@@var{n}}, while the location of the left hand side grouping is
	3953	@code{@@$}.
	3954
	3955	In addition, the named references construct @code{@@@var{name}} and
	3956	@code{@@[@var{name}]} may also be used to address the symbol locations.
	3957	@xref{Named References}, for more information about using the named
	3958	references construct.
	3959
	3960	Here is a basic example using the default data type for locations:
	3961
	3962	@example
	3963	@group
	3964	exp:
	3965	@dots{}
	3966	| exp '/' exp
	3967	@{
	3968	@@$.first_column = @@1.first_column;
	3969	@@$.first_line = @@1.first_line;
	3970	@@$.last_column = @@3.last_column;
	3971	@@$.last_line = @@3.last_line;
	3972	if ($3)
	3973	$$ = $1 / $3;
	3974	else
	3975	@{
	3976	$$ = 1;
	3977	fprintf (stderr,
	3978	"Division by zero, l%d,c%d-l%d,c%d",
	3979	@@3.first_line, @@3.first_column,
	3980	@@3.last_line, @@3.last_column);
	3981	@}
	3982	@}
	3983	@end group
	3984	@end example
	3985
	3986	As for semantic values, there is a default action for locations that is
	3987	run each time a rule is matched. It sets the beginning of @code{@@$} to the
	3988	beginning of the first symbol, and the end of @code{@@$} to the end of the
	3989	last symbol.
	3990
	3991	With this default action, the location tracking can be fully automatic. The
	3992	example above can simply be rewritten this way:
	3993
	3994	@example
	3995	@group
	3996	exp:
	3997	@dots{}
	3998	| exp '/' exp
	3999	@{
	4000	if ($3)
	4001	$$ = $1 / $3;
	4002	else
	4003	@{
	4004	$$ = 1;
	4005	fprintf (stderr,
	4006	"Division by zero, l%d,c%d-l%d,c%d",
	4007	@@3.first_line, @@3.first_column,
	4008	@@3.last_line, @@3.last_column);
	4009	@}
	4010	@}
	4011	@end group
	4012	@end example
	4013
	4014	@vindex yylloc
	4015	It is also possible to access the location of the lookahead token, if any,
	4016	from a semantic action.
	4017	This location is stored in @code{yylloc}.
	4018	@xref{Action Features, ,Special Features for Use in Actions}.
	4019
	4020	@node Location Default Action
	4021	@subsection Default Action for Locations
	4022	@vindex YYLLOC_DEFAULT
	4023	@cindex GLR parsers and @code{YYLLOC_DEFAULT}
	4024
	4025	Actually, actions are not the best place to compute locations. Since
	4026	locations are much more general than semantic values, there is room in
	4027	the output parser to redefine the default action to take for each
	4028	rule. The @code{YYLLOC_DEFAULT} macro is invoked each time a rule is
	4029	matched, before the associated action is run. It is also invoked
	4030	while processing a syntax error, to compute the error's location.
	4031	Before reporting an unresolvable syntactic ambiguity, a GLR
	4032	parser invokes @code{YYLLOC_DEFAULT} recursively to compute the location
	4033	of that ambiguity.
	4034
	4035	Most of the time, this macro is general enough to eliminate
	4036	location-dedicated code from semantic actions.
	4037
	4038	The @code{YYLLOC_DEFAULT} macro takes three parameters. The first one is
	4039	the location of the grouping (the result of the computation). When a
	4040	rule is matched, the second parameter identifies locations of
	4041	all right hand side elements of the rule being matched, and the third
	4042	parameter is the size of the rule's right hand side.
	4043	When a GLR parser reports an ambiguity, which of multiple candidate
	4044	right hand sides it passes to @code{YYLLOC_DEFAULT} is undefined.
	4045	When processing a syntax error, the second parameter identifies locations
	4046	of the symbols that were discarded during error processing, and the third
	4047	parameter is the number of discarded symbols.
	4048
	4049	By default, @code{YYLLOC_DEFAULT} is defined this way:
	4050
	4051	@example
	4052	@group
	4053	# define YYLLOC_DEFAULT(Cur, Rhs, N) \
	4054	do \
	4055	if (N) \
	4056	@{ \
	4057	(Cur).first_line = YYRHSLOC(Rhs, 1).first_line; \
	4058	(Cur).first_column = YYRHSLOC(Rhs, 1).first_column; \
	4059	(Cur).last_line = YYRHSLOC(Rhs, N).last_line; \
	4060	(Cur).last_column = YYRHSLOC(Rhs, N).last_column; \
	4061	@} \
	4062	else \
	4063	@{ \
	4064	(Cur).first_line = (Cur).last_line = \
	4065	YYRHSLOC(Rhs, 0).last_line; \
	4066	(Cur).first_column = (Cur).last_column = \
	4067	YYRHSLOC(Rhs, 0).last_column; \
	4068	@} \
	4069	while (0)
	4070	@end group
	4071	@end example
	4072
	4073	@noindent
	4074	where @code{YYRHSLOC (rhs, k)} is the location of the @var{k}th symbol
	4075	in @var{rhs} when @var{k} is positive, and the location of the symbol
	4076	just before the reduction when @var{k} and @var{n} are both zero.
	4077
	4078	When defining @code{YYLLOC_DEFAULT}, you should consider that:
	4079
	4080	@itemize @bullet
	4081	@item
	4082	All arguments are free of side-effects. However, only the first one (the
	4083	result) should be modified by @code{YYLLOC_DEFAULT}.
	4084
	4085	@item
	4086	For consistency with semantic actions, valid indexes within the
	4087	right hand side range from 1 to @var{n}. When @var{n} is zero, only 0 is a
	4088	valid index, and it refers to the symbol just before the reduction.
	4089	During error processing @var{n} is always positive.
	4090
	4091	@item
	4092	Your macro should parenthesize its arguments, if need be, since the
	4093	actual arguments may not be surrounded by parentheses. Also, your
	4094	macro should expand to something that can be used as a single
	4095	statement when it is followed by a semicolon.
	4096	@end itemize
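
As a rough illustration of the default behavior described above, the following C sketch applies the macro to a small array of locations. It is not code from a generated parser: the @code{YYRHSLOC} definition (plain array indexing, with slot 0 holding the symbol just before the reduction) and the helper @code{span_of} are assumptions made only for this example.

```c
#include <assert.h>

/* Same four fields as Bison's default location type.  */
typedef struct YYLTYPE
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

/* Assumption for this sketch: Rhs points at the slot just before the
   first right-hand side symbol, so index 0 is the preceding symbol.  */
#define YYRHSLOC(Rhs, K) ((Rhs)[K])

#define YYLLOC_DEFAULT(Cur, Rhs, N)                                \
  do                                                               \
    if (N)                                                         \
      {                                                            \
        (Cur).first_line   = YYRHSLOC (Rhs, 1).first_line;         \
        (Cur).first_column = YYRHSLOC (Rhs, 1).first_column;       \
        (Cur).last_line    = YYRHSLOC (Rhs, N).last_line;          \
        (Cur).last_column  = YYRHSLOC (Rhs, N).last_column;        \
      }                                                            \
    else                                                           \
      {                                                            \
        (Cur).first_line   = (Cur).last_line   =                   \
          YYRHSLOC (Rhs, 0).last_line;                             \
        (Cur).first_column = (Cur).last_column =                   \
          YYRHSLOC (Rhs, 0).last_column;                           \
      }                                                            \
  while (0)

/* Hypothetical helper: location of a grouping whose right-hand side
   has N symbols stored in rhs[1] .. rhs[N].  */
static YYLTYPE
span_of (YYLTYPE const *rhs, int n)
{
  YYLTYPE cur;
  assert (n >= 0);
  YYLLOC_DEFAULT (cur, rhs, n);
  return cur;
}
```

With a nonempty right-hand side, the result runs from the start of the first symbol to the end of the last one. With an empty right-hand side, it collapses to the end position of the symbol in slot 0.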
	4097
	4098	@node Named References
	4099	@section Named References
	4100	@cindex named references
	4101
	4102	As described in the preceding sections, the traditional way to refer to any
	4103	semantic value or location is a @dfn{positional reference}, which takes the
	4104	form @code{$@var{n}}, @code{$$}, @code{@@@var{n}}, and @code{@@$}. However,
	4105	such a reference is not very descriptive. Moreover, if you later decide to
	4106	insert or remove symbols in the right-hand side of a grammar rule, the need
	4107	to renumber such references can be tedious and error-prone.
	4108
	4109	To avoid these issues, you can also refer to a semantic value or location
	4110	using a @dfn{named reference}. First of all, original symbol names may be
	4111	used as named references. For example:
	4112
	4113	@example
	4114	@group
	4115	invocation: op '(' args ')'
	4116	@{ $invocation = new_invocation ($op, $args, @@invocation); @}
	4117	@end group
	4118	@end example
	4119
	4120	@noindent
	4121	Positional and named references can be mixed arbitrarily. For example:
	4122
	4123	@example
	4124	@group
	4125	invocation: op '(' args ')'
	4126	@{ $$ = new_invocation ($op, $args, @@$); @}
	4127	@end group
	4128	@end example
	4129
	4130	@noindent
	4131	However, sometimes regular symbol names are not sufficient due to
	4132	ambiguities:
	4133
	4134	@example
	4135	@group
	4136	exp: exp '/' exp
	4137	@{ $exp = $exp / $exp; @} // $exp is ambiguous.
	4138
	4139	exp: exp '/' exp
	4140	@{ $$ = $1 / $exp; @} // One usage is ambiguous.
	4141
	4142	exp: exp '/' exp
	4143	@{ $$ = $1 / $3; @} // No error.
	4144	@end group
	4145	@end example
	4146
	4147	@noindent
	4148	When ambiguity occurs, explicitly declared names may be used for values and
	4149	locations. Explicit names are declared as a bracketed name after a symbol
	4150	appearance in rule definitions. For example:
	4151	@example
	4152	@group
	4153	exp[result]: exp[left] '/' exp[right]
	4154	@{ $result = $left / $right; @}
	4155	@end group
	4156	@end example
	4157
	4158	@noindent
	4159	In order to access a semantic value generated by a mid-rule action, an
	4160	explicit name may also be declared by putting a bracketed name after the
	4161	closing brace of the mid-rule action code:
	4162	@example
	4163	@group
	4164	exp[res]: exp[x] '+' @{$left = $x;@}[left] exp[right]
	4165	@{ $res = $left + $right; @}
	4166	@end group
	4167	@end example
	4168
	4169	@noindent
	4170
	4171	To refer to names containing dots or dashes, the explicit bracketed
	4172	syntax @code{$[name]} and @code{@@[name]} must be used:
	4173	@example
	4174	@group
	4175	if-stmt: "if" '(' expr ')' "then" then.stmt ';'
	4176	@{ $[if-stmt] = new_if_stmt ($expr, $[then.stmt]); @}
	4177	@end group
	4178	@end example
	4179
	4180	It often happens that named references are followed by a dot, dash or other
	4181	C punctuation marks and operators. By default, Bison will read
	4182	@samp{$name.suffix} as a reference to symbol value @code{$name} followed by
	4183	@samp{.suffix}, i.e., an access to the @code{suffix} field of the semantic
	4184	value. In order to force Bison to recognize @samp{name.suffix} in its
	4185	entirety as the name of a semantic value, the bracketed syntax
	4186	@samp{$[name.suffix]} must be used.
	4187
	4188	The named references feature is experimental. More user feedback will help
	4189	to stabilize it.
	4190
	4191	@node Declarations
	4192	@section Bison Declarations
	4193	@cindex declarations, Bison
	4194	@cindex Bison declarations
	4195
	4196	The @dfn{Bison declarations} section of a Bison grammar defines the symbols
	4197	used in formulating the grammar and the data types of semantic values.
	4198	@xref{Symbols}.
	4199
	4200	All token type names (but not single-character literal tokens such as
	4201	@code{'+'} and @code{'*'}) must be declared. Nonterminal symbols must be
	4202	declared if you need to specify which data type to use for the semantic
	4203	value (@pxref{Multiple Types, ,More Than One Value Type}).
	4204
	4205	The first rule in the grammar file also specifies the start symbol, by
	4206	default. If you want some other symbol to be the start symbol, you
	4207	must declare it explicitly (@pxref{Language and Grammar, ,Languages
	4208	and Context-Free Grammars}).
	4209
	4210	@menu
	4211	* Require Decl:: Requiring a Bison version.
	4212	* Token Decl:: Declaring terminal symbols.
	4213	* Precedence Decl:: Declaring terminals with precedence and associativity.
	4214	* Union Decl:: Declaring the set of all semantic value types.
	4215	* Type Decl:: Declaring the choice of type for a nonterminal symbol.
	4216	* Initial Action Decl:: Code run before parsing starts.
	4217	* Destructor Decl:: Declaring how symbols are freed.
	4218	* Expect Decl:: Suppressing warnings about parsing conflicts.
	4219	* Start Decl:: Specifying the start symbol.
	4220	* Pure Decl:: Requesting a reentrant parser.
	4221	* Push Decl:: Requesting a push parser.
	4222	* Decl Summary:: Table of all Bison declarations.
	4223	* %define Summary:: Defining variables to adjust Bison's behavior.
	4224	* %code Summary:: Inserting code into the parser source.
	4225	@end menu
	4226
	4227	@node Require Decl
	4228	@subsection Require a Version of Bison
	4229	@cindex version requirement
	4230	@cindex requiring a version of Bison
	4231	@findex %require
	4232
	4233	You may require a minimum version of Bison to process the grammar. If
	4234	the requirement is not met, @command{bison} exits with an error (exit
	4235	status 63).
	4236
	4237	@example
	4238	%require "@var{version}"
	4239	@end example
	4240
	4241	@node Token Decl
	4242	@subsection Token Type Names
	4243	@cindex declaring token type names
	4244	@cindex token type names, declaring
	4245	@cindex declaring literal string tokens
	4246	@findex %token
	4247
	4248	The basic way to declare a token type name (terminal symbol) is as follows:
	4249
	4250	@example
	4251	%token @var{name}
	4252	@end example
	4253
	4254	Bison will convert this into a @code{#define} directive in
	4255	the parser, so that the function @code{yylex} (if it is in this file)
	4256	can use the name @var{name} to stand for this token type's code.
	4257
	4258	Alternatively, you can use @code{%left}, @code{%right}, or
	4259	@code{%nonassoc} instead of @code{%token}, if you wish to specify
	4260	associativity and precedence. @xref{Precedence Decl, ,Operator
	4261	Precedence}.
	4262
	4263	You can explicitly specify the numeric code for a token type by appending
	4264	a nonnegative decimal or hexadecimal integer value in the field immediately
	4265	following the token name:
	4266
	4267	@example
	4268	%token NUM 300
	4269	%token XNUM 0x12d // a GNU extension
	4270	@end example
	4271
	4272	@noindent
	4273	It is generally best, however, to let Bison choose the numeric codes for
	4274	all token types. Bison will automatically select codes that don't conflict
	4275	with each other or with normal characters.
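
The following C sketch shows how a scanner might use such names alongside single-character literal tokens. The two @code{#define} lines only stand in for what Bison would emit for the declarations above, and @code{token_code} is a hypothetical helper, not part of any Bison interface.

```c
#include <ctype.h>

/* Stand-ins for the definitions Bison would emit for
   "%token NUM 300" and "%token XNUM 0x12d".  */
#define NUM 300
#define XNUM 0x12d

/* Hypothetical classifier: digits become NUM; any other character is
   returned as its own code, as for a literal token such as '+'.  */
static int
token_code (int c)
{
  if (isdigit (c))
    return NUM;
  return c;
}
```

Both explicit codes lie above the range of ordinary character codes, which is what keeps named tokens and character literals from colliding in this sketch.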
	4276
	4277	In the event that the stack type is a union, you must augment the
	4278	@code{%token} or other token declaration to include the data type
	4279	alternative delimited by angle-brackets (@pxref{Multiple Types, ,More
	4280	Than One Value Type}).
	4281
	4282	For example:
	4283
	4284	@example
	4285	@group
	4286	%union @{ /* define stack type */
	4287	double val;
	4288	symrec *tptr;
	4289	@}
	4290	%token <val> NUM /* define token NUM and its type */
	4291	@end group
	4292	@end example
	4293
	4294	You can associate a literal string token with a token type name by
	4295	writing the literal string at the end of a @code{%token}
	4296	declaration which declares the name. For example:
	4297
	4298	@example
	4299	%token arrow "=>"
	4300	@end example
	4301
	4302	@noindent
	4303	Similarly, a grammar for the C language might specify these names with
	4304	equivalent literal string tokens:
	4305
	4306	@example
	4307	%token <operator> OR "||"
	4308	%token <operator> LE 134 "<="
	4309	%left OR "<="
	4310	@end example
	4311
	4312	@noindent
	4313	Once you equate the literal string and the token name, you can use them
	4314	interchangeably in further declarations or the grammar rules. The
	4315	@code{yylex} function can use the token name or the literal string to
	4316	obtain the token type code number (@pxref{Calling Convention}).
	4317	Syntax error messages passed to @code{yyerror} from the parser will reference
	4318	the literal string instead of the token name.
	4319
	4320	The token numbered as 0 corresponds to end of file; the following line
	4321	allows for nicer error messages referring to ``end of file'' instead
	4322	of ``$end'':
	4323
	4324	@example
	4325	%token END 0 "end of file"
	4326	@end example
	4327
	4328	@node Precedence Decl
	4329	@subsection Operator Precedence
	4330	@cindex precedence declarations
	4331	@cindex declaring operator precedence
	4332	@cindex operator precedence, declaring
	4333
	4334	Use the @code{%left}, @code{%right} or @code{%nonassoc} declaration to
	4335	declare a token and specify its precedence and associativity, all at
	4336	once. These are called @dfn{precedence declarations}.
	4337	@xref{Precedence, ,Operator Precedence}, for general information on
	4338	operator precedence.
	4339
	4340	The syntax of a precedence declaration is nearly the same as that of
	4341	@code{%token}: either
	4342
	4343	@example
	4344	%left @var{symbols}@dots{}
	4345	@end example
	4346
	4347	@noindent
	4348	or
	4349
	4350	@example
	4351	%left <@var{type}> @var{symbols}@dots{}
	4352	@end example
	4353
	4354	And indeed any of these declarations serves the purposes of @code{%token}.
	4355	But in addition, they specify the associativity and relative precedence for
	4356	all the @var{symbols}:
	4357
	4358	@itemize @bullet
	4359	@item
	4360	The associativity of an operator @var{op} determines how repeated uses
	4361	of the operator nest: whether @samp{@var{x} @var{op} @var{y} @var{op}
	4362	@var{z}} is parsed by grouping @var{x} with @var{y} first or by
	4363	grouping @var{y} with @var{z} first. @code{%left} specifies
	4364	left-associativity (grouping @var{x} with @var{y} first) and
	4365	@code{%right} specifies right-associativity (grouping @var{y} with
	4366	@var{z} first). @code{%nonassoc} specifies no associativity, which
	4367	means that @samp{@var{x} @var{op} @var{y} @var{op} @var{z}} is
	4368	considered a syntax error.
	4369
	4370	@item
	4371	The precedence of an operator determines how it nests with other operators.
	4372	All the tokens declared in a single precedence declaration have equal
	4373	precedence and nest together according to their associativity.
	4374	When two tokens declared in different precedence declarations associate,
	4375	the one declared later has the higher precedence and is grouped first.
	4376	@end itemize
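
To see why the choice of associativity matters, here is a small C sketch of the two groupings for a subtraction-like operator. The function names are invented for this example. They only spell out the nesting that @code{%left} and @code{%right} would select for @samp{@var{x} - @var{y} - @var{z}}.

```c
/* Grouping chosen by %left: x and y are combined first.  */
static int
group_left (int x, int y, int z)
{
  return (x - y) - z;
}

/* Grouping chosen by %right: y and z are combined first.  */
static int
group_right (int x, int y, int z)
{
  return x - (y - z);
}
```

For an operator such as subtraction the two groupings give different results, so the declaration changes the meaning of the input, not just the shape of the parse.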
	4377
	4378	For backward compatibility, there is a confusing difference between the
	4379	argument lists of @code{%token} and precedence declarations.
	4380	Only a @code{%token} can associate a literal string with a token type name.
	4381	A precedence declaration always interprets a literal string as a reference to a
	4382	separate token.
	4383	For example:
	4384
	4385	@example
	4386	%left OR "<=" // Does not declare an alias.
	4387	%left OR 134 "<=" 135 // Declares 134 for OR and 135 for "<=".
	4388	@end example
	4389
	4390	@node Union Decl
	4391	@subsection The Collection of Value Types
	4392	@cindex declaring value types
	4393	@cindex value types, declaring
	4394	@findex %union
	4395
	4396	The @code{%union} declaration specifies the entire collection of
	4397	possible data types for semantic values. The keyword @code{%union} is
	4398	followed by braced code containing the same thing that goes inside a
	4399	@code{union} in C@.
	4400
	4401	For example:
	4402
	4403	@example
	4404	@group
	4405	%union @{
	4406	double val;
	4407	symrec *tptr;
	4408	@}
	4409	@end group
	4410	@end example
	4411
	4412	@noindent
	4413	This says that the two alternative types are @code{double} and @code{symrec
	4414	*}. They are given names @code{val} and @code{tptr}; these names are used
	4415	in the @code{%token} and @code{%type} declarations to pick one of the types
	4416	for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
	4417
	4418	As an extension to POSIX, a tag is allowed after the
	4419	@code{%union}. For example:
	4420
	4421	@example
	4422	@group
	4423	%union value @{
	4424	double val;
	4425	symrec *tptr;
	4426	@}
	4427	@end group
	4428	@end example
	4429
	4430	@noindent
	4431	specifies the union tag @code{value}, so the corresponding C type is
	4432	@code{union value}. If you do not specify a tag, it defaults to
	4433	@code{YYSTYPE}.
	4434
	4435	As another extension to POSIX, you may specify multiple
	4436	@code{%union} declarations; their contents are concatenated. However,
	4437	only the first @code{%union} declaration can specify a tag.
	4438
	4439	Note that, unlike making a @code{union} declaration in C, you need not write
	4440	a semicolon after the closing brace.
	4441
	4442	Instead of @code{%union}, you can define and use your own union type
	4443	@code{YYSTYPE} if your grammar contains at least one
	4444	@samp{<@var{type}>} tag. For example, you can put the following into
	4445	a header file @file{parser.h}:
	4446
	4447	@example
	4448	@group
	4449	union YYSTYPE @{
	4450	double val;
	4451	symrec *tptr;
	4452	@};
	4453	typedef union YYSTYPE YYSTYPE;
	4454	@end group
	4455	@end example
	4456
	4457	@noindent
	4458	and then your grammar can use the following
	4459	instead of @code{%union}:
	4460
	4461	@example
	4462	@group
	4463	%@{
	4464	#include "parser.h"
	4465	%@}
	4466	%type <val> expr
	4467	%token <tptr> ID
	4468	@end group
	4469	@end example
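
As a hedged illustration of what the @samp{<@var{type}>} tags select, the C sketch below reuses the union from @file{parser.h}. The layout of @code{symrec} and the helper @code{value_of_id} are assumptions for this example only. The helper mimics roughly what an action such as @samp{$$ = $1->value;} amounts to when @code{$1} has tag @code{<tptr>} and @code{$$} has tag @code{<val>}.

```c
#include <assert.h>

/* Hypothetical symbol-table record; this section only names the type.  */
typedef struct symrec
{
  char const *name;
  double value;
} symrec;

/* The hand-written value type, as in parser.h.  */
union YYSTYPE
{
  double val;
  symrec *tptr;
};
typedef union YYSTYPE YYSTYPE;

/* Read the <tptr> member of the operand and store into the <val>
   member of the result.  */
static YYSTYPE
value_of_id (YYSTYPE id)
{
  YYSTYPE result;
  assert (id.tptr != 0);
  result.val = id.tptr->value;
  return result;
}
```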
	4470
	4471	@node Type Decl
	4472	@subsection Nonterminal Symbols
	4473	@cindex declaring value types, nonterminals
	4474	@cindex value types, nonterminals, declaring
	4475	@findex %type
	4476
	4477	@noindent
	4478	When you use @code{%union} to specify multiple value types, you must
	4479	declare the value type of each nonterminal symbol for which values are
	4480	used. This is done with a @code{%type} declaration, like this:
	4481
	4482	@example
	4483	%type <@var{type}> @var{nonterminal}@dots{}
	4484	@end example
	4485
	4486	@noindent
	4487	Here @var{nonterminal} is the name of a nonterminal symbol, and
	4488	@var{type} is the name given in the @code{%union} to the alternative
	4489	that you want (@pxref{Union Decl, ,The Collection of Value Types}). You
	4490	can give any number of nonterminal symbols in the same @code{%type}
	4491	declaration, if they have the same value type. Use spaces to separate
	4492	the symbol names.
	4493
	4494	You can also declare the value type of a terminal symbol. To do this,
	4495	use the same @code{<@var{type}>} construction in a declaration for the
	4496	terminal symbol. All kinds of token declarations allow
	4497	@code{<@var{type}>}.
	4498
	4499	@node Initial Action Decl
	4500	@subsection Performing Actions before Parsing
	4501	@findex %initial-action
	4502
	4503	Sometimes your parser needs to perform some initializations before
	4504	parsing. The @code{%initial-action} directive allows for such arbitrary
	4505	code.
	4506
	4507	@deffn {Directive} %initial-action @{ @var{code} @}
	4508	@findex %initial-action
	4509	Declare that the braced @var{code} must be invoked before parsing each time
	4510	@code{yyparse} is called. The @var{code} may use @code{$$} and
	4511	@code{@@$} --- initial value and location of the lookahead --- and the
	4512	@code{%parse-param}.
	4513	@end deffn
	4514
	4515	For instance, if your locations use a file name, you may use
	4516
	4517	@example
	4518	%parse-param @{ char const *file_name @};
	4519	%initial-action
	4520	@{
	4521	@@$.initialize (file_name);
	4522	@};
	4523	@end example
	4524
	4525
	4526	@node Destructor Decl
	4527	@subsection Freeing Discarded Symbols
	4528	@cindex freeing discarded symbols
	4529	@findex %destructor
	4530	@findex <*>
	4531	@findex <>
	4532	During error recovery (@pxref{Error Recovery}), symbols already pushed
	4533	on the stack and tokens coming from the rest of the file are discarded
	4534	until the parser recovers. If the parser runs out of memory,
	4535	or if it returns via @code{YYABORT} or @code{YYACCEPT}, all the
	4536	symbols on the stack must be discarded. Even if the parser succeeds, it
	4537	must discard the start symbol.
	4538
	4539	When discarded symbols convey heap-based information, this memory is
	4540	lost. While this behavior can be tolerable for batch parsers, such as
	4541	in traditional compilers, it is unacceptable for programs like shells or
	4542	protocol implementations that may parse and execute indefinitely.
	4543
	4544	The @code{%destructor} directive defines code that is called when a
	4545	symbol is automatically discarded.
	4546
	4547	@deffn {Directive} %destructor @{ @var{code} @} @var{symbols}
	4548	@findex %destructor
	4549	Invoke the braced @var{code} whenever the parser discards one of the
	4550	@var{symbols}.
	4551	Within @var{code}, @code{$$} designates the semantic value associated
	4552	with the discarded symbol, and @code{@@$} designates its location.
	4553	The additional parser parameters are also available (@pxref{Parser Function, ,
	4554	The Parser Function @code{yyparse}}).
	4555
	4556	When a symbol is listed among @var{symbols}, its @code{%destructor} is called a
	4557	per-symbol @code{%destructor}.
	4558	You may also define a per-type @code{%destructor} by listing a semantic type
	4559	tag among @var{symbols}.
	4560	In that case, the parser will invoke this @var{code} whenever it discards any
	4561	grammar symbol that has that semantic type tag unless that symbol has its own
	4562	per-symbol @code{%destructor}.
	4563
	4564	Finally, you can define two different kinds of default @code{%destructor}s.
	4565	(These default forms are experimental.
	4566	More user feedback will help to determine whether they should become permanent
	4567	features.)
	4568	You can place each of @code{<*>} and @code{<>} in the @var{symbols} list of
	4569	exactly one @code{%destructor} declaration in your grammar file.
	4570	The parser will invoke the @var{code} associated with one of these whenever it
	4571	discards any user-defined grammar symbol that has no per-symbol and no per-type
	4572	@code{%destructor}.
	4573	The parser uses the @var{code} for @code{<*>} in the case of such a grammar
	4574	symbol for which you have formally declared a semantic type tag (@code{%type}
	4575	counts as such a declaration, but @code{$<tag>$} does not).
	4576	The parser uses the @var{code} for @code{<>} in the case of such a grammar
	4577	symbol that has no declared semantic type tag.
	4578	@end deffn
	4579
	4580	@noindent
	4581	For example:
	4582
	4583	@example
	4584	%union @{ char *string; @}
	4585	%token <string> STRING1
	4586	%token <string> STRING2
	4587	%type <string> string1
	4588	%type <string> string2
	4589	%union @{ char character; @}
	4590	%token <character> CHR
	4591	%type <character> chr
	4592	%token TAGLESS
	4593
	4594	%destructor @{ @} <character>
	4595	%destructor @{ free ($$); @} <*>
	4596	%destructor @{ free ($$); printf ("%d", @@$.first_line); @} STRING1 string1
	4597	%destructor @{ printf ("Discarding tagless symbol.\n"); @} <>
	4598	@end example
	4599
	4600	@noindent
	4601	guarantees that, when the parser discards any user-defined symbol that has a
	4602	semantic type tag other than @code{<character>}, it passes its semantic value
	4603	to @code{free} by default.
	4604	However, when the parser discards a @code{STRING1} or a @code{string1}, it also
	4605	prints its line number to @code{stdout}.
	4606	It performs only the second @code{%destructor} in this case, so it invokes
	4607	@code{free} only once.
	4608	Finally, the parser merely prints a message whenever it discards any symbol,
	4609	such as @code{TAGLESS}, that has no semantic type tag.
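
The selection order described above can be modeled in a few lines of C. This is only a sketch of the documented rules for user-defined symbols, not Bison's implementation. Every name in it (@code{symbol_info}, @code{pick_destructor}, and the enumerators) is invented for the example.

```c
#include <stddef.h>

enum destructor_kind
{
  PER_SYMBOL,       /* symbol listed by name in a %destructor */
  PER_TYPE,         /* symbol's tag listed in a %destructor */
  DEFAULT_TAGGED,   /* the <*> destructor */
  DEFAULT_TAGLESS,  /* the <> destructor */
  NO_DESTRUCTOR
};

struct symbol_info
{
  int has_symbol_destructor;  /* nonzero if listed by name */
  char const *tag;            /* declared semantic type tag, or NULL */
  int tag_has_destructor;     /* nonzero if the tag itself is listed */
};

/* Pick the destructor for a discarded user-defined symbol.  have_star
   and have_empty say whether the grammar declares <*> and <>.  */
static enum destructor_kind
pick_destructor (struct symbol_info const *sym, int have_star, int have_empty)
{
  if (sym->has_symbol_destructor)
    return PER_SYMBOL;
  if (sym->tag != NULL && sym->tag_has_destructor)
    return PER_TYPE;
  if (sym->tag != NULL)
    return have_star ? DEFAULT_TAGGED : NO_DESTRUCTOR;
  return have_empty ? DEFAULT_TAGLESS : NO_DESTRUCTOR;
}
```

Applied to the grammar fragment above, this model picks the per-symbol code for @code{STRING1}, the per-type code for @code{CHR}, the @code{<*>} code for @code{STRING2}, and the @code{<>} code for @code{TAGLESS}.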
	4610
	4611	A Bison-generated parser invokes the default @code{%destructor}s only for
	4612	user-defined as opposed to Bison-defined symbols.
	4613	For example, the parser will not invoke either kind of default
	4614	@code{%destructor} for the special Bison-defined symbols @code{$accept},
	4615	@code{$undefined}, or @code{$end} (@pxref{Table of Symbols, ,Bison Symbols}),
	4616	none of which you can reference in your grammar.
	4617	It also will not invoke either for the @code{error} token (@pxref{Table of
	4618	Symbols, ,error}), which is always defined by Bison regardless of whether you
	4619	reference it in your grammar.
	4620	However, it may invoke one of them for the end token (token 0) if you
	4621	redefine it from @code{$end} to, for example, @code{END}:
	4622
	4623	@example
	4624	%token END 0
	4625	@end example
	4626
	4627	@cindex actions in mid-rule
	4628	@cindex mid-rule actions
	4629	Finally, Bison will never invoke a @code{%destructor} for an unreferenced
	4630	mid-rule semantic value (@pxref{Mid-Rule Actions,,Actions in Mid-Rule}).
	4631	That is, Bison does not consider a mid-rule to have a semantic value if you
	4632	do not reference @code{$$} in the mid-rule's action or @code{$@var{n}}
	4633	(where @var{n} is the right-hand side symbol position of the mid-rule) in
	4634	any later action in that rule. However, if you do reference either, the
	4635	Bison-generated parser will invoke the @code{<>} @code{%destructor} whenever
	4636	it discards the mid-rule symbol.
	4637
	4638	@ignore
	4639	@noindent
	4640	In the future, it may be possible to redefine the @code{error} token as a
	4641	nonterminal that captures the discarded symbols.
	4642	In that case, the parser will invoke the default destructor for it as well.
	4643	@end ignore
	4644
	4645	@sp 1
	4646
	4647	@cindex discarded symbols
	4648	@dfn{Discarded symbols} are the following:
	4649
	4650	@itemize
	4651	@item
	4652	stacked symbols popped during the first phase of error recovery,
	4653	@item
	4654	incoming terminals during the second phase of error recovery,
	4655	@item
	4656	the current lookahead and the entire stack (except the current
	4657	right-hand side symbols) when the parser returns immediately, and
	4658	@item
	4659	the start symbol, when the parser succeeds.
	4660	@end itemize
	4661
	4662	The parser can @dfn{return immediately} because of an explicit call to
	4663	@code{YYABORT} or @code{YYACCEPT}, or failed error recovery, or memory
	4664	exhaustion.
	4665
	4666	Right-hand side symbols of a rule that explicitly triggers a syntax
	4667	error via @code{YYERROR} are not discarded automatically. As a rule
	4668	of thumb, destructors are invoked only when user actions cannot manage
	4669	the memory.
	4670
	4671	@node Expect Decl
	4672	@subsection Suppressing Conflict Warnings
	4673	@cindex suppressing conflict warnings
	4674	@cindex preventing warnings about conflicts
	4675	@cindex warnings, preventing
	4676	@cindex conflicts, suppressing warnings of
	4677	@findex %expect
	4678	@findex %expect-rr
	4679
	4680	Bison normally warns if there are any conflicts in the grammar
	4681	(@pxref{Shift/Reduce, ,Shift/Reduce Conflicts}), but most real grammars
	4682	have harmless shift/reduce conflicts which are resolved in a predictable
	4683	way and would be difficult to eliminate. It is desirable to suppress
	4684	the warning about these conflicts unless the number of conflicts
	4685	changes. You can do this with the @code{%expect} declaration.
	4686
	4687	The declaration looks like this:
	4688
	4689	@example
	4690	%expect @var{n}
	4691	@end example
	4692
	4693	Here @var{n} is a decimal integer. The declaration says there should
	4694	be @var{n} shift/reduce conflicts and no reduce/reduce conflicts.
	4695	Bison reports an error if the number of shift/reduce conflicts differs
	4696	from @var{n}, or if there are any reduce/reduce conflicts.
	4697
	4698	For deterministic parsers, reduce/reduce conflicts are more
	4699	serious, and should be eliminated entirely. Bison will always report
	4700	reduce/reduce conflicts for these parsers. With GLR
	4701	parsers, however, both kinds of conflicts are routine; otherwise,
	4702	there would be no need to use GLR parsing. Therefore, it is
	4703	also possible to specify an expected number of reduce/reduce conflicts
	4704	in GLR parsers, using the declaration:
	4705
	4706	@example
	4707	%expect-rr @var{n}
	4708	@end example
	4709
	4710	In general, using @code{%expect} involves these steps:
	4711
	4712	@itemize @bullet
	4713	@item
	4714	Compile your grammar without @code{%expect}. Use the @samp{-v} option
	4715	to get a verbose list of where the conflicts occur. Bison will also
	4716	print the number of conflicts.
	4717
	4718	@item
	4719	Check each of the conflicts to make sure that Bison's default
	4720	resolution is what you really want. If not, rewrite the grammar and
	4721	go back to the beginning.
	4722
	4723	@item
	4724	Add an @code{%expect} declaration, copying the number @var{n} from the
	4725	number which Bison printed. With GLR parsers, add an
	4726	@code{%expect-rr} declaration as well.
	4727	@end itemize
	4728
	4729	Now Bison will report an error if you introduce an unexpected conflict,
	4730	but will keep silent otherwise.
	4731
	4732	@node Start Decl
	4733	@subsection The Start-Symbol
	4734	@cindex declaring the start symbol
	4735	@cindex start symbol, declaring
	4736	@cindex default start symbol
	4737	@findex %start
	4738
	4739	Bison assumes by default that the start symbol for the grammar is the first
	4740	nonterminal specified in the grammar specification section. The programmer
	4741	may override this default with the @code{%start} declaration as follows:
	4742
	4743	@example
	4744	%start @var{symbol}
	4745	@end example
	4746
	4747	@node Pure Decl
	4748	@subsection A Pure (Reentrant) Parser
	4749	@cindex reentrant parser
	4750	@cindex pure parser
	4751	@findex %define api.pure
	4752
	4753	A @dfn{reentrant} program is one which does not alter in the course of
	4754	execution; in other words, it consists entirely of @dfn{pure} (read-only)
	4755	code. Reentrancy is important whenever asynchronous execution is possible;
	4756	for example, a nonreentrant program may not be safe to call from a signal
	4757	handler. In systems with multiple threads of control, a nonreentrant
	4758	program must be called only within interlocks.
	4759
	4760	Normally, Bison generates a parser which is not reentrant. This is
	4761	suitable for most uses, and it permits compatibility with Yacc. (The
	4762	standard Yacc interfaces are inherently nonreentrant, because they use
	4763	statically allocated variables for communication with @code{yylex},
	4764	including @code{yylval} and @code{yylloc}.)
	4765
	4766	Alternatively, you can generate a pure, reentrant parser. The Bison
	4767	declaration @code{%define api.pure} says that you want the parser to be
	4768	reentrant. It looks like this:
	4769
	4770	@example
	4771	%define api.pure
	4772	@end example
	4773
	4774	The result is that the communication variables @code{yylval} and
	4775	@code{yylloc} become local variables in @code{yyparse}, and a different
	4776	calling convention is used for the lexical analyzer function
	4777	@code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure
	4778	Parsers}, for the details of this. The variable @code{yynerrs}
	4779	becomes local in @code{yyparse} in pull mode, but it becomes a member
	4780	of @code{yypstate} in push mode (@pxref{Error Reporting, ,The Error
	4781	Reporting Function @code{yyerror}}). The convention for calling
	4782	@code{yyparse} itself is unchanged.
	4783
	4784	Whether the parser is pure has nothing to do with the grammar rules.
	4785	You can generate either a pure parser or a nonreentrant parser from any
	4786	valid grammar.
	4787
	4788	@node Push Decl
	4789	@subsection A Push Parser
	4790	@cindex push parser
	4792	@findex %define api.push-pull
	4793
	4794	(The current push parsing interface is experimental and may evolve.
	4795	More user feedback will help to stabilize it.)
	4796
	4797	A pull parser is called once and it takes control until all its input
	4798	is completely parsed. A push parser, on the other hand, is called
	4799	each time a new token is made available.
	4800
	4801	A push parser is typically useful when the parser is part of a
	4802	main event loop in the client's application. This is typically
	4803	a requirement of a GUI, when the main event loop needs to be triggered
	4804	within a certain time period.
	4805
	4806	Normally, Bison generates a pull parser.
	4807	The following Bison declaration says that you want the parser to be a push
	4808	parser (@pxref{%define Summary,,api.push-pull}):
	4809
	4810	@example
	4811	%define api.push-pull push
	4812	@end example
	4813
	4814	In almost all cases, you want to ensure that your push parser is also
	4815	a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). The only
	4816	time you should create an impure push parser is to have backwards
	4817	compatibility with the impure Yacc pull mode interface. Unless you know
	4818	what you are doing, your declarations should look like this:
	4819
	4820	@example
	4821	%define api.pure
	4822	%define api.push-pull push
	4823	@end example
	4824
	4825	There is one notable functional difference between the pure push parser
	4826	and the impure push parser. It is acceptable for a pure push parser to have
	4827	many parser instances, of the same type of parser, in memory at the same time.
	4828	An impure push parser should only use one parser at a time.
	4829
	4830	When a push parser is selected, Bison will generate some new symbols in
	4831	the generated parser. @code{yypstate} is a structure that the generated
	4832	parser uses to store the parser's state. @code{yypstate_new} is the
	4833	function that will create a new parser instance. @code{yypstate_delete}
	4834	will free the resources associated with the corresponding parser instance.
	4835	Finally, @code{yypush_parse} is the function that should be called whenever a
	4836	token is available to feed to the parser. A trivial example
	4837	of using a pure push parser would look like this:
	4838
	4839	@example
	4840	int status;
	4841	yypstate *ps = yypstate_new ();
	4842	do @{
	4843	status = yypush_parse (ps, yylex (), NULL);
	4844	@} while (status == YYPUSH_MORE);
	4845	yypstate_delete (ps);
	4846	@end example
	4847
	4848	If the user decides to use an impure push parser, a few things about
	4849	the generated parser will change. The @code{yychar} variable becomes
	4850	a global variable instead of a variable in the @code{yypush_parse} function.
	4851	For this reason, the signature of the @code{yypush_parse} function is
	4852	changed to remove the token as a parameter. A nonreentrant push parser
	4853	example would thus look like this:
	4854
	4855	@example
	4856	extern int yychar;
	4857	int status;
	4858	yypstate *ps = yypstate_new ();
	4859	do @{
	4860	yychar = yylex ();
	4861	status = yypush_parse (ps);
	4862	@} while (status == YYPUSH_MORE);
	4863	yypstate_delete (ps);
	4864	@end example
	4865
	4866	That's it. Notice the next token is put into the global variable @code{yychar}
	4867	for use by the next invocation of the @code{yypush_parse} function.
	4868
	4869	Bison also supports both the push parser interface and the pull parser
	4870	interface in the same generated parser. In order to get this functionality,
	4871	you should replace the @code{%define api.push-pull push} declaration with the
	4872	@code{%define api.push-pull both} declaration. Doing this will create all of
	4873	the symbols mentioned earlier along with the two extra symbols, @code{yyparse}
	4874	and @code{yypull_parse}. @code{yyparse} can be used exactly as it normally
	4875	would be used. However, the user should note that it is implemented in the
	4876	generated parser by calling @code{yypull_parse}.
	4877	This makes the @code{yyparse} function that is generated with the
	4878	@code{%define api.push-pull both} declaration slower than the normal
	4879	@code{yyparse} function. If the user
	4880	calls the @code{yypull_parse} function it will parse the rest of the input
	4881	stream. It is possible to @code{yypush_parse} tokens to select a subgrammar
	4882	and then @code{yypull_parse} the rest of the input stream. If you would like
	4883	to switch back and forth between parsing styles, you would have to
	4884	write your own @code{yypull_parse} function that knows when to quit looking
	4885	for input. An example of using the @code{yypull_parse} function would look
	4886	like this:
	4887
	4888	@example
	4889	yypstate *ps = yypstate_new ();
	4890	yypull_parse (ps); /* Will call the lexer */
	4891	yypstate_delete (ps);
	4892	@end example
	4893
	4894	Adding the @code{%define api.pure} declaration does exactly the same thing to
	4895	the generated parser with @code{%define api.push-pull both} as it did for
	4896	@code{%define api.push-pull push}.
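
As a minimal sketch, a grammar file that requests both interfaces from a
pure parser might begin with the two declarations discussed above. The
rest of the grammar is elided here, and nothing beyond these two
@code{%define} lines is implied:

@example
%define api.pure
%define api.push-pull both
@end example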
	4897
	4898	@node Decl Summary
	4899	@subsection Bison Declaration Summary
	4900	@cindex Bison declaration summary
	4901	@cindex declaration summary
	4902	@cindex summary, Bison declaration
	4903
	4904	Here is a summary of the declarations used to define a grammar:
	4905
	4906	@deffn {Directive} %union
	4907	Declare the collection of data types that semantic values may have
	4908	(@pxref{Union Decl, ,The Collection of Value Types}).
	4909	@end deffn
	4910
	4911	@deffn {Directive} %token
	4912	Declare a terminal symbol (token type name) with no precedence
	4913	or associativity specified (@pxref{Token Decl, ,Token Type Names}).
	4914	@end deffn
	4915
	4916	@deffn {Directive} %right
	4917	Declare a terminal symbol (token type name) that is right-associative
	4918	(@pxref{Precedence Decl, ,Operator Precedence}).
	4919	@end deffn
	4920
	4921	@deffn {Directive} %left
	4922	Declare a terminal symbol (token type name) that is left-associative
	4923	(@pxref{Precedence Decl, ,Operator Precedence}).
	4924	@end deffn
	4925
	4926	@deffn {Directive} %nonassoc
	4927	Declare a terminal symbol (token type name) that is nonassociative
	4928	(@pxref{Precedence Decl, ,Operator Precedence}).
	4929	Using it in a way that would be associative is a syntax error.
	4930	@end deffn
	4931
	4932	@ifset defaultprec
	4933	@deffn {Directive} %default-prec
	4934	Assign a precedence to rules lacking an explicit @code{%prec} modifier
	4935	(@pxref{Contextual Precedence, ,Context-Dependent Precedence}).
	4936	@end deffn
	4937	@end ifset
	4938
	4939	@deffn {Directive} %type
	4940	Declare the type of semantic values for a nonterminal symbol
	4941	(@pxref{Type Decl, ,Nonterminal Symbols}).
	4942	@end deffn
	4943
	4944	@deffn {Directive} %start
	4945	Specify the grammar's start symbol (@pxref{Start Decl, ,The
	4946	Start-Symbol}).
	4947	@end deffn
	4948
	4949	@deffn {Directive} %expect
	4950	Declare the expected number of shift-reduce conflicts
	4951	(@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
	4952	@end deffn
	4953
	4954
	4955	@sp 1
	4956	@noindent
	4957	In order to change the behavior of @command{bison}, use the following
	4958	directives:
	4959
	4960	@deffn {Directive} %code @{@var{code}@}
	4961	@deffnx {Directive} %code @var{qualifier} @{@var{code}@}
	4962	@findex %code
	4963	Insert @var{code} verbatim into the output parser source at the
	4964	default location or at the location specified by @var{qualifier}.
	4965	@xref{%code Summary}.
	4966	@end deffn
	4967
	4968	@deffn {Directive} %debug
	4969	In the parser implementation file, define the macro @code{YYDEBUG} to
	4970	1 if it is not already defined, so that the debugging facilities are
	4971	compiled. @xref{Tracing, ,Tracing Your Parser}.
	4972	@end deffn
	4973
	4974	@deffn {Directive} %define @var{variable}
	4975	@deffnx {Directive} %define @var{variable} @var{value}
	4976	@deffnx {Directive} %define @var{variable} "@var{value}"
	4977	Define a variable to adjust Bison's behavior. @xref{%define Summary}.
	4978	@end deffn
	4979
	4980	@deffn {Directive} %defines
	4981	Write a parser header file containing macro definitions for the token
	4982	type names defined in the grammar as well as a few other declarations.
	4983	If the parser implementation file is named @file{@var{name}.c} then
	4984	the parser header file is named @file{@var{name}.h}.
	4985
	4986	For C parsers, the parser header file declares @code{YYSTYPE} unless
	4987	@code{YYSTYPE} is already defined as a macro or you have used a
	4988	@code{<@var{type}>} tag without using @code{%union}. Therefore, if
	4989	you are using a @code{%union} (@pxref{Multiple Types, ,More Than One
	4990	Value Type}) with components that require other definitions, or if you
	4991	have defined a @code{YYSTYPE} macro or type definition (@pxref{Value
	4992	Type, ,Data Types of Semantic Values}), you need to arrange for these
	4993	definitions to be propagated to all modules, e.g., by putting them in
	4994	a prerequisite header that is included both by your parser and by any
	4995	other module that needs @code{YYSTYPE}.
	4996
	4997	Unless your parser is pure, the parser header file declares
	4998	@code{yylval} as an external variable. @xref{Pure Decl, ,A Pure
	4999	(Reentrant) Parser}.
	5000
	5001	If you have also used locations, the parser header file declares
	5002	@code{YYLTYPE} and @code{yylloc} using a protocol similar to that of the
	5003	@code{YYSTYPE} macro and @code{yylval}. @xref{Tracking Locations}.
	5004
	5005	This parser header file is normally essential if you wish to put the
	5006	definition of @code{yylex} in a separate source file, because
	5007	@code{yylex} typically needs to be able to refer to the
	5008	above-mentioned declarations and to the token type codes. @xref{Token
	5009	Values, ,Semantic Values of Tokens}.
	5010
	5011	@findex %code requires
	5012	@findex %code provides
	5013	If you have declared @code{%code requires} or @code{%code provides}, the output
	5014	header also contains their code.
	5015	@xref{%code Summary}.
	5016	@end deffn
	5017
	5018	@deffn {Directive} %defines @var{defines-file}
	5019	Same as above, but save in the file @var{defines-file}.
	5020	@end deffn
	5021
	5022	@deffn {Directive} %destructor
	5023	Specify how the parser should reclaim the memory associated with
	5024	discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
	5025	@end deffn
	5026
	5027	@deffn {Directive} %file-prefix "@var{prefix}"
	5028	Specify a prefix to use for all Bison output file names. The names
	5029	are chosen as if the grammar file were named @file{@var{prefix}.y}.
	5030	@end deffn
	5031
	5032	@deffn {Directive} %language "@var{language}"
	5033	Specify the programming language for the generated parser. Currently
	5034	supported languages include C, C++, and Java.
	5035	@var{language} is case-insensitive.
	5036
	5037	This directive is experimental and its effect may be modified in future
	5038	releases.
	5039	@end deffn
	5040
	5041	@deffn {Directive} %locations
	5042	Generate the code processing the locations (@pxref{Action Features,
	5043	,Special Features for Use in Actions}). This mode is enabled as soon as
	5044	the grammar uses the special @samp{@@@var{n}} tokens, but if your
	5045	grammar does not use it, using @samp{%locations} allows for more
	5046	accurate syntax error messages.
	5047	@end deffn
	5048
	5049	@deffn {Directive} %name-prefix "@var{prefix}"
	5050	Rename the external symbols used in the parser so that they start with
	5051	@var{prefix} instead of @samp{yy}. The precise list of symbols renamed
	5052	in C parsers
	5053	is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs},
	5054	@code{yylval}, @code{yychar}, @code{yydebug}, and
	5055	(if locations are used) @code{yylloc}. If you use a push parser,
	5056	@code{yypush_parse}, @code{yypull_parse}, @code{yypstate},
	5057	@code{yypstate_new} and @code{yypstate_delete} will
	5058	also be renamed. For example, if you use @samp{%name-prefix "c_"}, the
	5059	names become @code{c_parse}, @code{c_lex}, and so on.
	5060	For C++ parsers, see the @code{%define namespace} documentation in this
	5061	section.
	5062	@xref{Multiple Parsers, ,Multiple Parsers in the Same Program}.
	5063	@end deffn
	5064
	5065	@ifset defaultprec
	5066	@deffn {Directive} %no-default-prec
	5067	Do not assign a precedence to rules lacking an explicit @code{%prec}
	5068	modifier (@pxref{Contextual Precedence, ,Context-Dependent
	5069	Precedence}).
	5070	@end deffn
	5071	@end ifset
	5072
	5073	@deffn {Directive} %no-lines
	5074	Don't generate any @code{#line} preprocessor commands in the parser
	5075	implementation file. Ordinarily Bison writes these commands in the
	5076	parser implementation file so that the C compiler and debuggers will
	5077	associate errors and object code with your source file (the grammar
	5078	file). This directive causes them to associate errors with the parser
	5079	implementation file, treating it as an independent source file in its
	5080	own right.
	5081	@end deffn
	5082
	5083	@deffn {Directive} %output "@var{file}"
	5084	Specify @var{file} for the parser implementation file.
	5085	@end deffn
	5086
	5087	@deffn {Directive} %pure-parser
	5088	Deprecated version of @code{%define api.pure} (@pxref{%define
	5089	Summary,,api.pure}), for which Bison is more careful to warn about
	5090	unreasonable usage.
	5091	@end deffn
	5092
	5093	@deffn {Directive} %require "@var{version}"
	5094	Require version @var{version} or higher of Bison. @xref{Require Decl, ,
	5095	Require a Version of Bison}.
	5096	@end deffn
	5097
	5098	@deffn {Directive} %skeleton "@var{file}"
	5099	Specify the skeleton to use.
	5100
	5101	@c You probably don't need this option unless you are developing Bison.
	5102	@c You should use @code{%language} if you want to specify the skeleton for a
	5103	@c different language, because it is clearer and because it will always choose the
	5104	@c correct skeleton for non-deterministic or push parsers.
	5105
	5106	If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton
	5107	file in the Bison installation directory.
	5108	If it does, @var{file} is an absolute file name or a file name relative to the
	5109	directory of the grammar file.
	5110	This is similar to how most shells resolve commands.
	5111	@end deffn
	5112
	5113	@deffn {Directive} %token-table
	5114	Generate an array of token names in the parser implementation file.
	5115	The name of the array is @code{yytname}; @code{yytname[@var{i}]} is
	5116	the name of the token whose internal Bison token code number is
	5117	@var{i}. The first three elements of @code{yytname} correspond to the
	5118	predefined tokens @code{"$end"}, @code{"error"}, and
	5119	@code{"$undefined"}; after these come the symbols defined in the
	5120	grammar file.
	5121
	5122	The name in the table includes all the characters needed to represent
	5123	the token in Bison. For single-character literals and literal
	5124	strings, this includes the surrounding quoting characters and any
	5125	escape sequences. For example, the Bison single-character literal
	5126	@code{'+'} corresponds to a three-character name, represented in C as
	5127	@code{"'+'"}; and the Bison two-character literal string @code{"\\/"}
	5128	corresponds to a five-character name, represented in C as
	5129	@code{"\"\\\\/\""}.
	5130
	5131	When you specify @code{%token-table}, Bison also generates macro
	5132	definitions for macros @code{YYNTOKENS}, @code{YYNNTS},
	5133	@code{YYNRULES}, and @code{YYNSTATES}:
	5134
	5135	@table @code
	5136	@item YYNTOKENS
	5137	The highest token number, plus one.
	5138	@item YYNNTS
	5139	The number of nonterminal symbols.
	5140	@item YYNRULES
	5141	The number of grammar rules.
	5142	@item YYNSTATES
	5143	The number of parser states (@pxref{Parser States}).
	5144	@end table
	5145	@end deffn
	5146
	5147	@deffn {Directive} %verbose
	5148	Write an extra output file containing verbose descriptions of the
	5149	parser states and what is done for each type of lookahead token in
	5150	that state. @xref{Understanding, , Understanding Your Parser}, for more
	5151	information.
	5152	@end deffn
	5153
	5154	@deffn {Directive} %yacc
	5155	Pretend the option @option{--yacc} was given, i.e., imitate Yacc,
	5156	including its naming conventions. @xref{Bison Options}, for more.
	5157	@end deffn
	5158
	5159
	5160	@node %define Summary
	5161	@subsection %define Summary
	5162
	5163	There are many features of Bison's behavior that can be controlled by
	5164	assigning the feature a single value. For historical reasons, some
	5165	such features are assigned values by dedicated directives, such as
	5166	@code{%start}, which assigns the start symbol. However, newer such
	5167	features are associated with variables, which are assigned by the
	5168	@code{%define} directive:
	5169
	5170	@deffn {Directive} %define @var{variable}
	5171	@deffnx {Directive} %define @var{variable} @var{value}
	5172	@deffnx {Directive} %define @var{variable} "@var{value}"
	5173	Define @var{variable} to @var{value}.
	5174
	5175	@var{value} must be placed in quotation marks if it contains any
	5176	character other than a letter, underscore, period, or non-initial dash
	5177	or digit. Omitting @code{"@var{value}"} entirely is always equivalent
	5178	to specifying @code{""}.
	5179
	5180	It is an error if a @var{variable} is defined by @code{%define}
	5181	multiple times, but see @ref{Bison Options,,-D
	5182	@var{name}[=@var{value}]}.
	5183	@end deffn
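
The quoting rule can be illustrated with a few independent declarations.
This is only a sketch: the variables are ones summarized later in this
section, the three lines are not meant to appear together in one grammar
file (they apply to different target languages), and the comments are
explanatory additions:

@example
%define api.pure               /* value omitted: same as "" */
%define lr.type ielr           /* letters only: no quotes needed */
%define namespace "foo::bar"   /* contains ':', so quotes are required */
@end example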
	5184
	5185	The rest of this section summarizes variables and values that
	5186	@code{%define} accepts.
	5187
	5188	Some @var{variable}s take Boolean values. In this case, Bison will
	5189	complain if the variable definition does not meet one of the following
	5190	four conditions:
	5191
	5192	@enumerate
	5193	@item @code{@var{value}} is @code{true}.
	5194
	5195	@item @code{@var{value}} is omitted (or @code{""} is specified).
	5196	This is equivalent to @code{true}.
	5197
	5198	@item @code{@var{value}} is @code{false}.
	5199
	5200	@item @var{variable} is never defined.
	5201	In this case, Bison selects a default value.
	5202	@end enumerate
	5203
	5204	What @var{variable}s are accepted, as well as their meanings and default
	5205	values, depend on the selected target language and/or the parser
	5206	skeleton (@pxref{Decl Summary,,%language}, @pxref{Decl
	5207	Summary,,%skeleton}).
	5208	Unaccepted @var{variable}s produce an error.
	5209	Some of the accepted @var{variable}s are:
	5210
	5211	@itemize @bullet
	5212	@c ================================================== api.pure
	5213	@item api.pure
	5214	@findex %define api.pure
	5215
	5216	@itemize @bullet
	5217	@item Language(s): C
	5218
	5219	@item Purpose: Request a pure (reentrant) parser program.
	5220	@xref{Pure Decl, ,A Pure (Reentrant) Parser}.
	5221
	5222	@item Accepted Values: Boolean
	5223
	5224	@item Default Value: @code{false}
	5225	@end itemize
	5226
	5227	@item api.push-pull
	5228	@findex %define api.push-pull
	5229
	5230	@itemize @bullet
	5231	@item Language(s): C (deterministic parsers only)
	5232
	5233	@item Purpose: Request a pull parser, a push parser, or both.
	5234	@xref{Push Decl, ,A Push Parser}.
	5235	(The current push parsing interface is experimental and may evolve.
	5236	More user feedback will help to stabilize it.)
	5237
	5238	@item Accepted Values: @code{pull}, @code{push}, @code{both}
	5239
	5240	@item Default Value: @code{pull}
	5241	@end itemize
	5242
	5243	@c ================================================== lr.default-reductions
	5244
	5245	@item lr.default-reductions
	5246	@findex %define lr.default-reductions
	5247
	5248	@itemize @bullet
	5249	@item Language(s): all
	5250
	5251	@item Purpose: Specify the kind of states that are permitted to
	5252	contain default reductions. @xref{Default Reductions}. (The ability to
	5253	specify where default reductions should be used is experimental. More user
	5254	feedback will help to stabilize it.)
	5255
	5256	@item Accepted Values: @code{most}, @code{consistent}, @code{accepting}
	5257	@item Default Value:
	5258	@itemize
	5259	@item @code{accepting} if @code{lr.type} is @code{canonical-lr}.
	5260	@item @code{most} otherwise.
	5261	@end itemize
	5262	@end itemize
	5263
	5264	@c ============================================ lr.keep-unreachable-states
	5265
	5266	@item lr.keep-unreachable-states
	5267	@findex %define lr.keep-unreachable-states
	5268
	5269	@itemize @bullet
	5270	@item Language(s): all
	5271	@item Purpose: Request that Bison allow unreachable parser states to
	5272	remain in the parser tables. @xref{Unreachable States}.
	5273	@item Accepted Values: Boolean
	5274	@item Default Value: @code{false}
	5275	@end itemize
	5276
	5277	@c ================================================== lr.type
	5278
	5279	@item lr.type
	5280	@findex %define lr.type
	5281
	5282	@itemize @bullet
	5283	@item Language(s): all
	5284
	5285	@item Purpose: Specify the type of parser tables within the
	5286	LR(1) family. @xref{LR Table Construction}. (This feature is experimental.
	5287	More user feedback will help to stabilize it.)
	5288
	5289	@item Accepted Values: @code{lalr}, @code{ielr}, @code{canonical-lr}
	5290
	5291	@item Default Value: @code{lalr}
	5292	@end itemize
	5293
	5294	@item namespace
	5295	@findex %define namespace
	5296
	5297	@itemize
	5298	@item Language(s): C++
	5299
	5300	@item Purpose: Specify the namespace for the parser class.
	5301	For example, if you specify:
	5302
	5303	@smallexample
	5304	%define namespace "foo::bar"
	5305	@end smallexample
	5306
	5307	Bison uses @code{foo::bar} verbatim in references such as:
	5308
	5309	@smallexample
	5310	foo::bar::parser::semantic_type
	5311	@end smallexample
	5312
	5313	However, to open a namespace, Bison removes any leading @code{::} and then
	5314	splits on any remaining occurrences:
	5315
	5316	@smallexample
	5317	namespace foo @{ namespace bar @{
	5318	class position;
	5319	class location;
	5320	@} @}
	5321	@end smallexample
	5322
	5323	@item Accepted Values: Any absolute or relative C++ namespace reference without
	5324	a trailing @code{"::"}.
	5325	For example, @code{"foo"} or @code{"::foo::bar"}.
	5326
	5327	@item Default Value: The value specified by @code{%name-prefix}, which defaults
	5328	to @code{yy}.
	5329	This usage of @code{%name-prefix} is for backward compatibility and can be
	5330	confusing since @code{%name-prefix} also specifies the textual prefix for the
	5331	lexical analyzer function.
	5332	Thus, if you specify @code{%name-prefix}, it is best to also specify
	5333	@code{%define namespace} so that @code{%name-prefix} @emph{only} affects the
	5334	lexical analyzer function.
	5335	For example, if you specify:
	5336
	5337	@smallexample
	5338	%define namespace "foo"
	5339	%name-prefix "bar::"
	5340	@end smallexample
	5341
	5342	The parser namespace is @code{foo} and @code{yylex} is referenced as
	5343	@code{bar::lex}.
	5344	@end itemize
	5345
	5346	@c ================================================== parse.lac
	5347	@item parse.lac
	5348	@findex %define parse.lac
	5349
	5350	@itemize
	5351	@item Language(s): C (deterministic parsers only)
	5352
	5353	@item Purpose: Enable LAC (lookahead correction) to improve
	5354	syntax error handling. @xref{LAC}.
	5355	@item Accepted Values: @code{none}, @code{full}
	5356	@item Default Value: @code{none}
	5357	@end itemize
	5358	@end itemize
	5359
	5360
	5361	@node %code Summary
	5362	@subsection %code Summary
	5363	@findex %code
	5364	@cindex Prologue
	5365
	5366	The @code{%code} directive inserts code verbatim into the output
	5367	parser source at any of a predefined set of locations. It thus serves
	5368	as a flexible and user-friendly alternative to the traditional Yacc
	5369	prologue, @code{%@{@var{code}%@}}. This section summarizes the
	5370	functionality of @code{%code} for the various target languages
	5371	supported by Bison. For a detailed discussion of how to use
	5372	@code{%code} in place of @code{%@{@var{code}%@}} for C/C++ and why it
	5373	is advantageous to do so, see @ref{Prologue Alternatives}.
	5374
	5375	@deffn {Directive} %code @{@var{code}@}
	5376	This is the unqualified form of the @code{%code} directive. It
	5377	inserts @var{code} verbatim at a language-dependent default location
	5378	in the parser implementation.
	5379
	5380	For C/C++, the default location is the parser implementation file
	5381	after the usual contents of the parser header file. Thus, the
	5382	unqualified form replaces @code{%@{@var{code}%@}} for most purposes.
	5383
	5384	For Java, the default location is inside the parser class.
	5385	@end deffn
	5386
	5387	@deffn {Directive} %code @var{qualifier} @{@var{code}@}
	5388	This is the qualified form of the @code{%code} directive.
	5389	@var{qualifier} identifies the purpose of @var{code} and thus the
	5390	location(s) where Bison should insert it. That is, if you need to
	5391	specify location-sensitive @var{code} that does not belong at the
	5392	default location selected by the unqualified @code{%code} form, use
	5393	this form instead.
	5394	@end deffn
	5395
	5396	For any particular qualifier or for the unqualified form, if there are
	5397	multiple occurrences of the @code{%code} directive, Bison concatenates
	5398	the specified code in the order in which it appears in the grammar
	5399	file.
	5400
	5401	Not all qualifiers are accepted for all target languages. Unaccepted
	5402	qualifiers produce an error. Some of the accepted qualifiers are:
	5403
	5404	@itemize @bullet
	5405	@item requires
	5406	@findex %code requires
	5407
	5408	@itemize @bullet
	5409	@item Language(s): C, C++
	5410
	5411	@item Purpose: This is the best place to write dependency code required for
	5412	@code{YYSTYPE} and @code{YYLTYPE}.
	5413	In other words, it's the best place to define types referenced in @code{%union}
	5414	directives, and it's the best place to override Bison's default @code{YYSTYPE}
	5415	and @code{YYLTYPE} definitions.
	5416
	5417	@item Location(s): The parser header file and the parser implementation file
	5418	before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE}
	5419	definitions.
	5420	@end itemize
	5421
	5422	@item provides
	5423	@findex %code provides
	5424
	5425	@itemize @bullet
	5426	@item Language(s): C, C++
	5427
	5428	@item Purpose: This is the best place to write additional definitions and
	5429	declarations that should be provided to other modules.
	5430
	5431	@item Location(s): The parser header file and the parser implementation
	5432	file after the Bison-generated @code{YYSTYPE}, @code{YYLTYPE}, and
	5433	token definitions.
	5434	@end itemize
	5435
	5436	@item top
	5437	@findex %code top
	5438
	5439	@itemize @bullet
	5440	@item Language(s): C, C++
	5441
	5442	@item Purpose: The unqualified @code{%code} or @code{%code requires}
	5443	should usually be more appropriate than @code{%code top}. However,
	5444	occasionally it is necessary to insert code much nearer the top of the
	5445	parser implementation file. For example:
	5446
	5447	@example
	5448	%code top @{
	5449	#define _GNU_SOURCE
	5450	#include <stdio.h>
	5451	@}
	5452	@end example
	5453
	5454	@item Location(s): Near the top of the parser implementation file.
	5455	@end itemize
	5456
	5457	@item imports
	5458	@findex %code imports
	5459
	5460	@itemize @bullet
	5461	@item Language(s): Java
	5462
	5463	@item Purpose: This is the best place to write Java import directives.
	5464
	5465	@item Location(s): The parser Java file after any Java package directive and
	5466	before any class definitions.
	5467	@end itemize
	5468	@end itemize
	5469
	5470	Though we say the insertion locations are language-dependent, they are
	5471	technically skeleton-dependent. Writers of non-standard skeletons
	5472	however should choose their locations consistently with the behavior
	5473	of the standard Bison skeletons.
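
To make the qualifiers concrete, here is a hedged sketch of how they might
be combined in a C grammar file. The header @file{ptypes.h}, the type
@code{type1}, and the function @code{trace_token} are hypothetical names
introduced only for illustration:

@example
%code requires @{
  #include "ptypes.h"   /* hypothetical: defines type1 for %union */
@}

%union @{
  type1 field1;
@}

%code provides @{
  void trace_token (int token);   /* hypothetical helper for other modules */
@}

%code @{
  #include <stdio.h>    /* needed only by the parser implementation file */
@}
@end example

@noindent
The @code{%code requires} block lands before the Bison-generated
@code{YYSTYPE} definition, so @code{type1} is known when the union is
declared, while the @code{%code provides} block lands after it.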
	5474
	5475
	5476	@node Multiple Parsers
	5477	@section Multiple Parsers in the Same Program
	5478
	5479	Most programs that use Bison parse only one language and therefore contain
	5480	only one Bison parser. But what if you want to parse more than one
	5481	language with the same program? Then you need to avoid a name conflict
	5482	between different definitions of @code{yyparse}, @code{yylval}, and so on.
	5483
	5484	The easy way to do this is to use the option @samp{-p @var{prefix}}
	5485	(@pxref{Invocation, ,Invoking Bison}). This renames the interface
	5486	functions and variables of the Bison parser to start with @var{prefix}
	5487	instead of @samp{yy}. You can use this to give each parser distinct
	5488	names that do not conflict.
	5489
	5490	The precise list of symbols renamed is @code{yyparse}, @code{yylex},
	5491	@code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yylloc},
	5492	@code{yychar} and @code{yydebug}. If you use a push parser,
	5493	@code{yypush_parse}, @code{yypull_parse}, @code{yypstate},
	5494	@code{yypstate_new} and @code{yypstate_delete} will also be renamed.
	5495	For example, if you use @samp{-p c}, the names become @code{cparse},
	5496	@code{clex}, and so on.
	5497
	5498	@strong{All the other variables and macros associated with Bison are not
	5499	renamed.} These others are not global; there is no conflict if the same
	5500	name is used in different parsers. For example, @code{YYSTYPE} is not
	5501	renamed, but defining this in different ways in different parsers causes
	5502	no trouble (@pxref{Value Type, ,Data Types of Semantic Values}).
	5503
	5504	The @samp{-p} option works by adding macro definitions to the
	5505	beginning of the parser implementation file, defining @code{yyparse}
	5506	as @code{@var{prefix}parse}, and so on. This effectively substitutes
	5507	one name for the other in the entire parser implementation file.
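
For example, with @samp{-p c} the definitions placed at the beginning of
the parser implementation file might look roughly like the following.
This is a sketch based on the list of renamed symbols above; the exact
set and order of definitions depend on the Bison version and on the
options in use:

@example
#define yyparse cparse
#define yylex   clex
#define yyerror cerror
#define yylval  clval
#define yychar  cchar
#define yydebug cdebug
#define yynerrs cnerrs
@end example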
	5508
	5509	@node Interface
	5510	@chapter Parser C-Language Interface
	5511	@cindex C-language interface
	5512	@cindex interface
	5513
	5514	The Bison parser is actually a C function named @code{yyparse}. Here we
	5515	describe the interface conventions of @code{yyparse} and the other
	5516	functions that it needs to use.
	5517
	5518	Keep in mind that the parser uses many C identifiers starting with
	5519	@samp{yy} and @samp{YY} for internal purposes. If you use such an
	5520	identifier (aside from those in this manual) in an action or in the epilogue
	5521	in the grammar file, you are likely to run into trouble.
	5522
	5523	@menu
	5524	* Parser Function:: How to call @code{yyparse} and what it returns.
	5525	* Push Parser Function:: How to call @code{yypush_parse} and what it returns.
	5526	* Pull Parser Function:: How to call @code{yypull_parse} and what it returns.
	5527	* Parser Create Function:: How to call @code{yypstate_new} and what it returns.
	5528	* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns.
	5529	* Lexical:: You must supply a function @code{yylex}
	5530	which reads tokens.
	5531	* Error Reporting:: You must supply a function @code{yyerror}.
	5532	* Action Features:: Special features for use in actions.
	5533	* Internationalization:: How to let the parser speak in the user's
	5534	native language.
	5535	@end menu
	5536
	5537	@node Parser Function
	5538	@section The Parser Function @code{yyparse}
	5539	@findex yyparse
	5540
	5541	You call the function @code{yyparse} to cause parsing to occur. This
	5542	function reads tokens, executes actions, and ultimately returns when it
	5543	encounters end-of-input or an unrecoverable syntax error. You can also
	5544	write an action which directs @code{yyparse} to return immediately
	5545	without reading further.
	5546
	5547
	5548	@deftypefun int yyparse (void)
	5549	The value returned by @code{yyparse} is 0 if parsing was successful (return
	5550	is due to end-of-input).
	5551
	5552	The value is 1 if parsing failed because of invalid input, i.e., input
	5553	that contains a syntax error or that causes @code{YYABORT} to be
	5554	invoked.
	5555
	5556	The value is 2 if parsing failed due to memory exhaustion.
	5557	@end deftypefun
	5558
	5559	In an action, you can cause immediate return from @code{yyparse} by using
	5560	these macros:
	5561
	5562	@defmac YYACCEPT
	5563	@findex YYACCEPT
	5564	Return immediately with value 0 (to report success).
	5565	@end defmac
	5566
	5567	@defmac YYABORT
	5568	@findex YYABORT
	5569	Return immediately with value 1 (to report failure).
	5570	@end defmac
	5571
	5572	If you use a reentrant parser, you can optionally pass additional
	5573	parameter information to it in a reentrant way. To do so, use the
	5574	declaration @code{%parse-param}:
	5575
	5576	@deffn {Directive} %parse-param @{@var{argument-declaration}@}
	5577	@findex %parse-param
	5578	Declare that an argument declared by the braced-code
	5579	@var{argument-declaration} is an additional @code{yyparse} argument.
	5580	The @var{argument-declaration} is used when declaring
	5581	functions or prototypes. The last identifier in
	5582	@var{argument-declaration} must be the argument name.
	5583	@end deffn
	5584
	5585	Here's an example. Write this in the parser:
	5586
	5587	@example
	5588	%parse-param @{int *nastiness@}
	5589	%parse-param @{int *randomness@}
	5590	@end example
	5591
	5592	@noindent
	5593	Then call the parser like this:
	5594
	5595	@example
	5596	@{
	5597	int nastiness, randomness;
	5598	@dots{} /* @r{Store proper data in @code{nastiness} and @code{randomness}.} */
	5599	value = yyparse (&nastiness, &randomness);
	5600	@dots{}
	5601	@}
	5602	@end example
	5603
	5604	@noindent
	5605	In the grammar actions, use expressions like this to refer to the data:
	5606
	5607	@example
	5608	exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @}
	5609	@end example
	5610
	5611	@node Push Parser Function
	5612	@section The Push Parser Function @code{yypush_parse}
	5613	@findex yypush_parse
	5614
	5615	(The current push parsing interface is experimental and may evolve.
	5616	More user feedback will help to stabilize it.)
	5617
	5618	You call the function @code{yypush_parse} to parse a single token. This
	5619	function is available if either the @code{%define api.push-pull push} or
	5620	@code{%define api.push-pull both} declaration is used.
	5621	@xref{Push Decl, ,A Push Parser}.
	5622
	5623	@deftypefun int yypush_parse (yypstate *yyps)
	5624	The value returned by @code{yypush_parse} is the same as for @code{yyparse}, with the
	5625	following exception: @code{yypush_parse} returns @code{YYPUSH_MORE} if more input
	5626	is required to finish parsing the grammar.
	5627	@end deftypefun
	5628
	5629	@node Pull Parser Function
	5630	@section The Pull Parser Function @code{yypull_parse}
	5631	@findex yypull_parse
	5632
	5633	(The current push parsing interface is experimental and may evolve.
	5634	More user feedback will help to stabilize it.)
	5635
	5636	You call the function @code{yypull_parse} to parse the rest of the input
	5637	stream. This function is available if the @code{%define api.push-pull both}
	5638	declaration is used.
	5639	@xref{Push Decl, ,A Push Parser}.
	5640
	5641	@deftypefun int yypull_parse (yypstate *yyps)
	5642	The value returned by @code{yypull_parse} is the same as for @code{yyparse}.
	5643	@end deftypefun
	5644
	5645	@node Parser Create Function
	5646	@section The Parser Create Function @code{yypstate_new}
	5647	@findex yypstate_new
	5648
	5649	(The current push parsing interface is experimental and may evolve.
	5650	More user feedback will help to stabilize it.)
	5651
	5652	You call the function @code{yypstate_new} to create a new parser instance.
	5653	This function is available if either the @code{%define api.push-pull push} or
	5654	@code{%define api.push-pull both} declaration is used.
	5655	@xref{Push Decl, ,A Push Parser}.
	5656
	5657	@deftypefun yypstate *yypstate_new (void)
	5658	The function will return a valid parser instance if there was memory available
	5659	or 0 if no memory was available.
	5660	In impure mode, it will also return 0 if a parser instance is currently
	5661	allocated.
	5662	@end deftypefun
	5663
	5664	@node Parser Delete Function
	5665	@section The Parser Delete Function @code{yypstate_delete}
	5666	@findex yypstate_delete
	5667
	5668	(The current push parsing interface is experimental and may evolve.
	5669	More user feedback will help to stabilize it.)
	5670
	5671	You call the function @code{yypstate_delete} to delete a parser instance.
	5672	This function is available if either the @code{%define api.push-pull push} or
	5673	@code{%define api.push-pull both} declaration is used.
	5674	@xref{Push Decl, ,A Push Parser}.
	5675
	5676	@deftypefun void yypstate_delete (yypstate *yyps)
	5677	This function will reclaim the memory associated with a parser instance.
	5678	After this call, you should no longer attempt to use the parser instance.
	5679	@end deftypefun
	5680
	5681	@node Lexical
	5682	@section The Lexical Analyzer Function @code{yylex}
	5683	@findex yylex
	5684	@cindex lexical analyzer
	5685
	5686	The @dfn{lexical analyzer} function, @code{yylex}, recognizes tokens from
	5687	the input stream and returns them to the parser. Bison does not create
	5688	this function automatically; you must write it so that @code{yyparse} can
	5689	call it. The function is sometimes referred to as a lexical scanner.
	5690
	5691	In simple programs, @code{yylex} is often defined at the end of the
	5692	Bison grammar file. If @code{yylex} is defined in a separate source
	5693	file, you need to arrange for the token-type macro definitions to be
	5694	available there. To do this, use the @samp{-d} option when you run
	5695	Bison, so that it will write these macro definitions into the separate
	5696	parser header file, @file{@var{name}.tab.h}, which you can include in
	5697	the other source files that need it. @xref{Invocation, ,Invoking
	5698	Bison}.
	5699
	5700	@menu
	5701	* Calling Convention:: How @code{yyparse} calls @code{yylex}.
	5702	* Token Values:: How @code{yylex} must return the semantic value
	5703	of the token it has read.
	5704	* Token Locations:: How @code{yylex} must return the text location
	5705	(line number, etc.) of the token, if the
	5706	actions want that.
	5707	* Pure Calling:: How the calling convention differs in a pure parser
	5708	(@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
	5709	@end menu
	5710
	5711	@node Calling Convention
	5712	@subsection Calling Convention for @code{yylex}
	5713
	5714	The value that @code{yylex} returns must be the positive numeric code
	5715	for the type of token it has just found; a zero or negative value
	5716	signifies end-of-input.
	5717
	5718	When a token is referred to in the grammar rules by a name, that name
	5719	in the parser implementation file becomes a C macro whose definition
	5720	is the proper numeric code for that token type. So @code{yylex} can
	5721	use the name to indicate that type. @xref{Symbols}.
	5722
	5723	When a token is referred to in the grammar rules by a character literal,
	5724	the numeric code for that character is also the code for the token type.
	5725	So @code{yylex} can simply return that character code, possibly converted
	5726	to @code{unsigned char} to avoid sign-extension. The null character
	5727	must not be used this way, because its code is zero and that
	5728	signifies end-of-input.
	5729
	5730	Here is an example showing these things:
	5731
	5732	@example
	5733	int
	5734	yylex (void)
	5735	@{
	5736	  @dots{}
	5737	  if (c == EOF)    /* Detect end-of-input.  */
	5738	    return 0;
	5739	  @dots{}
	5740	  if (c == '+' || c == '-')
	5741	    return c;      /* Assume token type for `+' is '+'.  */
	5742	  @dots{}
	5743	  return INT;      /* Return the type of the token.  */
	5744	  @dots{}
	5745	@}
	5746	@end example
	5747
	5748	@noindent
	5749	This interface has been designed so that the output from the @code{lex}
	5750	utility can be used without change as the definition of @code{yylex}.
	5751
	5752	If the grammar uses literal string tokens, there are two ways that
	5753	@code{yylex} can determine the token type codes for them:
	5754
	5755	@itemize @bullet
	5756	@item
	5757	If the grammar defines symbolic token names as aliases for the
	5758	literal string tokens, @code{yylex} can use these symbolic names like
	5759	all others. In this case, the use of the literal string tokens in
	5760	the grammar file has no effect on @code{yylex}.
	5761
	5762	@item
	5763	@code{yylex} can find the multicharacter token in the @code{yytname}
	5764	table. The index of the token in the table is the token type's code.
	5765	The name of a multicharacter token is recorded in @code{yytname} with a
	5766	double-quote, the token's characters, and another double-quote. The
	5767	token's characters are escaped as necessary to be suitable as input
	5768	to Bison.
	5769
	5770	Here's code for looking up a multicharacter token in @code{yytname},
	5771	assuming that the characters of the token are stored in
	5772	@code{token_buffer}, and assuming that the token does not contain any
	5773	characters like @samp{"} that require escaping.
	5774
	5775	@example
	5776	for (i = 0; i < YYNTOKENS; i++)
	5777	  @{
	5778	    if (yytname[i] != 0
	5779	        && yytname[i][0] == '"'
	5780	        && ! strncmp (yytname[i] + 1, token_buffer,
	5781	                      strlen (token_buffer))
	5782	        && yytname[i][strlen (token_buffer) + 1] == '"'
	5783	        && yytname[i][strlen (token_buffer) + 2] == 0)
	5784	      break;
	5785	  @}
	5786	@end example
	5787
	5788	The @code{yytname} table is generated only if you use the
	5789	@code{%token-table} declaration. @xref{Decl Summary}.
	5790	@end itemize
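
As a rough, self-contained sketch of the lookup above, the loop can be
wrapped in a function.  The @code{yytname} table and the value of
@code{YYNTOKENS} below are hypothetical stand-ins for what Bison would
generate under @code{%token-table}, and @code{find_string_token} is a name
invented for this illustration.

```c
#include <string.h>

/* Hypothetical stand-ins for the table Bison emits with %token-table.
   A real table is generated from your grammar.  */
#define YYNTOKENS 5
static const char *const yytname[] =
{
  "$end", "error", "$undefined", "\"<=\"", "\">=\"", "$accept", "expr", 0
};

/* Return the index in yytname of the literal string token whose
   characters are TEXT, or -1 if no such token exists.  TEXT is assumed
   not to contain characters that require escaping.  */
int
find_string_token (const char *text)
{
  size_t len = strlen (text);
  int i;

  for (i = 0; i < YYNTOKENS; i++)
    if (yytname[i] != 0
        && yytname[i][0] == '"'
        && strncmp (yytname[i] + 1, text, len) == 0
        && yytname[i][len + 1] == '"'
        && yytname[i][len + 2] == 0)
      return i;
  return -1;
}
```

Returning @minus{}1 for an unknown token lets the caller report a lexical
error instead of silently using the loop's final index.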
	5791
	5792	@node Token Values
	5793	@subsection Semantic Values of Tokens
	5794
	5795	@vindex yylval
	5796	In an ordinary (nonreentrant) parser, the semantic value of the token must
	5797	be stored into the global variable @code{yylval}. When you are using
	5798	just one data type for semantic values, @code{yylval} has that type.
	5799	Thus, if the type is @code{int} (the default), you might write this in
	5800	@code{yylex}:
	5801
	5802	@example
	5803	@group
	5804	@dots{}
	5805	yylval = value; /* Put value onto Bison stack. */
	5806	return INT; /* Return the type of the token. */
	5807	@dots{}
	5808	@end group
	5809	@end example
	5810
	5811	When you are using multiple data types, @code{yylval}'s type is a union
	5812	made from the @code{%union} declaration (@pxref{Union Decl, ,The
	5813	Collection of Value Types}). So when you store a token's value, you
	5814	must use the proper member of the union. If the @code{%union}
	5815	declaration looks like this:
	5816
	5817	@example
	5818	@group
	5819	%union @{
	5820	int intval;
	5821	double val;
	5822	symrec *tptr;
	5823	@}
	5824	@end group
	5825	@end example
	5826
	5827	@noindent
	5828	then the code in @code{yylex} might look like this:
	5829
	5830	@example
	5831	@group
	5832	@dots{}
	5833	yylval.intval = value; /* Put value onto Bison stack. */
	5834	return INT; /* Return the type of the token. */
	5835	@dots{}
	5836	@end group
	5837	@end example
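
The conventions above can be put together in a small hand-written scanner.
This is only a sketch under several assumptions: semantic values are plain
@code{int}, the token code @code{INT} and the variable @code{yylval} are
defined locally as stand-ins for what the parser and its header would
normally provide, and the input comes from a string through the hypothetical
helper @code{set_input}.

```c
#include <ctype.h>

/* Hypothetical stand-ins: normally INT comes from the Bison-generated
   header and yylval is defined by the generated parser.  */
enum { INT = 258 };
int yylval;

/* Hypothetical input source: a NUL-terminated string.  */
static const char *input = "";

void
set_input (const char *text)
{
  input = text;
}

int
yylex (void)
{
  /* Skip blanks between tokens.  */
  while (*input == ' ' || *input == '\t')
    input++;

  /* End of input is signified by a zero return value.  */
  if (*input == '\0')
    return 0;

  /* A run of digits is an INT token; its value goes into yylval.  */
  if (isdigit ((unsigned char) *input))
    {
      int value = 0;
      while (isdigit ((unsigned char) *input))
        value = value * 10 + (*input++ - '0');
      yylval = value;
      return INT;
    }

  /* Any other character is returned as its own token type, converted
     to unsigned char to avoid sign-extension.  */
  return (unsigned char) *input++;
}
```

For the input @samp{12 + 3}, successive calls return @code{INT} (with
@code{yylval} set to 12), @code{'+'}, @code{INT} (with @code{yylval} set to
3), and finally 0.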
	5838
	5839	@node Token Locations
	5840	@subsection Textual Locations of Tokens
	5841
	5842	@vindex yylloc
	5843	If you are using the @samp{@@@var{n}}-feature (@pxref{Tracking Locations})
	5844	in actions to keep track of the textual locations of tokens and groupings,
	5845	then you must provide this information in @code{yylex}. The function
	5846	@code{yyparse} expects to find the textual location of a token just parsed
	5847	in the global variable @code{yylloc}. So @code{yylex} must store the proper
	5848	data in that variable.
	5849
	5850	By default, the value of @code{yylloc} is a structure and you need only
	5851	initialize the members that are going to be used by the actions. The
	5852	four members are called @code{first_line}, @code{first_column},
	5853	@code{last_line} and @code{last_column}. Note that the use of this
	5854	feature makes the parser noticeably slower.
	5855
	5856	@tindex YYLTYPE
	5857	The data type of @code{yylloc} has the name @code{YYLTYPE}.
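
One possible way for a scanner to maintain @code{yylloc} is sketched below.
The structure is a local stand-in with the four member names listed above,
@code{update_location} is a hypothetical helper, and the convention that
@code{last_column} names the column just past the token is an assumption of
this sketch, not a requirement.

```c
/* Hypothetical stand-in for the default location type; a real parser
   gets this definition from Bison.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

YYLTYPE yylloc = { 1, 1, 1, 1 };

/* Advance yylloc over the LENGTH characters of TEXT that make up the
   token just scanned.  The new token starts where the previous one
   ended.  */
void
update_location (const char *text, int length)
{
  int i;

  yylloc.first_line = yylloc.last_line;
  yylloc.first_column = yylloc.last_column;
  for (i = 0; i < length; i++)
    if (text[i] == '\n')
      {
        yylloc.last_line++;
        yylloc.last_column = 1;
      }
    else
      yylloc.last_column++;
}
```

A scanner would call such a helper once per token, just before returning
the token type to the parser.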
	5858
	5859	@node Pure Calling
	5860	@subsection Calling Conventions for Pure Parsers
	5861
	5862	When you use the Bison declaration @code{%define api.pure} to request a
	5863	pure, reentrant parser, the global communication variables @code{yylval}
	5864	and @code{yylloc} cannot be used. (@xref{Pure Decl, ,A Pure (Reentrant)
	5865	Parser}.) In such parsers the two global variables are replaced by
	5866	pointers passed as arguments to @code{yylex}. You must declare them as
	5867	shown here, and pass the information back by storing it through those
	5868	pointers.
	5869
	5870	@example
	5871	int
	5872	yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
	5873	@{
	5874	  @dots{}
	5875	  *lvalp = value;  /* Put value onto Bison stack.  */
	5876	  return INT;      /* Return the type of the token.  */
	5877	  @dots{}
	5878	@}
	5879	@end example
	5880
	5881	If the grammar file does not use the @samp{@@} constructs to refer to
	5882	textual locations, then the type @code{YYLTYPE} will not be defined. In
	5883	this case, omit the second argument; @code{yylex} will be called with
	5884	only one argument.
	5885
	5886
	5887	If you wish to pass the additional parameter data to @code{yylex}, use
	5888	@code{%lex-param} just like @code{%parse-param} (@pxref{Parser
	5889	Function}).
	5890
	5891	@deffn {Directive} lex-param @{@var{argument-declaration}@}
	5892	@findex %lex-param
	5893	Declare that the braced-code @var{argument-declaration} is an
	5894	additional @code{yylex} argument declaration.
	5895	@end deffn
	5896
	5897	For instance:
	5898
	5899	@example
	5900	%parse-param @{int *nastiness@}
	5901	%lex-param @{int *nastiness@}
	5902	%parse-param @{int *randomness@}
	5903	@end example
	5904
	5905	@noindent
	5906	results in the following signature:
	5907
	5908	@example
	5909	int yylex (int *nastiness);
	5910	int yyparse (int *nastiness, int *randomness);
	5911	@end example
	5912
	5913	If @code{%define api.pure} is added:
	5914
	5915	@example
	5916	int yylex (YYSTYPE *lvalp, int *nastiness);
	5917	int yyparse (int *nastiness, int *randomness);
	5918	@end example
	5919
	5920	@noindent
	5921	and finally, if both @code{%define api.pure} and @code{%locations} are used:
	5922
	5923	@example
	5924	int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);
	5925	int yyparse (int *nastiness, int *randomness);
	5926	@end example
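
To illustrate why passing data through pointers matters, here is a minimal
sketch of a pure @code{yylex} that keeps no global state.  It assumes no
location tracking, an @code{int} semantic type, and a hypothetical
@code{struct scanner} that would be declared with both @code{%lex-param} and
@code{%parse-param}; @code{INT} and @code{YYSTYPE} are local stand-ins for
the generated definitions.

```c
#include <ctype.h>

/* Hypothetical stand-ins for definitions the generated parser and its
   header would normally supply.  */
enum { INT = 258 };
typedef int YYSTYPE;

/* Hypothetical scanner state, as might be passed through
   %lex-param {struct scanner *sc}.  */
struct scanner
{
  const char *cursor;   /* Next unread character of the input.  */
};

int
yylex (YYSTYPE *lvalp, struct scanner *sc)
{
  while (*sc->cursor == ' ')
    sc->cursor++;

  if (*sc->cursor == '\0')
    return 0;                   /* End of input.  */

  if (isdigit ((unsigned char) *sc->cursor))
    {
      *lvalp = 0;
      while (isdigit ((unsigned char) *sc->cursor))
        *lvalp = *lvalp * 10 + (*sc->cursor++ - '0');
      return INT;               /* Value was stored through lvalp.  */
    }

  return (unsigned char) *sc->cursor++;
}
```

Because every piece of state is reached through an argument, two scanners
can be advanced independently, which is the point of a reentrant interface.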
	5927
	5928	@node Error Reporting
	5929	@section The Error Reporting Function @code{yyerror}
	5930	@cindex error reporting function
	5931	@findex yyerror
	5932	@cindex parse error
	5933	@cindex syntax error
	5934
	5935	The Bison parser detects a @dfn{syntax error} or @dfn{parse error}
	5936	whenever it reads a token which cannot satisfy any syntax rule. An
	5937	action in the grammar can also explicitly proclaim an error, using the
	5938	macro @code{YYERROR} (@pxref{Action Features, ,Special Features for Use
	5939	in Actions}).
	5940
	5941	The Bison parser expects to report the error by calling an error
	5942	reporting function named @code{yyerror}, which you must supply. It is
	5943	called by @code{yyparse} whenever a syntax error is found, and it
	5944	receives one argument. For a syntax error, the string is normally
	5945	@w{@code{"syntax error"}}.
	5946
	5947	@findex %error-verbose
	5948	If you invoke the directive @code{%error-verbose} in the Bison declarations
	5949	section (@pxref{Bison Declarations, ,The Bison Declarations Section}), then
	5950	Bison provides a more verbose and specific error message string instead of
	5951	just plain @w{@code{"syntax error"}}. However, that message sometimes
	5952	contains incorrect information if LAC is not enabled (@pxref{LAC}).
	5953
	5954	The parser can detect one other kind of error: memory exhaustion. This
	5955	can happen when the input contains constructions that are very deeply
	5956	nested. It isn't likely you will encounter this, since the Bison
	5957	parser normally extends its stack automatically up to a very large limit. But
	5958	if memory is exhausted, @code{yyparse} calls @code{yyerror} in the usual
	5959	fashion, except that the argument string is @w{@code{"memory exhausted"}}.
	5960
	5961	In some cases diagnostics like @w{@code{"syntax error"}} are
	5962	translated automatically from English to some other language before
	5963	they are passed to @code{yyerror}. @xref{Internationalization}.
	5964
	5965	The following definition suffices in simple programs:
	5966
	5967	@example
	5968	@group
	5969	void
	5970	yyerror (char const *s)
	5971	@{
	5972	@end group
	5973	@group
	5974	  fprintf (stderr, "%s\n", s);
	5975	@}
	5976	@end group
	5977	@end example
	5978
	5979	After @code{yyerror} returns to @code{yyparse}, the latter will attempt
	5980	error recovery if you have written suitable error recovery grammar rules
	5981	(@pxref{Error Recovery}). If recovery is impossible, @code{yyparse} will
	5982	immediately return 1.
	5983
	5984	In location-tracking pure parsers, @code{yyerror} should have
	5985	access to the current location.
	5986	This is indeed the case for the GLR
	5987	parsers, but not for the Yacc parser, for historical reasons. I.e., if
	5988	@samp{%locations %define api.pure} is passed then the prototypes for
	5989	@code{yyerror} are:
	5990
	5991	@example
	5992	void yyerror (char const *msg);                 /* Yacc parsers.  */
	5993	void yyerror (YYLTYPE *locp, char const *msg);  /* GLR parsers.   */
	5994	@end example
	5995
	5996	If @samp{%parse-param @{int *nastiness@}} is used, then:
	5997
	5998	@example
	5999	void yyerror (int *nastiness, char const *msg);  /* Yacc parsers.  */
	6000	void yyerror (int *nastiness, char const *msg);  /* GLR parsers.   */
	6001	@end example
	6002
	6003	Finally, GLR and Yacc parsers share the same @code{yyerror} calling
	6004	convention for absolutely pure parsers, i.e., when both @code{yylex}
	6005	(through @code{%define api.pure}) @emph{and} @code{yyparse} (through
	6006	@code{%parse-param}) use pure calling conventions.
	6007	I.e.:
	6008
	6009	@example
	6010	/* Location tracking. */
	6011	%locations
	6012	/* Pure yylex. */
	6013	%define api.pure
	6014	%lex-param @{int *nastiness@}
	6015	/* Pure yyparse. */
	6016	%parse-param @{int *nastiness@}
	6017	%parse-param @{int *randomness@}
	6018	@end example
	6019
	6020	@noindent
	6021	results in the following signatures for all the parser kinds:
	6022
	6023	@example
	6024	int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);
	6025	int yyparse (int *nastiness, int *randomness);
	6026	void yyerror (YYLTYPE *locp,
	6027	              int *nastiness, int *randomness,
	6028	              char const *msg);
	6029	@end example
	6030
	6031	@noindent
	6032	The prototypes are only indications of how the code produced by Bison
	6033	uses @code{yyerror}. Bison-generated code always ignores the returned
	6034	value, so @code{yyerror} can return any type, including @code{void}.
	6035	Also, @code{yyerror} can be a variadic function; that is why the
	6036	message is always passed last.
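
As an illustration of a variadic @code{yyerror} that also receives a
location, consider the following sketch.  The @code{YYLTYPE} definition is a
local stand-in, and the @code{last_error} buffer is a hypothetical addition
that keeps the most recent diagnostic so it can be inspected later.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the default location type.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

/* Hypothetical buffer holding the most recent diagnostic.  */
char last_error[256];

/* The message comes last so that printf-style arguments can follow.  */
void
yyerror (YYLTYPE *locp, char const *format, ...)
{
  va_list args;
  size_t used;

  /* Prefix the message with the start of the offending text.  */
  snprintf (last_error, sizeof last_error, "%d.%d: ",
            locp->first_line, locp->first_column);
  used = strlen (last_error);

  /* Append the formatted message after the prefix.  */
  va_start (args, format);
  vsnprintf (last_error + used, sizeof last_error - used, format, args);
  va_end (args);

  fprintf (stderr, "%s\n", last_error);
}
```

Because the message is treated as a format string here, this form suits
messages you compose yourself.  A message that might contain a literal
@samp{%} would need to be passed as an argument to a @code{"%s"} format
instead.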
	6037
	6038	Traditionally @code{yyerror} returns an @code{int} that is always
	6039	ignored, but this is purely for historical reasons, and @code{void} is
	6040	preferable since it more accurately describes the return type for
	6041	@code{yyerror}.
	6042
	6043	@vindex yynerrs
	6044	The variable @code{yynerrs} contains the number of syntax errors
	6045	reported so far. Normally this variable is global; but if you
	6046	request a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser})
	6047	then it is a local variable which only the actions can access.
	6048
	6049	@node Action Features
	6050	@section Special Features for Use in Actions
	6051	@cindex summary, action features
	6052	@cindex action features summary
	6053
	6054	Here is a table of Bison constructs, variables and macros that
	6055	are useful in actions.
	6056
	6057	@deffn {Variable} $$
	6058	Acts like a variable that contains the semantic value for the
	6059	grouping made by the current rule. @xref{Actions}.
	6060	@end deffn
	6061
	6062	@deffn {Variable} $@var{n}
	6063	Acts like a variable that contains the semantic value for the
	6064	@var{n}th component of the current rule. @xref{Actions}.
	6065	@end deffn
	6066
	6067	@deffn {Variable} $<@var{typealt}>$
	6068	Like @code{$$} but specifies alternative @var{typealt} in the union
	6069	specified by the @code{%union} declaration. @xref{Action Types, ,Data
	6070	Types of Values in Actions}.
	6071	@end deffn
	6072
	6073	@deffn {Variable} $<@var{typealt}>@var{n}
	6074	Like @code{$@var{n}} but specifies alternative @var{typealt} in the
	6075	union specified by the @code{%union} declaration.
	6076	@xref{Action Types, ,Data Types of Values in Actions}.
	6077	@end deffn
	6078
	6079	@deffn {Macro} YYABORT;
	6080	Return immediately from @code{yyparse}, indicating failure.
	6081	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	6082	@end deffn
	6083
	6084	@deffn {Macro} YYACCEPT;
	6085	Return immediately from @code{yyparse}, indicating success.
	6086	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	6087	@end deffn
	6088
	6089	@deffn {Macro} YYBACKUP (@var{token}, @var{value});
	6090	@findex YYBACKUP
	6091	Unshift a token. This macro is allowed only for rules that reduce
	6092	a single value, and only when there is no lookahead token.
	6093	It is also disallowed in GLR parsers.
	6094	It installs a lookahead token with token type @var{token} and
	6095	semantic value @var{value}; then it discards the value that was
	6096	going to be reduced by this rule.
	6097
	6098	If the macro is used when it is not valid, such as when there is
	6099	a lookahead token already, then it reports a syntax error with
	6100	a message @samp{cannot back up} and performs ordinary error
	6101	recovery.
	6102
	6103	In either case, the rest of the action is not executed.
	6104	@end deffn
	6105
	6106	@deffn {Macro} YYEMPTY
	6107	@vindex YYEMPTY
	6108	Value stored in @code{yychar} when there is no lookahead token.
	6109	@end deffn
	6110
	6111	@deffn {Macro} YYEOF
	6112	@vindex YYEOF
	6113	Value stored in @code{yychar} when the lookahead is the end of the input
	6114	stream.
	6115	@end deffn
	6116
	6117	@deffn {Macro} YYERROR;
	6118	@findex YYERROR
	6119	Cause an immediate syntax error. This statement initiates error
	6120	recovery just as if the parser itself had detected an error; however, it
	6121	does not call @code{yyerror}, and does not print any message. If you
	6122	want to print an error message, call @code{yyerror} explicitly before
	6123	the @samp{YYERROR;} statement. @xref{Error Recovery}.
	6124	@end deffn
	6125
	6126	@deffn {Macro} YYRECOVERING
	6127	@findex YYRECOVERING
	6128	The expression @code{YYRECOVERING ()} yields 1 when the parser
	6129	is recovering from a syntax error, and 0 otherwise.
	6130	@xref{Error Recovery}.
	6131	@end deffn
	6132
	6133	@deffn {Variable} yychar
	6134	Variable containing either the lookahead token, or @code{YYEOF} when the
	6135	lookahead is the end of the input stream, or @code{YYEMPTY} when no lookahead
	6136	has been performed so the next token is not yet known.
	6137	Do not modify @code{yychar} in a deferred semantic action (@pxref{GLR Semantic
	6138	Actions}).
	6139	@xref{Lookahead, ,Lookahead Tokens}.
	6140	@end deffn
	6141
	6142	@deffn {Macro} yyclearin;
	6143	Discard the current lookahead token. This is useful primarily in
	6144	error rules.
	6145	Do not invoke @code{yyclearin} in a deferred semantic action (@pxref{GLR
	6146	Semantic Actions}).
	6147	@xref{Error Recovery}.
	6148	@end deffn
	6149
	6150	@deffn {Macro} yyerrok;
	6151	Resume generating error messages immediately for subsequent syntax
	6152	errors. This is useful primarily in error rules.
	6153	@xref{Error Recovery}.
	6154	@end deffn
	6155
	6156	@deffn {Variable} yylloc
	6157	Variable containing the lookahead token location when @code{yychar} is not set
	6158	to @code{YYEMPTY} or @code{YYEOF}.
	6159	Do not modify @code{yylloc} in a deferred semantic action (@pxref{GLR Semantic
	6160	Actions}).
	6161	@xref{Actions and Locations, ,Actions and Locations}.
	6162	@end deffn
	6163
	6164	@deffn {Variable} yylval
	6165	Variable containing the lookahead token semantic value when @code{yychar} is
	6166	not set to @code{YYEMPTY} or @code{YYEOF}.
	6167	Do not modify @code{yylval} in a deferred semantic action (@pxref{GLR Semantic
	6168	Actions}).
	6169	@xref{Actions, ,Actions}.
	6170	@end deffn
	6171
	6172	@deffn {Value} @@$
	6173	@findex @@$
	6174	Acts like a structure variable containing information on the textual
	6175	location of the grouping made by the current rule. @xref{Tracking
	6176	Locations}.
	6177
	6178	@c Check if those paragraphs are still useful or not.
	6179
	6180	@c @example
	6181	@c struct @{
	6182	@c int first_line, last_line;
	6183	@c int first_column, last_column;
	6184	@c @};
	6185	@c @end example
	6186
	6187	@c Thus, to get the starting line number of the third component, you would
	6188	@c use @samp{@@3.first_line}.
	6189
	6190	@c In order for the members of this structure to contain valid information,
	6191	@c you must make @code{yylex} supply this information about each token.
	6192	@c If you need only certain members, then @code{yylex} need only fill in
	6193	@c those members.
	6194
	6195	@c The use of this feature makes the parser noticeably slower.
	6196	@end deffn
	6197
	6198	@deffn {Value} @@@var{n}
	6199	@findex @@@var{n}
	6200	Acts like a structure variable containing information on the textual
	6201	location of the @var{n}th component of the current rule. @xref{Tracking
	6202	Locations}.
	6203	@end deffn
	6204
	6205	@node Internationalization
	6206	@section Parser Internationalization
	6207	@cindex internationalization
	6208	@cindex i18n
	6209	@cindex NLS
	6210	@cindex gettext
	6211	@cindex bison-po
	6212
	6213	A Bison-generated parser can print diagnostics, including error and
	6214	tracing messages. By default, they appear in English. However, Bison
	6215	also supports outputting diagnostics in the user's native language. To
	6216	make this work, the user should set the usual environment variables.
	6217	@xref{Users, , The User's View, gettext, GNU @code{gettext} utilities}.
	6218	For example, the shell command @samp{export LC_ALL=fr_CA.UTF-8} might
	6219	set the user's locale to French Canadian using the UTF-8
	6220	encoding. The exact set of available locales depends on the user's
	6221	installation.
	6222
	6223	The maintainer of a package that uses a Bison-generated parser enables
	6224	the internationalization of the parser's output through the following
	6225	steps. Here we assume a package that uses GNU Autoconf and
	6226	GNU Automake.
	6227
	6228	@enumerate
	6229	@item
	6230	@cindex bison-i18n.m4
	6231	Into the directory containing the GNU Autoconf macros used
	6232	by the package---often called @file{m4}---copy the
	6233	@file{bison-i18n.m4} file installed by Bison under
	6234	@samp{share/aclocal/bison-i18n.m4} in Bison's installation directory.
	6235	For example:
	6236
	6237	@example
	6238	cp /usr/local/share/aclocal/bison-i18n.m4 m4/bison-i18n.m4
	6239	@end example
	6240
	6241	@item
	6242	@findex BISON_I18N
	6243	@vindex BISON_LOCALEDIR
	6244	@vindex YYENABLE_NLS
	6245	In the top-level @file{configure.ac}, after the @code{AM_GNU_GETTEXT}
	6246	invocation, add an invocation of @code{BISON_I18N}. This macro is
	6247	defined in the file @file{bison-i18n.m4} that you copied earlier. It
	6248	causes @samp{configure} to find the value of the
	6249	@code{BISON_LOCALEDIR} variable, and it defines the source-language
	6250	symbol @code{YYENABLE_NLS} to enable translations in the
	6251	Bison-generated parser.
	6252
	6253	@item
	6254	In the @code{main} function of your program, designate the directory
	6255	containing Bison's runtime message catalog, through a call to
	6256	@samp{bindtextdomain} with domain name @samp{bison-runtime}.
	6257	For example:
	6258
	6259	@example
	6260	bindtextdomain ("bison-runtime", BISON_LOCALEDIR);
	6261	@end example
	6262
	6263	Typically this appears after any other call @code{bindtextdomain
	6264	(PACKAGE, LOCALEDIR)} that your package already has. Here we rely on
	6265	@samp{BISON_LOCALEDIR} to be defined as a string through the
	6266	@file{Makefile}.
	6267
	6268	@item
	6269	In the @file{Makefile.am} that controls the compilation of the @code{main}
	6270	function, make @samp{BISON_LOCALEDIR} available as a C preprocessor macro,
	6271	either in @samp{DEFS} or in @samp{AM_CPPFLAGS}. For example:
	6272
	6273	@example
	6274	DEFS = @@DEFS@@ -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'
	6275	@end example
	6276
	6277	or:
	6278
	6279	@example
	6280	AM_CPPFLAGS = -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'
	6281	@end example
	6282
	6283	@item
	6284	Finally, invoke the command @command{autoreconf} to generate the build
	6285	infrastructure.
	6286	@end enumerate
	6287
	6288
	6289	@node Algorithm
	6290	@chapter The Bison Parser Algorithm
	6291	@cindex Bison parser algorithm
	6292	@cindex algorithm of parser
	6293	@cindex shifting
	6294	@cindex reduction
	6295	@cindex parser stack
	6296	@cindex stack, parser
	6297
	6298	As Bison reads tokens, it pushes them onto a stack along with their
	6299	semantic values. The stack is called the @dfn{parser stack}. Pushing a
	6300	token is traditionally called @dfn{shifting}.
	6301
	6302	For example, suppose the infix calculator has read @samp{1 + 5 *}, with a
	6303	@samp{3} to come. The stack will have four elements, one for each token
	6304	that was shifted.
	6305
	6306	But the stack does not always have an element for each token read. When
	6307	the last @var{n} tokens and groupings shifted match the components of a
	6308	grammar rule, they can be combined according to that rule. This is called
	6309	@dfn{reduction}. Those tokens and groupings are replaced on the stack by a
	6310	single grouping whose symbol is the result (left hand side) of that rule.
	6311	Running the rule's action is part of the process of reduction, because this
	6312	is what computes the semantic value of the resulting grouping.
	6313
	6314	For example, if the infix calculator's parser stack contains this:
	6315
	6316	@example
	6317	1 + 5 * 3
	6318	@end example
	6319
	6320	@noindent
	6321	and the next input token is a newline character, then the last three
	6322	elements can be reduced to 15 via the rule:
	6323
	6324	@example
	6325	expr: expr '*' expr;
	6326	@end example
	6327
	6328	@noindent
	6329	Then the stack contains just these three elements:
	6330
	6331	@example
	6332	1 + 15
	6333	@end example
	6334
	6335	@noindent
	6336	At this point, another reduction can be made, resulting in the single value
	6337	16. Then the newline token can be shifted.
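
The shifts and reductions just described can be mimicked with a toy stack.
This is not how a Bison-generated parser is implemented; it is only a
sketch, with hypothetical names, in which numbers are shifted directly as
@code{expr} elements (marked @code{'e'}) and the caller decides when to
reduce.

```c
/* Toy model of the parser stack for the rules
   expr: expr '+' expr | expr '*' expr.  All names are hypothetical.  */
enum { MAX_DEPTH = 16 };

struct element
{
  char symbol;  /* 'e' for an expr, otherwise an operator character.  */
  int value;    /* Semantic value; meaningful only for 'e'.  */
};

static struct element stack[MAX_DEPTH];
static int depth = 0;

/* Shift one element onto the stack.  Return 1 on success, 0 if the
   stack is full.  */
int
shift (char symbol, int value)
{
  if (depth == MAX_DEPTH)
    return 0;
  stack[depth].symbol = symbol;
  stack[depth].value = value;
  depth++;
  return 1;
}

/* Reduce the top three elements `expr OP expr' to a single expr whose
   value is computed by the rule's action.  Return 1 on success, 0 if
   the top of the stack does not match such a rule.  */
int
reduce_binary (void)
{
  struct element *top;
  int result;

  if (depth < 3)
    return 0;
  top = &stack[depth - 3];
  if (top[0].symbol != 'e' || top[2].symbol != 'e')
    return 0;
  switch (top[1].symbol)
    {
    case '+': result = top[0].value + top[2].value; break;
    case '*': result = top[0].value * top[2].value; break;
    default: return 0;
    }
  top[0].value = result;   /* The grouping replaces its components.  */
  depth -= 2;
  return 1;
}
```

Shifting @samp{1 + 5 * 3} and then reducing twice leaves first
@samp{1 + 15} and then the single value 16 on the stack, as in the text.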
	6338
	6339	The parser tries, by shifts and reductions, to reduce the entire input down
	6340	to a single grouping whose symbol is the grammar's start-symbol
	6341	(@pxref{Language and Grammar, ,Languages and Context-Free Grammars}).
	6342
	6343	This kind of parser is known in the literature as a bottom-up parser.
	6344
	6345	@menu
	6346	* Lookahead:: Parser looks one token ahead when deciding what to do.
	6347	* Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
	6348	* Precedence:: Operator precedence works by resolving conflicts.
	6349	* Contextual Precedence:: When an operator's precedence depends on context.
	6350	* Parser States:: The parser is a finite-state-machine with stack.
	6351	* Reduce/Reduce:: When two rules are applicable in the same situation.
	6352	* Mysterious Conflicts:: Conflicts that look unjustified.
	6353	* Tuning LR:: How to tune fundamental aspects of LR-based parsing.
	6354	* Generalized LR Parsing:: Parsing arbitrary context-free grammars.
	6355	* Memory Management:: What happens when memory is exhausted. How to avoid it.
	6356	@end menu
	6357
	6358	@node Lookahead
	6359	@section Lookahead Tokens
	6360	@cindex lookahead token
	6361
	6362	The Bison parser does @emph{not} always reduce immediately as soon as the
	6363	last @var{n} tokens and groupings match a rule. This is because such a
	6364	simple strategy is inadequate to handle most languages. Instead, when a
	6365	reduction is possible, the parser sometimes ``looks ahead'' at the next
	6366	token in order to decide what to do.
	6367
	6368	When a token is read, it is not immediately shifted; first it becomes the
	6369	@dfn{lookahead token}, which is not on the stack. Now the parser can
	6370	perform one or more reductions of tokens and groupings on the stack, while
	6371	the lookahead token remains off to the side. When no more reductions
	6372	should take place, the lookahead token is shifted onto the stack. This
	6373	does not mean that all possible reductions have been done; depending on the
	6374	token type of the lookahead token, some rules may choose to delay their
	6375	application.
	6376
	6377	Here is a simple case where lookahead is needed. These three rules define
	6378	expressions which contain binary addition operators and postfix unary
	6379	factorial operators (@samp{!}), and allow parentheses for grouping.
	6380
	6381	@example
	6382	@group
	6383	expr:
	6384	  term '+' expr
	6385	| term
	6386	;
	6387	@end group
	6388
	6389	@group
	6390	term:
	6391	  '(' expr ')'
	6392	| term '!'
	6393	| NUMBER
	6394	;
	6395	@end group
	6396	@end example
	6397
	6398	Suppose that the tokens @w{@samp{1 + 2}} have been read and shifted; what
	6399	should be done? If the following token is @samp{)}, then the first three
	6400	tokens must be reduced to form an @code{expr}. This is the only valid
	6401	course, because shifting the @samp{)} would produce a sequence of symbols
	6402	@w{@code{term ')'}}, and no rule allows this.
	6403
	6404	If the following token is @samp{!}, then it must be shifted immediately so
	6405	that @w{@samp{2 !}} can be reduced to make a @code{term}. If instead the
	6406	parser were to reduce before shifting, @w{@samp{1 + 2}} would become an
	6407	@code{expr}. It would then be impossible to shift the @samp{!} because
	6408	doing so would produce on the stack the sequence of symbols @code{expr
	6409	'!'}. No rule allows that sequence.
	6410
	6411	@vindex yychar
	6412	@vindex yylval
	6413	@vindex yylloc
	6414	The lookahead token is stored in the variable @code{yychar}.
	6415	Its semantic value and location, if any, are stored in the variables
	6416	@code{yylval} and @code{yylloc}.
	6417	@xref{Action Features, ,Special Features for Use in Actions}.
	6418
	6419	@node Shift/Reduce
	6420	@section Shift/Reduce Conflicts
	6421	@cindex conflicts
	6422	@cindex shift/reduce conflicts
	6423	@cindex dangling @code{else}
	6424	@cindex @code{else}, dangling
	6425
	6426	Suppose we are parsing a language which has if-then and if-then-else
	6427	statements, with a pair of rules like this:
	6428
	6429	@example
	6430	@group
	6431	if_stmt:
	6432	  IF expr THEN stmt
	6433	| IF expr THEN stmt ELSE stmt
	6434	;
	6435	@end group
	6436	@end example
	6437
	6438	@noindent
	6439	Here we assume that @code{IF}, @code{THEN} and @code{ELSE} are
	6440	terminal symbols for specific keyword tokens.
	6441
	6442	When the @code{ELSE} token is read and becomes the lookahead token, the
	6443	contents of the stack (assuming the input is valid) are just right for
	6444	reduction by the first rule. But it is also legitimate to shift the
	6445	@code{ELSE}, because that would lead to eventual reduction by the second
	6446	rule.
	6447
	6448	This situation, where either a shift or a reduction would be valid, is
	6449	called a @dfn{shift/reduce conflict}. Bison is designed to resolve
	6450	these conflicts by choosing to shift, unless otherwise directed by
	6451	operator precedence declarations. To see the reason for this, let's
	6452	contrast it with the other alternative.
	6453
	6454	Since the parser prefers to shift the @code{ELSE}, the result is to attach
	6455	the else-clause to the innermost if-statement, making these two inputs
	6456	equivalent:
	6457
	6458	@example
	6459	if x then if y then win (); else lose;
	6460
	6461	if x then do; if y then win (); else lose; end;
	6462	@end example
	6463
	6464	But if the parser chose to reduce when possible rather than shift, the
	6465	result would be to attach the else-clause to the outermost if-statement,
	6466	making these two inputs equivalent:
	6467
	6468	@example
	6469	if x then if y then win (); else lose;
	6470
	6471	if x then do; if y then win (); end; else lose;
	6472	@end example
	6473
	6474	The conflict exists because the grammar as written is ambiguous: either
	6475	parsing of the simple nested if-statement is legitimate. The established
	6476	convention is that these ambiguities are resolved by attaching the
	6477	else-clause to the innermost if-statement; this is what Bison accomplishes
	6478	by choosing to shift rather than reduce. (It would ideally be cleaner to
	6479	write an unambiguous grammar, but that is very hard to do in this case.)
	6480	This particular ambiguity was first encountered in the specifications of
	6481	Algol 60 and is called the ``dangling @code{else}'' ambiguity.
	6482
	6483	To avoid warnings from Bison about predictable, legitimate shift/reduce
	6484	conflicts, use the @code{%expect @var{n}} declaration.
	6485	There will be no warning as long as the number of shift/reduce conflicts
	6486	is exactly @var{n}, and Bison will report an error if there is a
	6487	different number.
	6488	@xref{Expect Decl, ,Suppressing Conflict Warnings}.
	6489
	6490	The definition of @code{if_stmt} above is solely to blame for the
	6491	conflict, but the conflict does not actually appear without additional
	6492	rules. Here is a complete Bison grammar file that actually manifests
	6493	the conflict:
	6494
	6495	@example
	6496	@group
	6497	%token IF THEN ELSE variable
	6498	%%
	6499	@end group
	6500	@group
	6501	stmt:
	6502	  expr
	6503	| if_stmt
	6504	;
	6505	@end group
	6506
	6507	@group
	6508	if_stmt:
	6509	  IF expr THEN stmt
	6510	| IF expr THEN stmt ELSE stmt
	6511	;
	6512	@end group
	6513
	6514	expr:
	6515	  variable
	6516	;
	6517	@end example
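		
		As a sketch of how to silence the warning for this file, the
		@code{%expect} declaration can be added to the grammar above.  The count
		of 1 is an assumption that holds only while the dangling @code{else} is
		the sole shift/reduce conflict in the file:
		
		@example
		@group
		%token IF THEN ELSE variable
		%expect 1
		%%
		@end group
		@group
		stmt:
		  expr
		| if_stmt
		;
		@end group
		
		@group
		if_stmt:
		  IF expr THEN stmt
		| IF expr THEN stmt ELSE stmt
		;
		@end group
		
		expr:
		  variable
		;
		@end example
		
		@noindent
		Alternatively, because precedence declarations can direct the choice,
		the conflict can be resolved explicitly in favor of shifting by giving
		@code{ELSE} a higher precedence than @code{THEN} (@pxref{Precedence,
		,Operator Precedence}).  These lines would replace the @code{%expect}
		line rather than accompany it, since a conflict resolved by precedence
		is no longer counted:
		
		@example
		%nonassoc THEN
		%nonassoc ELSE
		@end example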
	6518
	6519	@node Precedence
	6520	@section Operator Precedence
	6521	@cindex operator precedence
	6522	@cindex precedence of operators
	6523
	6524	Another situation where shift/reduce conflicts appear is in arithmetic
	6525	expressions. Here shifting is not always the preferred resolution; the
	6526	Bison declarations for operator precedence allow you to specify when to
	6527	shift and when to reduce.
	6528
	6529	@menu
	6530	* Why Precedence:: An example showing why precedence is needed.
	6531	* Using Precedence:: How to specify precedence in Bison grammars.
	6532	* Precedence Examples:: How these features are used in the previous example.
	6533	* How Precedence:: How they work.
	6534	@end menu
	6535
	6536	@node Why Precedence
	6537	@subsection When Precedence is Needed
	6538
	6539	Consider the following ambiguous grammar fragment (ambiguous because the
	6540	input @w{@samp{1 - 2 * 3}} can be parsed in two different ways):
	6541
	6542	@example
	6543	@group
	6544	expr:
	6545	  expr '-' expr
	6546	| expr '*' expr
	6547	| expr '<' expr
	6548	| '(' expr ')'
	6549	@dots{}
	6550	;
	6551	@end group
	6552	@end example
	6553
	6554	@noindent
	6555	Suppose the parser has seen the tokens @samp{1}, @samp{-} and @samp{2};
	6556	should it reduce them via the rule for the subtraction operator? It
	6557	depends on the next token. Of course, if the next token is @samp{)}, we
	6558	must reduce; shifting is invalid because no single rule can reduce the
	6559	token sequence @w{@samp{- 2 )}} or anything starting with that. But if
	6560	the next token is @samp{*} or @samp{<}, we have a choice: either
	6561	shifting or reduction would allow the parse to complete, but with
	6562	different results.
	6563
	6564	To decide which one Bison should do, we must consider the results. If
	6565	the next operator token @var{op} is shifted, then it must be reduced
	6566	first in order to permit another opportunity to reduce the difference.
	6567	The result is (in effect) @w{@samp{1 - (2 @var{op} 3)}}. On the other
	6568	hand, if the subtraction is reduced before shifting @var{op}, the result
	6569	is @w{@samp{(1 - 2) @var{op} 3}}. Clearly, then, the choice of shift or
	6570	reduce should depend on the relative precedence of the operators
	6571	@samp{-} and @var{op}: @samp{*} should be shifted first, but not
	6572	@samp{<}.
	6573
	6574	@cindex associativity
	6575	What about input such as @w{@samp{1 - 2 - 5}}; should this be
	6576	@w{@samp{(1 - 2) - 5}} or should it be @w{@samp{1 - (2 - 5)}}? For most
	6577	operators we prefer the former, which is called @dfn{left association}.
	6578	The latter alternative, @dfn{right association}, is desirable for
	6579	assignment operators. The choice of left or right association is a
	6580	matter of whether the parser chooses to shift or reduce when the stack
	6581	contains @w{@samp{1 - 2}} and the lookahead token is @samp{-}: shifting
	6582	yields right association, and reducing yields left association.
	6583
	6584	@node Using Precedence
	6585	@subsection Specifying Operator Precedence
	6586	@findex %left
	6587	@findex %right
	6588	@findex %nonassoc
	6589
	6590	Bison allows you to specify these choices with the operator precedence
	6591	declarations @code{%left} and @code{%right}. Each such declaration
	6592	contains a list of tokens, which are operators whose precedence and
	6593	associativity are being declared. The @code{%left} declaration makes all
	6594	those operators left-associative and the @code{%right} declaration makes
	6595	them right-associative. A third alternative is @code{%nonassoc}, which
	6596	declares that it is a syntax error to find the same operator twice ``in a
	6597	row''.
	6598
	6599	The relative precedence of different operators is controlled by the
	6600	order in which they are declared. The first @code{%left} or
	6601	@code{%right} declaration in the file declares the operators whose
	6602	precedence is lowest, the next such declaration declares the operators
	6603	whose precedence is a little higher, and so on.
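		
		As a brief sketch of the third alternative, suppose the comparison
		operator from the grammar fragment in the previous subsection is
		declared with @code{%nonassoc} rather than @code{%left}:
		
		@example
		%nonassoc '<'
		%left '-'
		%left '*'
		@end example
		
		@noindent
		With these declarations, @w{@samp{1 - 2 - 3}} still groups as
		@w{@samp{(1 - 2) - 3}}, but @w{@samp{1 < 2 < 3}} is a syntax error.
		Parenthesized input such as @w{@samp{(1 < 2) < 3}} remains valid,
		because the two comparison operators are then no longer ``in a row''.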
	6604
	6605	@node Precedence Examples
	6606	@subsection Precedence Examples
	6607
	6608	In our example, we would want the following declarations:
	6609
	6610	@example
	6611	%left '<'
	6612	%left '-'
	6613	%left '*'
	6614	@end example
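		
		For reference, here is a sketch of a complete grammar file that combines
		these declarations with the rules from the earlier fragment.  The token
		@code{NUM} is an assumption introduced here to stand in for the elided
		alternatives; it is not part of the original example:
		
		@example
		@group
		%token NUM
		%left '<'
		%left '-'
		%left '*'
		%%
		@end group
		@group
		expr:
		  expr '-' expr
		| expr '*' expr
		| expr '<' expr
		| '(' expr ')'
		| NUM
		;
		@end group
		@end example
		
		@noindent
		Every shift/reduce conflict in this grammar involves a rule and a
		lookahead token that both have a declared precedence, so the
		declarations resolve all of them and Bison should not warn about any.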
	6615
	6616	In a more complete example, which supports other operators as well, we
	6617	would declare them in groups of equal precedence. For example, @code{'+'} is
	6618	declared with @code{'-'}:
	6619
	6620	@example
	6621	%left '<' '>' '=' NE LE GE
	6622	%left '+' '-'
	6623	%left '*' '/'
	6624	@end example
	6625
	6626	@noindent
	6627	(Here @code{NE} and so on stand for the operators for ``not equal''
	6628	and so on. We assume that these tokens are more than one character long
	6629	and therefore are represented by names, not character literals.)
	6630
	6631	@node How Precedence
	6632	@subsection How Precedence Works
	6633
	6634	The first effect of the precedence declarations is to assign precedence
	6635	levels to the terminal symbols declared. The second effect is to assign
	6636	precedence levels to certain rules: each rule gets its precedence from
	6637	the last terminal symbol mentioned in the components. (You can also
	6638	specify explicitly the precedence of a rule. @xref{Contextual
	6639	Precedence, ,Context-Dependent Precedence}.)
	6640
	6641	Finally, the resolution of conflicts works by comparing the precedence
	6642	of the rule being considered with that of the lookahead token. If the
	6643	token's precedence is higher, the choice is to shift. If the rule's
	6644	precedence is higher, the choice is to reduce. If they have equal
	6645	precedence, the choice is made based on the associativity of that
	6646	precedence level. The verbose output file made by @samp{-v}
	6647	(@pxref{Invocation, ,Invoking Bison}) says how each conflict was
	6648	resolved.
	6649
	6650	Not all rules and not all tokens have precedence. If either the rule or
	6651	the lookahead token has no precedence, then the default is to shift.
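		
		As a worked illustration, assume the three @code{%left} declarations for
		@samp{<}, @samp{-} and @samp{*} given in the previous subsection, and
		suppose the stack holds @w{@samp{expr '-' expr}}.  The rule considered
		for reduction takes its precedence from @samp{-}, its last terminal
		symbol, and the choice for each possible lookahead token is:
		
		@example
		Lookahead   Comparison with the rule       Choice
		'*'         token's precedence is higher   shift
		'<'         rule's precedence is higher    reduce
		'-'         equal; level is %left          reduce
		@end example
		
		@noindent
		Had @samp{-} been declared with @code{%right}, the last line would be a
		shift instead; with @code{%nonassoc}, it would be a syntax error.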
	6652
	6653	@node Contextual Precedence
	6654	@section Context-Dependent Precedence
	6655	@cindex context-dependent precedence
	6656	@cindex unary operator precedence
	6657	@cindex precedence, context-dependent
	6658	@cindex precedence, unary operator
	6659	@findex %prec
	6660
	6661	Often the precedence of an operator depends on the context. This sounds
	6662	outlandish at first, but it is really very common. For example, a minus
	6663	sign typically has a very high precedence as a unary operator, and a
	6664	somewhat lower precedence (lower than multiplication) as a binary operator.
	6665
	6666	The Bison precedence declarations, @code{%left}, @code{%right} and
	6667	@code{%nonassoc}, can only be used once for a given token; so a token has
	6668	only one precedence declared in this way. For context-dependent
	6669	precedence, you need to use an additional mechanism: the @code{%prec}
	6670	modifier for rules.
	6671
	6672	The @code{%prec} modifier declares the precedence of a particular rule by
	6673	specifying a terminal symbol whose precedence should be used for that rule.
	6674	It's not necessary for that symbol to appear otherwise in the rule. The
	6675	modifier's syntax is:
	6676
	6677	@example
	6678	%prec @var{terminal-symbol}
	6679	@end example
	6680
	6681	@noindent
	6682	and it is written after the components of the rule. Its effect is to
	6683	assign the rule the precedence of @var{terminal-symbol}, overriding
	6684	the precedence that would be deduced for it in the ordinary way. The
	6685	altered rule precedence then affects how conflicts involving that rule
	6686	are resolved (@pxref{Precedence, ,Operator Precedence}).
	6687
	6688	Here is how @code{%prec} solves the problem of unary minus. First, declare
	6689	a precedence for a fictitious terminal symbol named @code{UMINUS}. There
	6690	are no tokens of this type, but the symbol serves to stand for its
	6691	precedence:
	6692
	6693	@example
	6694	@dots{}
	6695	%left '+' '-'
	6696	%left '*'
	6697	%left UMINUS
	6698	@end example
	6699
	6700	Now the precedence of @code{UMINUS} can be used in specific rules:
	6701
	6702	@example
	6703	@group
	6704	exp:
	6705	  @dots{}
	6706	| exp '-' exp
	6707	  @dots{}
	6708	| '-' exp %prec UMINUS
	6709	@end group
	6710	@end example
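		
		To see what the modifier changes, the following sketch compares how two
		inputs are grouped with and without @code{%prec UMINUS} on the unary
		rule, assuming the declarations above.  Without the modifier, the rule
		@w{@samp{'-' exp}} takes the precedence of its last terminal symbol,
		@samp{-}, which is lower than that of @samp{*}:
		
		@example
		Input       With %prec UMINUS    Without %prec
		- 2 * 3     (- 2) * 3            - (2 * 3)
		- 2 - 3     (- 2) - 3            (- 2) - 3
		@end example
		
		@noindent
		The second input is grouped the same way in both cases, because a rule
		and a lookahead token of equal, left-associative precedence also lead to
		a reduction.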
	6711
	6712	@ifset defaultprec
	6713	If you forget to append @code{%prec UMINUS} to the rule for unary
	6714	minus, Bison silently assumes that minus has its usual precedence.
	6715	This kind of problem can be tricky to debug, since one typically
	6716	discovers the mistake only by testing the code.
	6717
	6718	The @code{%no-default-prec;} declaration makes it easier to discover
	6719	this kind of problem systematically. It causes rules that lack a
	6720	@code{%prec} modifier to have no precedence, even if the last terminal
	6721	symbol mentioned in their components has a declared precedence.
	6722
	6723	If @code{%no-default-prec;} is in effect, you must specify @code{%prec}
	6724	for all rules that participate in precedence conflict resolution.
	6725	Then you will see any shift/reduce conflict until you tell Bison how
	6726	to resolve it, either by changing your grammar or by adding an
	6727	explicit precedence. This will probably add declarations to the
	6728	grammar, but it helps to protect against incorrect rule precedences.
	6729
	6730	The effect of @code{%no-default-prec;} can be reversed by giving
	6731	@code{%default-prec;}, which is the default.
	6732	@end ifset
	6733
	6734	@node Parser States
	6735	@section Parser States
	6736	@cindex finite-state machine
	6737	@cindex parser state
	6738	@cindex state (of parser)
	6739
	6740	The function @code{yyparse} is implemented using a finite-state machine.
	6741	The values pushed on the parser stack are not simply token type codes; they
	6742	represent the entire sequence of terminal and nonterminal symbols at or
	6743	near the top of the stack. The current state collects all the information
	6744	about previous input which is relevant to deciding what to do next.
	6745
	6746	Each time a lookahead token is read, the current parser state together
	6747	with the type of lookahead token are looked up in a table. This table
	6748	entry can say, ``Shift the lookahead token.'' In this case, it also
	6749	specifies the new parser state, which is pushed onto the top of the
	6750	parser stack. Or it can say, ``Reduce using rule number @var{n}.''
	6751	This means that a certain number of tokens or groupings are taken off
	6752	the top of the stack, and replaced by one grouping. In other words,
	6753	that number of states are popped from the stack, and one new state is
	6754	pushed.
	6755
	6756	There is one other alternative: the table can say that the lookahead token
	6757	is erroneous in the current state. This causes error processing to begin
	6758	(@pxref{Error Recovery}).
	6759
	6760	@node Reduce/Reduce
	6761	@section Reduce/Reduce Conflicts
	6762	@cindex reduce/reduce conflict
	6763	@cindex conflicts, reduce/reduce
	6764
	6765	A reduce/reduce conflict occurs if there are two or more rules that apply
	6766	to the same sequence of input. This usually indicates a serious error
	6767	in the grammar.
	6768
	6769	For example, here is an erroneous attempt to define a sequence
	6770	of zero or more @code{word} groupings.
	6771
	6772	@example
	6773	@group
	6774	sequence:
	6775	  /* empty */ @{ printf ("empty sequence\n"); @}
	6776	| maybeword
	6777	| sequence word @{ printf ("added word %s\n", $2); @}
	6778	;
	6779	@end group
	6780
	6781	@group
	6782	maybeword:
	6783	  /* empty */ @{ printf ("empty maybeword\n"); @}
	6784	| word @{ printf ("single word %s\n", $1); @}
	6785	;
	6786	@end group
	6787	@end example
	6788
	6789	@noindent
	6790	The error is an ambiguity: there is more than one way to parse a single
	6791	@code{word} into a @code{sequence}. It could be reduced to a
	6792	@code{maybeword} and then into a @code{sequence} via the second rule.
	6793	Alternatively, nothing-at-all could be reduced into a @code{sequence}
	6794	via the first rule, and this could be combined with the @code{word}
	6795	using the third rule for @code{sequence}.
	6796
	6797	There is also more than one way to reduce nothing-at-all into a
	6798	@code{sequence}. This can be done directly via the first rule,
	6799	or indirectly via @code{maybeword} and then the second rule.
	6800
	6801	You might think that this is a distinction without a difference, because it
	6802	does not change whether any particular input is valid or not. But it does
	6803	affect which actions are run. One parsing order runs the second rule's
	6804	action; the other runs the first rule's action and the third rule's action.
	6805	In this example, the output of the program changes.
	6806
	6807	Bison resolves a reduce/reduce conflict by choosing to use the rule that
	6808	appears first in the grammar, but it is very risky to rely on this. Every
	6809	reduce/reduce conflict must be studied and usually eliminated. Here is the
	6810	proper way to define @code{sequence}:
	6811
	6812	@example
	6813	sequence:
	6814	  /* empty */ @{ printf ("empty sequence\n"); @}
	6815	| sequence word @{ printf ("added word %s\n", $2); @}
	6816	;
	6817	@end example
	6818
	6819	Here is another common error that yields a reduce/reduce conflict:
	6820
	6821	@example
	6822	sequence:
	6823	  /* empty */
	6824	| sequence words
	6825	| sequence redirects
	6826	;
	6827
	6828	words:
	6829	  /* empty */
	6830	| words word
	6831	;
	6832
	6833	redirects:
	6834	  /* empty */
	6835	| redirects redirect
	6836	;
	6837	@end example
	6838
	6839	@noindent
	6840	The intention here is to define a sequence which can contain either
	6841	@code{word} or @code{redirect} groupings. The individual definitions of
	6842	@code{sequence}, @code{words} and @code{redirects} are error-free, but the
	6843	three together make a subtle ambiguity: even an empty input can be parsed
	6844	in infinitely many ways!
	6845
	6846	Consider: nothing-at-all could be a @code{words}. Or it could be two
	6847	@code{words} in a row, or three, or any number. It could equally well be a
	6848	@code{redirects}, or two, or any number. Or it could be a @code{words}
	6849	followed by three @code{redirects} and another @code{words}. And so on.
	6850
	6851	Here are two ways to correct these rules. First, to make it a single level
	6852	of sequence:
	6853
	6854	@example
	6855	sequence:
	6856	  /* empty */
	6857	| sequence word
	6858	| sequence redirect
	6859	;
	6860	@end example
	6861
	6862	Second, to prevent either a @code{words} or a @code{redirects}
	6863	from being empty:
	6864
	6865	@example
	6866	@group
	6867	sequence:
	6868	  /* empty */
	6869	| sequence words
	6870	| sequence redirects
	6871	;
	6872	@end group
	6873
	6874	@group
	6875	words:
	6876	  word
	6877	| words word
	6878	;
	6879	@end group
	6880
	6881	@group
	6882	redirects:
	6883	  redirect
	6884	| redirects redirect
	6885	;
	6886	@end group
	6887	@end example
	6888
	6889	@node Mysterious Conflicts
	6890	@section Mysterious Conflicts
	6891	@cindex Mysterious Conflicts
	6892
	6893	Sometimes reduce/reduce conflicts can occur that don't look warranted.
	6894	Here is an example:
	6895
	6896	@example
	6897	@group
	6898	%token ID
	6899
	6900	%%
	6901	def: param_spec return_spec ',';
	6902	param_spec:
	6903	  type
	6904	| name_list ':' type
	6905	;
	6906	@end group
	6907	@group
	6908	return_spec:
	6909	  type
	6910	| name ':' type
	6911	;
	6912	@end group
	6913	@group
	6914	type: ID;
	6915	@end group
	6916	@group
	6917	name: ID;
	6918	name_list:
	6919	  name
	6920	| name ',' name_list
	6921	;
	6922	@end group
	6923	@end example
	6924
	6925	It would seem that this grammar can be parsed with only a single token
	6926	of lookahead: when a @code{param_spec} is being read, an @code{ID} is
	6927	a @code{name} if a comma or colon follows, or a @code{type} if another
	6928	@code{ID} follows. In other words, this grammar is LR(1).
	6929
	6930	@cindex LR
	6931	@cindex LALR
	6932	However, for historical reasons, Bison cannot by default handle all
	6933	LR(1) grammars.
	6934	In this grammar, two contexts, that after an @code{ID} at the beginning
	6935	of a @code{param_spec} and likewise at the beginning of a
	6936	@code{return_spec}, are similar enough that Bison assumes they are the
	6937	same.
	6938	They appear similar because the same set of rules would be
	6939	active---the rule for reducing to a @code{name} and that for reducing to
	6940	a @code{type}. Bison is unable to determine at that stage of processing
	6941	that the rules would require different lookahead tokens in the two
	6942	contexts, so it makes a single parser state for them both. Combining
	6943	the two contexts causes a conflict later. In parser terminology, this
	6944	occurrence means that the grammar is not LALR(1).
	6945
	6946	@cindex IELR
	6947	@cindex canonical LR
	6948	For many practical grammars (specifically those that fall into the non-LR(1)
	6949	class), the limitations of LALR(1) result in difficulties beyond just
	6950	mysterious reduce/reduce conflicts. The best way to fix all these problems
	6951	is to select a different parser table construction algorithm. Either
	6952	IELR(1) or canonical LR(1) would suffice, but the former is more efficient
	6953	and easier to debug during development. @xref{LR Table Construction}, for
	6954	details. (Bison's IELR(1) and canonical LR(1) implementations are
	6955	experimental. More user feedback will help to stabilize them.)
	6956
	6957	If you instead wish to work around LALR(1)'s limitations, you
	6958	can often fix a mysterious conflict by identifying the two parser states
	6959	that are being confused, and adding something to make them look
	6960	distinct. In the above example, adding one rule to
	6961	@code{return_spec} as follows makes the problem go away:
	6962
	6963	@example
	6964	@group
	6965	%token BOGUS
	6966	@dots{}
	6967	%%
	6968	@dots{}
	6969	return_spec:
	6970	  type
	6971	| name ':' type
	6972	| ID BOGUS /* This rule is never used. */
	6973	;
	6974	@end group
	6975	@end example
	6976
	6977	This corrects the problem because it introduces the possibility of an
	6978	additional active rule in the context after the @code{ID} at the beginning of
	6979	@code{return_spec}. This rule is not active in the corresponding context
	6980	in a @code{param_spec}, so the two contexts receive distinct parser states.
	6981	As long as the token @code{BOGUS} is never generated by @code{yylex},
	6982	the added rule cannot alter the way actual input is parsed.
	6983
	6984	In this particular example, there is another way to solve the problem:
	6985	rewrite the rule for @code{return_spec} to use @code{ID} directly
	6986	instead of via @code{name}. This also causes the two confusing
	6987	contexts to have different sets of active rules, because the one for
	6988	@code{return_spec} activates the altered rule for @code{return_spec}
	6989	rather than the one for @code{name}.
	6990
	6991	@example
	6992	param_spec:
	6993	  type
	6994	| name_list ':' type
	6995	;
	6996	return_spec:
	6997	  type
	6998	| ID ':' type
	6999	;
	7000	@end example
	7001
	7002	For a more detailed exposition of LALR(1) parsers and parser
	7003	generators, @pxref{Bibliography,,DeRemer 1982}.
	7004
	7005	@node Tuning LR
	7006	@section Tuning LR
	7007
	7008	The default behavior of Bison's LR-based parsers is chosen mostly for
	7009	historical reasons, but that behavior is often not robust. For example, in
	7010	the previous section, we discussed the mysterious conflicts that can be
	7011	produced by LALR(1), Bison's default parser table construction algorithm.
	7012	Another example is Bison's @code{%error-verbose} directive, which instructs
	7013	the generated parser to produce verbose syntax error messages; these
	7014	messages can sometimes contain incorrect information.
	7015
	7016	In this section, we explore several modern features of Bison that allow you
	7017	to tune fundamental aspects of the generated LR-based parsers. Some of
	7018	these features easily eliminate shortcomings like those mentioned above.
	7019	Others can be helpful purely for understanding your parser.
	7020
	7021	Most of the features discussed in this section are still experimental. More
	7022	user feedback will help to stabilize them.
	7023
	7024	@menu
	7025	* LR Table Construction:: Choose a different construction algorithm.
	7026	* Default Reductions:: Disable default reductions.
	7027	* LAC:: Correct lookahead sets in the parser states.
	7028	* Unreachable States:: Keep unreachable parser states for debugging.
	7029	@end menu
	7030
	7031	@node LR Table Construction
	7032	@subsection LR Table Construction
	7033	@cindex Mysterious Conflict
	7034	@cindex LALR
	7035	@cindex IELR
	7036	@cindex canonical LR
	7037	@findex %define lr.type
	7038
	7039	For historical reasons, Bison constructs LALR(1) parser tables by default.
	7040	However, LALR does not possess the full language-recognition power of LR.
	7041	As a result, the behavior of parsers employing LALR parser tables is often
	7042	mysterious. We presented a simple example of this effect in @ref{Mysterious
	7043	Conflicts}.
	7044
	7045	As we also demonstrated in that example, the traditional approach to
	7046	eliminating such mysterious behavior is to restructure the grammar.
	7047	Unfortunately, doing so correctly is often difficult. Moreover, merely
	7048	discovering that LALR causes mysterious behavior in your parser can be
	7049	difficult as well.
	7050
	7051	Fortunately, Bison provides an easy way to eliminate the possibility of such
	7052	mysterious behavior altogether. You simply need to activate a more powerful
	7053	parser table construction algorithm by using the @code{%define lr.type}
	7054	directive.
	7055
	7056	@deffn {Directive} {%define lr.type @var{TYPE}}
	7057	Specify the type of parser tables within the LR(1) family. The accepted
	7058	values for @var{TYPE} are:
	7059
	7060	@itemize
	7061	@item @code{lalr} (default)
	7062	@item @code{ielr}
	7063	@item @code{canonical-lr}
	7064	@end itemize
	7065
	7066	(This feature is experimental. More user feedback will help to stabilize
	7067	it.)
	7068	@end deffn
	7069
	7070	For example, to activate IELR, you might add the following directive to your
	7071	grammar file:
	7072
	7073	@example
	7074	%define lr.type ielr
	7075	@end example
	7076
	7077	@noindent For the example in @ref{Mysterious Conflicts}, the mysterious
	7078	conflict is then eliminated, so there is no need to invest time in
	7079	comprehending the conflict or restructuring the grammar to fix it. If,
	7080	during future development, the grammar evolves such that all mysterious
	7081	behavior would have disappeared using just LALR, you need not fear that
	7082	continuing to use IELR will result in unnecessarily large parser tables.
	7083	That is, IELR generates LALR tables when LALR (using a deterministic parsing
	7084	algorithm) is sufficient to support the full language-recognition power of
	7085	LR. Thus, by enabling IELR at the start of grammar development, you can
	7086	safely and completely eliminate the need to consider LALR's shortcomings.
	7087
	7088	While IELR is almost always preferable, there are circumstances where LALR
	7089	or the canonical LR parser tables described by Knuth
	7090	(@pxref{Bibliography,,Knuth 1965}) can be useful. Here we summarize the
	7091	relative advantages of each parser table construction algorithm within
	7092	Bison:
	7093
	7094	@itemize
	7095	@item LALR
	7096
	7097	There are at least two scenarios where LALR can be worthwhile:
	7098
	7099	@itemize
	7100	@item GLR without static conflict resolution.
	7101
	7102	@cindex GLR with LALR
	7103	When employing GLR parsers (@pxref{GLR Parsers}), if you do not resolve any
	7104	conflicts statically (for example, with @code{%left} or @code{%prec}), then
	7105	the parser explores all potential parses of any given input. In this case,
	7106	the choice of parser table construction algorithm is guaranteed not to alter
	7107	the language accepted by the parser. LALR parser tables are the smallest
	7108	parser tables Bison can currently construct, so they may then be preferable.
	7109	Nevertheless, once you begin to resolve conflicts statically, GLR behaves
	7110	more like a deterministic parser in the syntactic contexts where those
	7111	conflicts appear, and so either IELR or canonical LR can then be helpful to
	7112	avoid LALR's mysterious behavior.
	7113
	7114	@item Malformed grammars.
	7115
	7116	Occasionally during development, an especially malformed grammar with a
	7117	major recurring flaw may severely impede the IELR or canonical LR parser
	7118	table construction algorithm. LALR can be a quick way to construct parser
	7119	tables in order to investigate such problems while ignoring the more subtle
	7120	differences from IELR and canonical LR.
	7121	@end itemize
	7122
	7123	@item IELR
	7124
	7125	IELR (Inadequacy Elimination LR) is a minimal LR algorithm. That is, given
	7126	any grammar (LR or non-LR), parsers using IELR or canonical LR parser tables
	7127	always accept exactly the same set of sentences. However, like LALR, IELR
	7128	merges parser states during parser table construction so that the number of
	7129	parser states is often an order of magnitude less than for canonical LR.
	7130	More importantly, because canonical LR's extra parser states may contain
	7131	duplicate conflicts in the case of non-LR grammars, the number of conflicts
	7132	for IELR is often an order of magnitude less as well. This effect can
	7133	significantly reduce the complexity of developing a grammar.
	7134
	7135	@item Canonical LR
	7136
	7137	@cindex delayed syntax error detection
	7138	@cindex LAC
	7139	@findex %nonassoc
	7140	While inefficient, canonical LR parser tables can be an interesting means to
	7141	explore a grammar because they possess a property that IELR and LALR tables
	7142	do not. That is, if @code{%nonassoc} is not used and default reductions are
	7143	left disabled (@pxref{Default Reductions}), then, for every left context of
	7144	every canonical LR state, the set of tokens accepted by that state is
	7145	guaranteed to be the exact set of tokens that is syntactically acceptable in
	7146	that left context. It might then seem that an advantage of canonical LR
	7147	parsers in production is that, under the above constraints, they are
	7148	guaranteed to detect a syntax error as soon as possible without performing
	7149	any unnecessary reductions. However, IELR parsers that use LAC are also
	7150	able to achieve this behavior without sacrificing @code{%nonassoc} or
	7151	default reductions. For details and a few caveats of LAC, @pxref{LAC}.
	7152	@end itemize
	7153
	7154	For a more detailed exposition of the mysterious behavior in LALR parsers
	7155	and the benefits of IELR, @pxref{Bibliography,,Denny 2008 March}, and
	7156	@ref{Bibliography,,Denny 2010 November}.
	7157
	7158	@node Default Reductions
	7159	@subsection Default Reductions
	7160	@cindex default reductions
	7161	@findex %define lr.default-reductions
	7162	@findex %nonassoc
	7163
	7164	After parser table construction, Bison identifies the reduction with the
	7165	largest lookahead set in each parser state. To reduce the size of the
	7166	parser state, traditional Bison behavior is to remove that lookahead set and
	7167	to assign that reduction to be the default parser action. Such a reduction
	7168	is known as a @dfn{default reduction}.
	7169
	7170	Default reductions affect more than the size of the parser tables. They
	7171	also affect the behavior of the parser:
	7172
	7173	@itemize
	7174	@item Delayed @code{yylex} invocations.
	7175
	7176	@cindex delayed yylex invocations
	7177	@cindex consistent states
	7178	@cindex defaulted states
	7179	A @dfn{consistent state} is a state that has only one possible parser
	7180	action. If that action is a reduction and is encoded as a default
	7181	reduction, then that consistent state is called a @dfn{defaulted state}.
	7182	Upon reaching a defaulted state, a Bison-generated parser does not bother to
	7183	invoke @code{yylex} to fetch the next token before performing the reduction.
	7184	In other words, whether default reductions are enabled in consistent states
	7185	determines how soon a Bison-generated parser invokes @code{yylex} for a
	7186	token: immediately when it @emph{reaches} that token in the input or when it
	7187	eventually @emph{needs} that token as a lookahead to determine the next
	7188	parser action. Traditionally, default reductions are enabled, and so the
	7189	parser exhibits the latter behavior.
	7190
	7191	The presence of defaulted states is an important consideration when
	7192	designing @code{yylex} and the grammar file. That is, if the behavior of
	7193	@code{yylex} can influence or be influenced by the semantic actions
	7194	associated with the reductions in defaulted states, then the delay of the
	7195	next @code{yylex} invocation until after those reductions is significant.
	7196	For example, the semantic actions might pop a scope stack that @code{yylex}
	7197	uses to determine what token to return. Thus, the delay might be necessary
	7198	to ensure that @code{yylex} does not look up the next token in a scope that
	7199	should already be considered closed.
	7200
	7201	@item Delayed syntax error detection.
	7202
	7203	@cindex delayed syntax error detection
	7204	When the parser fetches a new token by invoking @code{yylex}, it checks
	7205	whether there is an action for that token in the current parser state. The
	7206	parser detects a syntax error if and only if either (1) there is no action
	7207	for that token or (2) the action for that token is the error action (due to
	7208	the use of @code{%nonassoc}). However, if there is a default reduction in
	7209	that state (which might or might not be a defaulted state), then it is
	7210	impossible for condition 1 to exist. That is, all tokens have an action.
	7211	Thus, the parser sometimes fails to detect the syntax error until it reaches
	7212	a later state.
	7213
	7214	@cindex LAC
	7215	@c If there's an infinite loop, default reductions can prevent an incorrect
	7216	@c sentence from being rejected.
	7217	While default reductions never cause the parser to accept syntactically
	7218	incorrect sentences, the delay of syntax error detection can have unexpected
	7219	effects on the behavior of the parser. However, the delay can be caused
	7220	anyway by parser state merging and the use of @code{%nonassoc}, and it can
	7221	be fixed by another Bison feature, LAC. We discuss the effects of delayed
	7222	syntax error detection and LAC more in the next section (@pxref{LAC}).
	7223	@end itemize
	7224
	7225	For canonical LR, the only default reduction that Bison enables by default
	7226	is the accept action, which appears only in the accepting state, which has
	7227	no other action and is thus a defaulted state. However, the default accept
	7228	action does not delay any @code{yylex} invocation or syntax error detection
	7229	because the accept action ends the parse.
	7230
	7231	For LALR and IELR, Bison enables default reductions in nearly all states by
	7232	default. There are only two exceptions. First, states that have a shift
	7233	action on the @code{error} token do not have default reductions because
	7234	delayed syntax error detection could then prevent the @code{error} token
	7235	from ever being shifted in that state. However, parser state merging can
	7236	cause the same effect anyway, and LAC fixes it in both cases, so future
	7237	versions of Bison might drop this exception when LAC is activated. Second,
	7238	GLR parsers do not record the default reduction as the action on a lookahead
	7239	token for which there is a conflict. The correct action in this case is to
	7240	split the parse instead.
	7241
	7242	To adjust which states have default reductions enabled, use the
	7243	@code{%define lr.default-reductions} directive.
	7244
	7245	@deffn {Directive} {%define lr.default-reductions @var{WHERE}}
	7246	Specify the kind of states that are permitted to contain default reductions.
	7247	The accepted values of @var{WHERE} are:
	7248	@itemize
	7249	@item @code{most} (default for LALR and IELR)
	7250	@item @code{consistent}
	7251	@item @code{accepting} (default for canonical LR)
	7252	@end itemize
	7253
	7254	(The ability to specify where default reductions are permitted is
	7255	experimental. More user feedback will help to stabilize it.)
	7256	@end deffn
	7257
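As a sketch of how this directive might be combined with a choice of
parser table construction, the fragment below assumes the
@code{%define lr.type} directive documented elsewhere in this manual.
The particular values are illustrative, not a recommendation.
Restricting default reductions to consistent states removes one source
of delayed syntax error detection, though state merging and
@code{%nonassoc} can still cause it.

@example
%define lr.type ielr
%define lr.default-reductions consistent
@end example
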
	7258	@node LAC
	7259	@subsection LAC
	7260	@findex %define parse.lac
	7261	@cindex LAC
	7262	@cindex lookahead correction
	7263
	7264	Canonical LR, IELR, and LALR can suffer from a couple of problems upon
	7265	encountering a syntax error. First, the parser might perform additional
	7266	parser stack reductions before discovering the syntax error. Such
	7267	reductions can perform user semantic actions that are unexpected because
	7268	they are based on an invalid token, and they cause error recovery to begin
	7269	in a different syntactic context than the one in which the invalid token was
	7270	encountered. Second, when verbose error messages are enabled (@pxref{Error
	7271	Reporting}), the expected token list in the syntax error message can both
	7272	contain invalid tokens and omit valid tokens.
	7273
	7274	The culprits for the above problems are @code{%nonassoc}, default reductions
	7275	in inconsistent states (@pxref{Default Reductions}), and parser state
	7276	merging. Because IELR and LALR merge parser states, they suffer the most.
	7277	Canonical LR can suffer only if @code{%nonassoc} is used or if default
	7278	reductions are enabled for inconsistent states.
	7279
	7280	LAC (Lookahead Correction) is a new mechanism within the parsing algorithm
	7281	that solves these problems for canonical LR, IELR, and LALR without
	7282	sacrificing @code{%nonassoc}, default reductions, or state merging. You can
	7283	enable LAC with the @code{%define parse.lac} directive.
	7284
	7285	@deffn {Directive} {%define parse.lac @var{VALUE}}
	7286	Enable LAC to improve syntax error handling. The accepted values of @var{VALUE} are:
	7287	@itemize
	7288	@item @code{none} (default)
	7289	@item @code{full}
	7290	@end itemize
	7291	(This feature is experimental. More user feedback will help to stabilize
	7292	it. Moreover, it is currently only available for deterministic parsers in
	7293	C.)
	7294	@end deffn
	7295
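For example, a grammar file that wants accurate expected-token lists
might enable LAC together with verbose error messages.  This is only a
sketch, and it assumes that verbose messages are requested with the
@code{%error-verbose} directive (@pxref{Error Reporting}):

@example
%error-verbose
%define parse.lac full
@end example
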
	7296	Conceptually, the LAC mechanism is straightforward. Whenever the parser
	7297	fetches a new token from the scanner so that it can determine the next
	7298	parser action, it immediately suspends normal parsing and performs an
	7299	exploratory parse using a temporary copy of the normal parser state stack.
	7300	During this exploratory parse, the parser does not perform user semantic
	7301	actions. If the exploratory parse reaches a shift action, normal parsing
	7302	then resumes on the normal parser stacks. If the exploratory parse reaches
	7303	an error instead, the parser reports a syntax error. If verbose syntax
	7304	error messages are enabled, the parser must then discover the list of
	7305	expected tokens, so it performs a separate exploratory parse for each token
	7306	in the grammar.
	7307
	7308	There is one subtlety about the use of LAC. That is, when in a consistent
	7309	parser state with a default reduction, the parser will not attempt to fetch
	7310	a token from the scanner because no lookahead is needed to determine the
	7311	next parser action. Thus, whether default reductions are enabled in
	7312	consistent states (@pxref{Default Reductions}) affects how soon the parser
	7313	detects a syntax error: immediately when it @emph{reaches} an erroneous
	7314	token or when it eventually @emph{needs} that token as a lookahead to
	7315	determine the next parser action. The latter behavior is probably more
	7316	intuitive, so Bison currently provides no way to achieve the former behavior
	7317	while default reductions are enabled in consistent states.
	7318
	7319	Thus, when LAC is in use, for some fixed decision of whether to enable
	7320	default reductions in consistent states, canonical LR and IELR behave almost
	7321	exactly the same for both syntactically acceptable and syntactically
	7322	unacceptable input. While LALR still does not support the full
	7323	language-recognition power of canonical LR and IELR, LAC at least enables
	7324	LALR's syntax error handling to correctly reflect LALR's
	7325	language-recognition power.
	7326
	7327	There are a few caveats to consider when using LAC:
	7328
	7329	@itemize
	7330	@item Infinite parsing loops.
	7331
	7332	IELR plus LAC does have one shortcoming relative to canonical LR. Some
	7333	parsers generated by Bison can loop infinitely. LAC does not fix infinite
	7334	parsing loops that occur between encountering a syntax error and detecting
	7335	it, but enabling canonical LR or disabling default reductions sometimes
	7336	does.
	7337
	7338	@item Verbose error message limitations.
	7339
	7340	Because of internationalization considerations, Bison-generated parsers
	7341	limit the size of the expected token list they are willing to report in a
	7342	verbose syntax error message. If the number of expected tokens exceeds that
	7343	limit, the list is simply dropped from the message. Enabling LAC can
	7344	increase the size of the list and thus cause the parser to drop it. Of
	7345	course, dropping the list is better than reporting an incorrect list.
	7346
	7347	@item Performance.
	7348
	7349	Because LAC requires many parse actions to be performed twice, it can have a
	7350	performance penalty. However, not all parse actions must be performed
	7351	twice. Specifically, during a series of default reductions in consistent
	7352	states and shift actions, the parser never has to initiate an exploratory
	7353	parse. Moreover, the most time-consuming tasks in a parse are often the
	7354	file I/O, the lexical analysis performed by the scanner, and the user's
	7355	semantic actions, but none of these are performed during the exploratory
	7356	parse. Finally, the base of the temporary stack used during an exploratory
	7357	parse is a pointer into the normal parser state stack so that the stack is
	7358	never physically copied. In our experience, the performance penalty of LAC
	7359	has proved insignificant for practical grammars.
	7360	@end itemize
	7361
	7362	While the LAC algorithm shares techniques that have been recognized in the
	7363	parser community for years, for the publication that introduces LAC,
	7364	@pxref{Bibliography,,Denny 2010 May}.
	7365
	7366	@node Unreachable States
	7367	@subsection Unreachable States
	7368	@findex %define lr.keep-unreachable-states
	7369	@cindex unreachable states
	7370
	7371	If there exists no sequence of transitions from the parser's start state to
	7372	some state @var{s}, then Bison considers @var{s} to be an @dfn{unreachable
	7373	state}. A state can become unreachable during conflict resolution if Bison
	7374	disables a shift action leading to it from a predecessor state.
	7375
	7376	By default, Bison removes unreachable states from the parser after conflict
	7377	resolution because they are useless in the generated parser. However,
	7378	keeping unreachable states is sometimes useful when trying to understand the
	7379	relationship between the parser and the grammar.
	7380
	7381	@deffn {Directive} {%define lr.keep-unreachable-states @var{VALUE}}
	7382	Request that Bison allow unreachable states to remain in the parser tables.
	7383	@var{VALUE} must be a Boolean. The default is @code{false}.
	7384	@end deffn
	7385
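A minimal illustration of the directive, as it might appear in the
declarations section of a grammar file whose tables you want to inspect:

@example
%define lr.keep-unreachable-states true
@end example
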
	7386	There are a few caveats to consider:
	7387
	7388	@itemize @bullet
	7389	@item Missing or extraneous warnings.
	7390
	7391	Unreachable states may contain conflicts and may use rules not used in any
	7392	other state. Thus, keeping unreachable states may induce warnings that are
	7393	irrelevant to your parser's behavior, and it may eliminate warnings that are
	7394	relevant. Of course, the change in warnings may actually be relevant to a
	7395	parser table analysis that wants to keep unreachable states, so this
	7396	behavior will likely remain in future Bison releases.
	7397
	7398	@item Other useless states.
	7399
	7400	While Bison is able to remove unreachable states, it is not guaranteed to
	7401	remove other kinds of useless states. Specifically, when Bison disables
	7402	reduce actions during conflict resolution, some goto actions may become
	7403	useless, and thus some additional states may become useless. If Bison were
	7404	to compute which goto actions were useless and then disable those actions,
	7405	it could identify such states as unreachable and then remove those states.
	7406	However, Bison does not compute which goto actions are useless.
	7407	@end itemize
	7408
	7409	@node Generalized LR Parsing
	7410	@section Generalized LR (GLR) Parsing
	7411	@cindex GLR parsing
	7412	@cindex generalized LR (GLR) parsing
	7413	@cindex ambiguous grammars
	7414	@cindex nondeterministic parsing
	7415
	7416	Bison produces @emph{deterministic} parsers that choose uniquely
	7417	when to reduce and which reduction to apply
	7418	based on a summary of the preceding input and on one extra token of lookahead.
	7419	As a result, normal Bison handles a proper subset of the family of
	7420	context-free languages.
	7421	Ambiguous grammars, since they have strings with more than one possible
	7422	sequence of reductions, cannot have deterministic parsers in this sense.
	7423	The same is true of languages that require more than one symbol of
	7424	lookahead, since the parser lacks the information necessary to make a
	7425	decision at the point it must be made in a shift-reduce parser.
	7426	Finally, as previously mentioned (@pxref{Mysterious Conflicts}),
	7427	there are languages where Bison's default choice of how to
	7428	summarize the input seen so far loses necessary information.
	7429
	7430	When you use the @samp{%glr-parser} declaration in your grammar file,
	7431	Bison generates a parser that uses a different algorithm, called
	7432	Generalized LR (or GLR). A Bison GLR
	7433	parser uses the same basic
	7434	algorithm for parsing as an ordinary Bison parser, but behaves
	7435	differently in cases where there is a shift-reduce conflict that has not
	7436	been resolved by precedence rules (@pxref{Precedence}) or a
	7437	reduce-reduce conflict. When a GLR parser encounters such a
	7438	situation, it
	7439	effectively @emph{splits} into several parsers, one for each possible
	7440	shift or reduction. These parsers then proceed as usual, consuming
	7441	tokens in lock-step. Some of the stacks may encounter other conflicts
	7442	and split further, with the result that instead of a sequence of states,
	7443	a Bison GLR parsing stack is in effect a tree of states.
	7444
	7445	In effect, each stack represents a guess as to what the proper parse
	7446	is. Additional input may indicate that a guess was wrong, in which case
	7447	the appropriate stack silently disappears. Otherwise, the semantic
	7448	actions generated in each stack are saved, rather than being executed
	7449	immediately. When a stack disappears, its saved semantic actions never
	7450	get executed. When a reduction causes two stacks to become equivalent,
	7451	their sets of semantic actions are both saved with the state that
	7452	results from the reduction. We say that two stacks are equivalent
	7453	when they both represent the same sequence of states,
	7454	and each pair of corresponding states represents a
	7455	grammar symbol that produces the same segment of the input token
	7456	stream.
	7457
	7458	Whenever the parser makes a transition from having multiple
	7459	stacks to having one, it reverts to the normal deterministic parsing
	7460	algorithm, after resolving and executing the saved-up actions.
	7461	At this transition, some of the states on the stack will have semantic
	7462	values that are sets (actually multisets) of possible actions. The
	7463	parser tries to pick one of the actions by first finding one whose rule
	7464	has the highest dynamic precedence, as set by the @samp{%dprec}
	7465	declaration. Otherwise, if the alternative actions are not ordered by
	7466	precedence, but the same merging function is declared for both
	7467	rules by the @samp{%merge} declaration,
	7468	Bison resolves and evaluates both and then calls the merge function on
	7469	the result. Otherwise, it reports an ambiguity.
	7470
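As a sketch of the two resolution mechanisms just described, consider a
statement that can be parsed either as an expression or as a
declaration.  The nonterminals @code{expr} and @code{decl} and the
function name @code{stmtMerge} are assumptions made for illustration.
With @samp{%dprec}, the rule with the larger number is preferred when
both parses survive:

@example
%glr-parser
%%
stmt:
  expr ';'  %dprec 1
| decl      %dprec 2
;
@end example

@noindent
Alternatively, declaring the same @samp{%merge} function for both rules
asks the parser to evaluate both and combine the results:

@example
stmt:
  expr ';'  %merge <stmtMerge>
| decl      %merge <stmtMerge>
;
@end example
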
	7471	It is possible to use a data structure for the GLR parsing tree that
	7472	permits the processing of any LR(1) grammar in linear time (in the
	7473	size of the input), any unambiguous (not necessarily
	7474	LR(1)) grammar in
	7475	quadratic worst-case time, and any general (possibly ambiguous)
	7476	context-free grammar in cubic worst-case time. However, Bison currently
	7477	uses a simpler data structure that requires time proportional to the
	7478	length of the input times the maximum number of stacks required for any
	7479	prefix of the input. Thus, really ambiguous or nondeterministic
	7480	grammars can require exponential time and space to process. Such badly
	7481	behaving examples, however, are not generally of practical interest.
	7482	Usually, nondeterminism in a grammar is local---the parser is ``in
	7483	doubt'' only for a few tokens at a time. Therefore, the current data
	7484	structure should generally be adequate. On LR(1) portions of a
	7485	grammar, in particular, it is only slightly slower than with the
	7486	deterministic LR(1) Bison parser.
	7487
	7488	For a more detailed exposition of GLR parsers, @pxref{Bibliography,,Scott
	7489	2000}.
	7490
	7491	@node Memory Management
	7492	@section Memory Management, and How to Avoid Memory Exhaustion
	7493	@cindex memory exhaustion
	7494	@cindex memory management
	7495	@cindex stack overflow
	7496	@cindex parser stack overflow
	7497	@cindex overflow of parser stack
	7498
	7499	The Bison parser stack can run out of memory if too many tokens are shifted and
	7500	not reduced. When this happens, the parser function @code{yyparse}
	7501	calls @code{yyerror} and then returns 2.
	7502
	7503	Because Bison parsers have growing stacks, hitting the upper limit
	7504	usually results from using a right recursion instead of a left
	7505	recursion (@pxref{Recursion, ,Recursive Rules}).
	7506
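The difference can be sketched with a comma-separated list.  The
nonterminals @code{items} and @code{item} are hypothetical, and the two
definitions are alternatives, not meant to appear in the same grammar:

@example
/* Left recursion: each item is reduced into the list as soon as
   it has been read, so the stack depth stays bounded.  */
items:
  item
| items ',' item
;

/* Right recursion: every item stays on the stack until the last
   one has been read, so a long list can exhaust the stack.  */
items:
  item
| item ',' items
;
@end example
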
	7507	@vindex YYMAXDEPTH
	7508	By defining the macro @code{YYMAXDEPTH}, you can control how deep the
	7509	parser stack can become before memory is exhausted. Define the
	7510	macro with a value that is an integer. This value is the maximum number
	7511	of tokens that can be shifted (and not reduced) before overflow.
	7512
	7513	The stack space allowed is not necessarily allocated. If you specify a
	7514	large value for @code{YYMAXDEPTH}, the parser normally allocates a small
	7515	stack at first, and then makes it bigger by stages as needed. This
	7516	increasing allocation happens automatically and silently. Therefore,
	7517	you do not need to make @code{YYMAXDEPTH} painfully small merely to save
	7518	space for ordinary inputs that do not need much stack.
	7519
	7520	However, do not allow @code{YYMAXDEPTH} to be a value so large that
	7521	arithmetic overflow could occur when calculating the size of the stack
	7522	space. Also, do not allow @code{YYMAXDEPTH} to be less than
	7523	@code{YYINITDEPTH}.
	7524
	7525	@cindex default stack limit
	7526	The default value of @code{YYMAXDEPTH}, if you do not define it, is
	7527	10000.
	7528
	7529	@vindex YYINITDEPTH
	7530	You can control how much stack is allocated initially by defining the
	7531	macro @code{YYINITDEPTH} to a positive integer. For the deterministic
	7532	parser in C, this value must be a compile-time constant
	7533	unless you are assuming C99 or some other target language or compiler
	7534	that allows variable-length arrays. The default is 200.
	7535
	7536	Do not allow @code{YYINITDEPTH} to be greater than @code{YYMAXDEPTH}.
	7537
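Both macros can be defined in the prologue of the grammar file, for
instance as follows.  The values are purely illustrative. The only
constraints taken from the text above are that @code{YYINITDEPTH} is
positive and does not exceed @code{YYMAXDEPTH}.

@example
%@{
#define YYINITDEPTH 1000    /* initial stack size (default 200) */
#define YYMAXDEPTH  100000  /* growth limit (default 10000) */
%@}
@end example
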
	7538	@c FIXME: C++ output.
	7539	Because of semantic differences between C and C++, the stacks of the
	7540	deterministic parsers in C produced by Bison cannot grow when compiled
	7541	by C++ compilers. In this precise case (compiling a C parser as C++), we
	7542	suggest increasing @code{YYINITDEPTH}. The Bison maintainers hope to fix
	7543	this deficiency in a future release.
	7544
	7545	@node Error Recovery
	7546	@chapter Error Recovery
	7547	@cindex error recovery
	7548	@cindex recovery from errors
	7549
	7550	It is not usually acceptable to have a program terminate on a syntax
	7551	error. For example, a compiler should recover sufficiently to parse the
	7552	rest of the input file and check it for errors; a calculator should accept
	7553	another expression.
	7554
	7555	In a simple interactive command parser where each input is one line, it may
	7556	be sufficient to allow @code{yyparse} to return 1 on error and have the
	7557	caller ignore the rest of the input line when that happens (and then call
	7558	@code{yyparse} again). But this is inadequate for a compiler, because it
	7559	forgets all the syntactic context leading up to the error. A syntax error
	7560	deep within a function in the compiler input should not cause the compiler
	7561	to treat the following line like the beginning of a source file.
	7562
	7563	@findex error
	7564	You can define how to recover from a syntax error by writing rules to
	7565	recognize the special token @code{error}. This is a terminal symbol that
	7566	is always defined (you need not declare it) and reserved for error
	7567	handling. The Bison parser generates an @code{error} token whenever a
	7568	syntax error happens; if you have provided a rule to recognize this token
	7569	in the current context, the parse can continue.
	7570
	7571	For example:
	7572
	7573	@example
	7574	stmts:
	7575	/* empty string */
	7576	| stmts '\n'
	7577	| stmts exp '\n'
	7578	| stmts error '\n'
	7579	@end example
	7580
	7581	The fourth rule in this example says that an error followed by a newline
	7582	makes a valid addition to any @code{stmts}.
	7583
	7584	What happens if a syntax error occurs in the middle of an @code{exp}? The
	7585	error recovery rule, interpreted strictly, applies to the precise sequence
	7586	of a @code{stmts}, an @code{error} and a newline. If an error occurs in
	7587	the middle of an @code{exp}, there will probably be some additional tokens
	7588	and subexpressions on the stack after the last @code{stmts}, and there
	7589	will be tokens to read before the next newline. So the rule is not
	7590	applicable in the ordinary way.
	7591
	7592	But Bison can force the situation to fit the rule, by discarding part of
	7593	the semantic context and part of the input. First it discards states
	7594	and objects from the stack until it gets back to a state in which the
	7595	@code{error} token is acceptable. (This means that the subexpressions
	7596	already parsed are discarded, back to the last complete @code{stmts}.)
	7597	At this point the @code{error} token can be shifted. Then, if the old
	7598	lookahead token is not acceptable to be shifted next, the parser reads
	7599	tokens and discards them until it finds a token which is acceptable. In
	7600	this example, Bison reads and discards input until the next newline so
	7601	that the fourth rule can apply. Note that discarded symbols are
	7602	possible sources of memory leaks; see @ref{Destructor Decl, , Freeing
	7603	Discarded Symbols}, for a means to reclaim this memory.
	7604
	7605	The choice of error rules in the grammar is a choice of strategies for
	7606	error recovery. A simple and useful strategy is simply to skip the rest of
	7607	the current input line or current statement if an error is detected:
	7608
	7609	@example
	7610	stmt: error ';' /* On error, skip until ';' is read. */
	7611	@end example
	7612
	7613	It is also useful to recover to the matching close-delimiter of an
	7614	opening-delimiter that has already been parsed. Otherwise the
	7615	close-delimiter will probably appear to be unmatched, and generate another,
	7616	spurious error message:
	7617
	7618	@example
	7619	primary:
	7620	'(' expr ')'
	7621	| '(' error ')'
	7622	@dots{}
	7623	;
	7624	@end example
	7625
	7626	Error recovery strategies are necessarily guesses. When they guess wrong,
	7627	one syntax error often leads to another. In the above example, the error
	7628	recovery rule guesses that an error is due to bad input within one
	7629	@code{stmt}. Suppose that instead a spurious semicolon is inserted in the
	7630	middle of a valid @code{stmt}. After the error recovery rule recovers
	7631	from the first error, another syntax error will be found straightaway,
	7632	since the text following the spurious semicolon is also an invalid
	7633	@code{stmt}.
	7634
	7635	To prevent an outpouring of error messages, the parser will output no error
	7636	message for another syntax error that happens shortly after the first; only
	7637	after three consecutive input tokens have been successfully shifted will
	7638	error messages resume.
	7639
	7640	Note that rules which accept the @code{error} token may have actions, just
	7641	as any other rules can.
	7642
	7643	@findex yyerrok
	7644	You can make error messages resume immediately by using the macro
	7645	@code{yyerrok} in an action. If you do this in the error rule's action, no
	7646	error messages will be suppressed. This macro requires no arguments;
	7647	@samp{yyerrok;} is a valid C statement.
	7648
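For instance, in the @code{stmts} grammar shown earlier, the error rule
could resume error reporting as soon as the newline has been consumed.
This is a minimal sketch, not the only reasonable placement of
@code{yyerrok}:

@example
stmts:
  /* empty string */
| stmts '\n'
| stmts exp '\n'
| stmts error '\n'  @{ yyerrok; @}
;
@end example
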
	7649	@findex yyclearin
	7650	The previous lookahead token is reanalyzed immediately after an error. If
	7651	this is unacceptable, then the macro @code{yyclearin} may be used to clear
	7652	this token. Write the statement @samp{yyclearin;} in the error rule's
	7653	action.
	7654	@xref{Action Features, ,Special Features for Use in Actions}.
	7655
	7656	For example, suppose that on a syntax error, an error handling routine is
	7657	called that advances the input stream to some point where parsing should
	7658	once again commence. The next symbol returned by the lexical scanner is
	7659	probably correct. The previous lookahead token ought to be discarded
	7660	with @samp{yyclearin;}.
	7661
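A sketch of that scenario follows.  The routine @code{resync_input} is
hypothetical and stands for whatever code advances the input stream.
The sketch also assumes that the rule is reduced before the parser
consults the old lookahead token again.

@example
stmt:
  error  @{ resync_input (); yyclearin; @}
;
@end example
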
	7662	@vindex YYRECOVERING
	7663	The expression @code{YYRECOVERING ()} yields 1 when the parser
	7664	is recovering from a syntax error, and 0 otherwise.
	7665	Syntax error diagnostics are suppressed while recovering from a syntax
	7666	error.
	7667
	7668	@node Context Dependency
	7669	@chapter Handling Context Dependencies
	7670
	7671	The Bison paradigm is to parse tokens first, then group them into larger
	7672	syntactic units. In many languages, the meaning of a token is affected by
	7673	its context. Although this violates the Bison paradigm, certain techniques
	7674	(known as @dfn{kludges}) may enable you to write Bison parsers for such
	7675	languages.
	7676
	7677	@menu
	7678	* Semantic Tokens:: Token parsing can depend on the semantic context.
	7679	* Lexical Tie-ins:: Token parsing can depend on the syntactic context.
	7680	* Tie-in Recovery:: Lexical tie-ins have implications for how
	7681	error recovery rules must be written.
	7682	@end menu
	7683
	7684	(Actually, ``kludge'' means any technique that gets its job done but is
	7685	neither clean nor robust.)
	7686
	7687	@node Semantic Tokens
	7688	@section Semantic Info in Token Types
	7689
	7690	The C language has a context dependency: the way an identifier is used
	7691	depends on what its current meaning is. For example, consider this:
	7692
	7693	@example
	7694	foo (x);
	7695	@end example
	7696
	7697	This looks like a function call statement, but if @code{foo} is a typedef
	7698	name, then this is actually a declaration of @code{x}. How can a Bison
	7699	parser for C decide how to parse this input?
	7700
	7701	The method used in GNU C is to have two different token types,
	7702	@code{IDENTIFIER} and @code{TYPENAME}. When @code{yylex} finds an
	7703	identifier, it looks up the current declaration of the identifier in order
	7704	to decide which token type to return: @code{TYPENAME} if the identifier is
	7705	declared as a typedef, @code{IDENTIFIER} otherwise.
	7706
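The lookup could be sketched as a helper that @code{yylex} calls once it
has scanned an identifier.  The function @code{is_typedef_name} is a
hypothetical symbol-table query, not part of Bison:

@example
/* Sketch only: NAME is the identifier text just scanned.  */
static int
classify_identifier (char const *name)
@{
  return is_typedef_name (name) ? TYPENAME : IDENTIFIER;
@}
@end example
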
	7707	The grammar rules can then express the context dependency by the choice of
	7708	token type to recognize. @code{IDENTIFIER} is accepted as an expression,
	7709	but @code{TYPENAME} is not. @code{TYPENAME} can start a declaration, but
	7710	@code{IDENTIFIER} cannot. In contexts where the meaning of the identifier
	7711	is @emph{not} significant, such as in declarations that can shadow a
	7712	typedef name, either @code{TYPENAME} or @code{IDENTIFIER} is
	7713	accepted---there is one rule for each of the two token types.
	7714
	7715	This technique is simple to use if the decision of which kinds of
	7716	identifiers to allow is made at a place close to where the identifier is
	7717	parsed. But in C this is not always so: C allows a declaration to
	7718	redeclare a typedef name provided an explicit type has been specified
	7719	earlier:
	7720
	7721	@example
	7722	typedef int foo, bar;
	7723	int baz (void)
	7724	@group
	7725	@{
	7726	static bar (bar); /* @r{redeclare @code{bar} as static variable} */
	7727	extern foo foo (foo); /* @r{redeclare @code{foo} as function} */
	7728	return foo (bar);
	7729	@}
	7730	@end group
	7731	@end example
	7732
	7733	Unfortunately, the name being declared is separated from the declaration
	7734	construct itself by a complicated syntactic structure---the ``declarator''.
	7735
	7736	As a result, part of the Bison parser for C needs to be duplicated, with
	7737	all the nonterminal names changed: once for parsing a declaration in
	7738	which a typedef name can be redefined, and once for parsing a
	7739	declaration in which that can't be done. Here is a part of the
	7740	duplication, with actions omitted for brevity:
	7741
	7742	@example
	7743	@group
	7744	initdcl:
	7745	declarator maybeasm '=' init
	7746	| declarator maybeasm
	7747	;
	7748	@end group
	7749
	7750	@group
	7751	notype_initdcl:
	7752	notype_declarator maybeasm '=' init
	7753	| notype_declarator maybeasm
	7754	;
	7755	@end group
	7756	@end example
	7757
	7758	@noindent
	7759	Here @code{initdcl} can redeclare a typedef name, but @code{notype_initdcl}
	7760	cannot. The distinction between @code{declarator} and
	7761	@code{notype_declarator} is the same sort of thing.
	7762
	7763	There is some similarity between this technique and a lexical tie-in
	7764	(described next), in that information which alters the lexical analysis is
	7765	changed during parsing by other parts of the program. The difference is
	7766	that here the information is global, and is used for other purposes in the
	7767	program. A true lexical tie-in has a special-purpose flag controlled by
	7768	the syntactic context.
	7769
	7770	@node Lexical Tie-ins
	7771	@section Lexical Tie-ins
	7772	@cindex lexical tie-in
	7773
	7774	One way to handle context-dependency is the @dfn{lexical tie-in}: a flag
	7775	which is set by Bison actions, whose purpose is to alter the way tokens are
	7776	parsed.
	7777
	7778	For example, suppose we have a language vaguely like C, but with a special
	7779	construct @samp{hex (@var{hex-expr})}. After the keyword @code{hex} comes
	7780	an expression in parentheses in which all integers are hexadecimal. In
	7781	particular, the token @samp{a1b} must be treated as an integer rather than
	7782	as an identifier if it appears in that context. Here is how you can do it:
	7783
	7784	@example
	7785	@group
	7786	%@{
	7787	int hexflag;
	7788	int yylex (void);
	7789	void yyerror (char const *);
	7790	%@}
	7791	%%
	7792	@dots{}
	7793	@end group
	7794	@group
	7795	expr:
	7796	IDENTIFIER
	7797	| constant
	7798	| HEX '(' @{ hexflag = 1; @}
	7799	    expr ')' @{ hexflag = 0; $$ = $4; @}
	7800	| expr '+' expr @{ $$ = make_sum ($1, $3); @}
	7801	@dots{}
	7802	;
	7803	@end group
	7804
	7805	@group
	7806	constant:
	7807	INTEGER
	7808	| STRING
	7809	;
	7810	@end group
	7811	@end example
	7812
	7813	@noindent
	7814	Here we assume that @code{yylex} looks at the value of @code{hexflag}; when
	7815	it is nonzero, all integers are parsed in hexadecimal, and tokens starting
	7816	with letters are parsed as integers if possible.
	7817
	7818	The declaration of @code{hexflag} shown in the prologue of the grammar
	7819	file is needed to make it accessible to the actions (@pxref{Prologue,
	7820	,The Prologue}). You must also write the code in @code{yylex} to obey
	7821	the flag.
	7822
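One way @code{yylex} might obey the flag is sketched below.  The helper
name @code{classify_word} is an assumption, as is the convention that it
receives the text of a word made of letters and digits. Malformed
numbers are not diagnosed here, and storing the value in @code{yylval}
is left to the reader because it depends on the semantic value type.

@example
/* Sketch only.  Needs <stdlib.h> for strtol.  */
static int
classify_word (char const *text)
@{
  char *end;
  long value = strtol (text, &end, hexflag ? 16 : 10);
  if (end != text && *end == '\0')
    @{
      /* The whole word is a number in the current base.
         Store VALUE in yylval as the semantic type requires.  */
      (void) value;
      return INTEGER;
    @}
  return IDENTIFIER;
@}
@end example

@noindent
With @code{hexflag} set, @samp{a1b} is converted completely in base 16
and is returned as @code{INTEGER}. With the flag clear, no digits are
converted and it is returned as @code{IDENTIFIER}.
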
	7823	@node Tie-in Recovery
	7824	@section Lexical Tie-ins and Error Recovery
	7825
	7826	Lexical tie-ins make strict demands on any error recovery rules you have.
	7827	@xref{Error Recovery}.
	7828
	7829	The reason for this is that the purpose of an error recovery rule is to
	7830	abort the parsing of one construct and resume in some larger construct.
	7831	For example, in C-like languages, a typical error recovery rule is to skip
	7832	tokens until the next semicolon, and then start a new statement, like this:
	7833
	7834	@example
	7835	stmt:
	7836	expr ';'
	7837	| IF '(' expr ')' stmt @{ @dots{} @}
	7838	@dots{}
	7839	| error ';' @{ hexflag = 0; @}
	7840	;
	7841	@end example
	7842
	7843	If there is a syntax error in the middle of a @samp{hex (@var{expr})}
	7844	construct, this error rule will apply, and then the action for the
	7845	completed @samp{hex (@var{expr})} will never run. So @code{hexflag} would
	7846	remain set for the entire rest of the input, or until the next @code{hex}
	7847	keyword, causing identifiers to be misinterpreted as integers.
	7848
	7849	To avoid this problem the error recovery rule itself clears @code{hexflag}.
	7850
	7851	There may also be an error recovery rule that works within expressions.
	7852	For example, there could be a rule which applies within parentheses
	7853	and skips to the close-parenthesis:
	7854
	7855	@example
	7856	@group
	7857	expr:
	7858	@dots{}
	7859	| '(' expr ')' @{ $$ = $2; @}
	7860	| '(' error ')'
	7861	@dots{}
	7862	@end group
	7863	@end example
	7864
	7865	If this rule acts within the @code{hex} construct, it is not going to abort
	7866	that construct (since it applies to an inner level of parentheses within
	7867	the construct). Therefore, it should not clear the flag: the rest of
	7868	the @code{hex} construct should be parsed with the flag still in effect.
	7869
	7870	What if there is an error recovery rule which might abort out of the
	7871	@code{hex} construct or might not, depending on circumstances? There is no
	7872	way you can write the action to determine whether a @code{hex} construct is
	7873	being aborted or not. So if you are using a lexical tie-in, you had better
	7874	make sure your error recovery rules are not of this kind. Each rule must
	7875	be such that you can be sure that it always will, or always won't, have to
	7876	clear the flag.
	7877
	7878	@c ================================================== Debugging Your Parser
	7879
	7880	@node Debugging
	7881	@chapter Debugging Your Parser
	7882
	7883	Developing a parser can be a challenge, especially if you don't
	7884	understand the algorithm (@pxref{Algorithm, ,The Bison Parser
	7885	Algorithm}). Even so, sometimes a detailed description of the automaton
	7886	can help (@pxref{Understanding, , Understanding Your Parser}), or
	7887	tracing the execution of the parser can give some insight on why it
	7888	behaves improperly (@pxref{Tracing, , Tracing Your Parser}).
	7889
	7890	@menu
	7891	* Understanding:: Understanding the structure of your parser.
	7892	* Tracing:: Tracing the execution of your parser.
	7893	@end menu
	7894
	7895	@node Understanding
	7896	@section Understanding Your Parser
	7897
	7898	As documented elsewhere (@pxref{Algorithm, ,The Bison Parser Algorithm}),
	7899	Bison parsers are @dfn{shift/reduce automata}. In some cases (much more
	7900	frequent than one would hope), looking at this automaton is required to
	7901	tune or simply fix a parser. Bison provides two different
	7902	representations of it, either textually or graphically (as a DOT file).
	7903
	7904	The textual file is generated when the option @option{--report} or
	7905	@option{--verbose} is specified (@pxref{Invocation, , Invoking
	7906	Bison}). Its name is made by removing @samp{.tab.c} or @samp{.c} from
	7907	the parser implementation file name, and adding @samp{.output}
	7908	instead. Therefore, if the grammar file is @file{foo.y}, then the
	7909	parser implementation file is called @file{foo.tab.c} by default. As
	7910	a consequence, the verbose output file is called @file{foo.output}.
	7911
	7912	The following grammar file, @file{calc.y}, is used in the rest of this section:
	7913
	7914	@example
	7915	%token NUM STR
	7916	%left '+' '-'
	7917	%left '*'
	7918	%%
	7919	exp:
	7920	exp '+' exp
	7921	| exp '-' exp
	7922	| exp '*' exp
	7923	| exp '/' exp
	7924	| NUM
	7925	;
	7926	useless: STR;
	7927	%%
	7928	@end example
	7929
	7930	@command{bison} reports:
	7931
	7932	@example
	7933	calc.y: warning: 1 nonterminal useless in grammar
	7934	calc.y: warning: 1 rule useless in grammar
	7935	calc.y:11.1-7: warning: nonterminal useless in grammar: useless
	7936	calc.y:11.10-12: warning: rule useless in grammar: useless: STR
	7937	calc.y: conflicts: 7 shift/reduce
	7938	@end example
	7939
	7940	When given @option{--report=state}, in addition to @file{calc.tab.c}, it
	7941	creates a file @file{calc.output} with contents detailed below. The
	7942	order of the output and the exact presentation might vary, but the
	7943	interpretation is the same.
	7944
	7945	@noindent
	7946	@cindex token, useless
	7947	@cindex useless token
	7948	@cindex nonterminal, useless
	7949	@cindex useless nonterminal
	7950	@cindex rule, useless
	7951	@cindex useless rule
	7952	The first section reports useless tokens, nonterminals and rules. Useless
	7953	nonterminals and rules are removed in order to produce a smaller parser, but
	7954	useless tokens are preserved, since they might be used by the scanner (note
	7955	the difference between ``useless'' and ``unused'' below):
	7956
	7957	@example
	7958	Nonterminals useless in grammar
	7959	useless
	7960
	7961	Terminals unused in grammar
	7962	STR
	7963
	7964	Rules useless in grammar
	7965	6 useless: STR
	7966	@end example
	7967
	7968	@noindent
	7969	The next section lists states that still have conflicts.
	7970
	7971	@example
	7972	State 8 conflicts: 1 shift/reduce
	7973	State 9 conflicts: 1 shift/reduce
	7974	State 10 conflicts: 1 shift/reduce
	7975	State 11 conflicts: 4 shift/reduce
	7976	@end example
	7977
	7978	@noindent
	7979	Then Bison reproduces the exact grammar it used:
	7980
	7981	@example
	7982	Grammar
	7983
	7984	0 $accept: exp $end
	7985
	7986	1 exp: exp '+' exp
	7987	2 | exp '-' exp
	7988	3 | exp '*' exp
	7989	4 | exp '/' exp
	7990	5 | NUM
	7991	@end example
	7992
	7993	@noindent
	7994	and reports the uses of the symbols:
	7995
	7996	@example
	7997	@group
	7998	Terminals, with rules where they appear
	7999
	8000	$end (0) 0
	8001	'*' (42) 3
	8002	'+' (43) 1
	8003	'-' (45) 2
	8004	'/' (47) 4
	8005	error (256)
	8006	NUM (258) 5
	8007	STR (259)
	8008	@end group
	8009
	8010	@group
	8011	Nonterminals, with rules where they appear
	8012
	8013	$accept (9)
	8014	on left: 0
	8015	exp (10)
	8016	on left: 1 2 3 4 5, on right: 0 1 2 3 4
	8017	@end group
	8018	@end example
	8019
	8020	@noindent
	8021	@cindex item
	8022	@cindex pointed rule
	8023	@cindex rule, pointed
	8024	Bison then proceeds to the automaton itself, describing each state
	8025	with its set of @dfn{items}, also known as @dfn{pointed rules}. Each
	8026	item is a production rule together with a point (@samp{.}) marking
	8027	the location of the input cursor.
	8028
	8029	@example
	8030	state 0
	8031
	8032	0 $accept: . exp $end
	8033
	8034	NUM shift, and go to state 1
	8035
	8036	exp go to state 2
	8037	@end example
	8038
	8039	This reads as follows: ``state 0 corresponds to being at the very
	8040	beginning of the parsing, in the initial rule, right before the start
	8041	symbol (here, @code{exp}). When the parser returns to this state right
	8042	after having reduced a rule that produced an @code{exp}, the control
	8043	flow jumps to state 2. If there is no such transition on a nonterminal
	8044	symbol, and the lookahead is a @code{NUM}, then this token is shifted onto
	8045	the parse stack, and the control flow jumps to state 1. Any other
	8046	lookahead triggers a syntax error.''
	8047
	8048	@cindex core, item set
	8049	@cindex item set core
	8050	@cindex kernel, item set
	8051	@cindex item set kernel
	8052	Even though the only active rule in state 0 seems to be rule 0, the
	8053	report lists @code{NUM} as a lookahead token because @code{NUM} can be
	8054	at the beginning of any rule deriving an @code{exp}. By default Bison
	8055	reports the so-called @dfn{core} or @dfn{kernel} of the item set, but if
	8056	you want to see more detail you can invoke @command{bison} with
	8057	@option{--report=itemset} to list the derived items as well:
	8058
	8059	@example
	8060	state 0
	8061
	8062	0 $accept: . exp $end
	8063	1 exp: . exp '+' exp
	8064	2 | . exp '-' exp
	8065	3 | . exp '*' exp
	8066	4 | . exp '/' exp
	8067	5 | . NUM
	8068
	8069	NUM shift, and go to state 1
	8070
	8071	exp go to state 2
	8072	@end example
	8073
	8074	@noindent
	8075	In state 1@dots{}
	8076
	8077	@example
	8078	state 1
	8079
	8080	5 exp: NUM .
	8081
	8082	$default reduce using rule 5 (exp)
	8083	@end example
	8084
	8085	@noindent
	8086	rule 5, @samp{exp: NUM;}, is completed. Whatever the lookahead token
	8087	(@samp{$default}), the parser will reduce it. If it was coming from
	8088	state 0, then, after this reduction it will return to state 0, and will
	8089	jump to state 2 (@samp{exp: go to state 2}).
	8090
	8091	@example
	8092	state 2
	8093
	8094	0 $accept: exp . $end
	8095	1 exp: exp . '+' exp
	8096	2 | exp . '-' exp
	8097	3 | exp . '*' exp
	8098	4 | exp . '/' exp
	8099
	8100	$end shift, and go to state 3
	8101	'+' shift, and go to state 4
	8102	'-' shift, and go to state 5
	8103	'*' shift, and go to state 6
	8104	'/' shift, and go to state 7
	8105	@end example
	8106
	8107	@noindent
	8108	In state 2, the automaton can only shift a symbol. For instance,
	8109	because of the item @samp{exp: exp . '+' exp}, if the lookahead is
	8110	@samp{+} it is shifted onto the parse stack, and the automaton
	8111	jumps to state 4, corresponding to the item @samp{exp: exp '+' . exp}.
	8112	Since there is no default action, any lookahead not listed triggers a syntax
	8113	error.
	8114
	8115	@cindex accepting state
	8116	State 3 is named the @dfn{final state}, or the @dfn{accepting
	8117	state}:
	8118
	8119	@example
	8120	state 3
	8121
	8122	0 $accept: exp $end .
	8123
	8124	$default accept
	8125	@end example
	8126
	8127	@noindent
	8128	The initial rule is completed (the start symbol and the end-of-input were
	8129	read), so parsing exits successfully.
	8130
	8131	The interpretation of states 4 to 7 is straightforward, and is left to
	8132	the reader.
	8133
	8134	@example
	8135	state 4
	8136
	8137	1 exp: exp '+' . exp
	8138
	8139	NUM shift, and go to state 1
	8140
	8141	exp go to state 8
	8142
	8143
	8144	state 5
	8145
	8146	2 exp: exp '-' . exp
	8147
	8148	NUM shift, and go to state 1
	8149
	8150	exp go to state 9
	8151
	8152
	8153	state 6
	8154
	8155	3 exp: exp '*' . exp
	8156
	8157	NUM shift, and go to state 1
	8158
	8159	exp go to state 10
	8160
	8161
	8162	state 7
	8163
	8164	4 exp: exp '/' . exp
	8165
	8166	NUM shift, and go to state 1
	8167
	8168	exp go to state 11
	8169	@end example
	8170
	8171	As was announced at the beginning of the report, @samp{State 8 conflicts:
	8172	1 shift/reduce}:
	8173
	8174	@example
	8175	state 8
	8176
	8177	1 exp: exp . '+' exp
	8178	1 | exp '+' exp .
	8179	2 | exp . '-' exp
	8180	3 | exp . '*' exp
	8181	4 | exp . '/' exp
	8182
	8183	'*' shift, and go to state 6
	8184	'/' shift, and go to state 7
	8185
	8186	'/' [reduce using rule 1 (exp)]
	8187	$default reduce using rule 1 (exp)
	8188	@end example
	8189
	8190	Indeed, there are two actions associated with the lookahead @samp{/}:
	8191	either shifting (and going to state 7), or reducing rule 1. The
	8192	conflict means that either the grammar is ambiguous, or the parser lacks
	8193	information to make the right decision. Indeed the grammar is
	8194	ambiguous, as, since we did not specify the precedence of @samp{/}, the
	8195	sentence @samp{NUM + NUM / NUM} can be parsed as @samp{NUM + (NUM /
	8196	NUM)}, which corresponds to shifting @samp{/}, or as @samp{(NUM + NUM) /
	8197	NUM}, which corresponds to reducing rule 1.
	8198
	8199	Because deterministic parsing allows only a single decision, Bison
	8200	arbitrarily chose to disable the reduction (@pxref{Shift/Reduce, ,
	8201	Shift/Reduce Conflicts}). Discarded actions are reported between
	8202	square brackets.
	8203
	8204	Note that all the previous states had a single possible action: either
	8205	shifting the next token and going to the corresponding state, or
	8206	reducing a single rule. In the other cases, i.e., when shifting
	8207	@emph{and} reducing is possible or when @emph{several} reductions are
	8208	possible, the lookahead is required to select the action. State 8 is
	8209	one such state: if the lookahead is @samp{*} or @samp{/} then the action
	8210	is shifting, otherwise the action is reducing rule 1. In other words,
	8211	the first two items, corresponding to rule 1, are not eligible when the
	8212	lookahead token is @samp{*}, since we specified that @samp{*} has higher
	8213	precedence than @samp{+}. More generally, some items are eligible only
	8214	with some set of possible lookahead tokens. When run with
	8215	@option{--report=lookahead}, Bison specifies these lookahead tokens:
	8216
	8217	@example
	8218	state 8
	8219
	8220	1 exp: exp . '+' exp
	8221	1 | exp '+' exp . [$end, '+', '-', '/']
	8222	2 | exp . '-' exp
	8223	3 | exp . '*' exp
	8224	4 | exp . '/' exp
	8225
	8226	'*' shift, and go to state 6
	8227	'/' shift, and go to state 7
	8228
	8229	'/' [reduce using rule 1 (exp)]
	8230	$default reduce using rule 1 (exp)
	8231	@end example
	8232
	8233	Note however that while @samp{NUM + NUM / NUM} is ambiguous (which results in
	8234	the conflicts on @samp{/}), @samp{NUM + NUM * NUM} is not: the conflict was
	8235	solved thanks to associativity and precedence directives. If invoked with
	8236	@option{--report=solved}, Bison includes information about the solved
	8237	conflicts in the report:
	8238
	8239	@example
	8240	Conflict between rule 1 and token '+' resolved as reduce (%left '+').
	8241	Conflict between rule 1 and token '-' resolved as reduce (%left '-').
	8242	Conflict between rule 1 and token '*' resolved as shift ('+' < '*').
	8243	@end example
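
The decisions reported above can be modeled by a small C function.
This is only an illustrative sketch, not Bison's implementation: the
names @code{resolve}, @code{struct prec}, and the enumerators are
hypothetical, and the sketch assumes that a rule takes the precedence
of its last terminal and that a level of 0 stands for ``no precedence
declared''.

```c
/* Illustrative model only; these names are not part of Bison.  */
enum assoc { ASSOC_NONE, ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONASSOC };
enum action { ACT_SHIFT, ACT_REDUCE, ACT_ERROR, ACT_UNRESOLVED };

struct prec
{
  int level;          /* 0 means no precedence was declared */
  enum assoc assoc;   /* associativity declared with the level */
};

/* Decide what to do when RULE could be reduced and TOKEN could be
   shifted in the same state.  */
static enum action
resolve (struct prec rule, struct prec token)
{
  /* Without a precedence on both sides nothing can be decided: the
     conflict is reported, and the reduction is the discarded action.  */
  if (rule.level == 0 || token.level == 0)
    return ACT_UNRESOLVED;
  if (token.level > rule.level)
    return ACT_SHIFT;         /* e.g. rule 1 ('+') against '*' */
  if (token.level < rule.level)
    return ACT_REDUCE;        /* e.g. rule 3 ('*') against '+' */
  /* Same level: associativity breaks the tie.  */
  switch (token.assoc)
    {
    case ASSOC_LEFT:
      return ACT_REDUCE;      /* e.g. rule 1 ('+') against '-' */
    case ASSOC_RIGHT:
      return ACT_SHIFT;
    case ASSOC_NONASSOC:
      return ACT_ERROR;
    default:
      return ACT_UNRESOLVED;
    }
}
```

With @samp{+} and @samp{-} at level 1, @samp{*} at level 2, and
@samp{/} at level 0, this model yields the three resolutions listed
above and leaves the conflict on @samp{/} unresolved.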
	8244
	8245
	8246	The remaining states are similar:
	8247
	8248	@example
	8249	@group
	8250	state 9
	8251
	8252	1 exp: exp . '+' exp
	8253	2 | exp . '-' exp
	8254	2 | exp '-' exp .
	8255	3 | exp . '*' exp
	8256	4 | exp . '/' exp
	8257
	8258	'*' shift, and go to state 6
	8259	'/' shift, and go to state 7
	8260
	8261	'/' [reduce using rule 2 (exp)]
	8262	$default reduce using rule 2 (exp)
	8263	@end group
	8264
	8265	@group
	8266	state 10
	8267
	8268	1 exp: exp . '+' exp
	8269	2 | exp . '-' exp
	8270	3 | exp . '*' exp
	8271	3 | exp '*' exp .
	8272	4 | exp . '/' exp
	8273
	8274	'/' shift, and go to state 7
	8275
	8276	'/' [reduce using rule 3 (exp)]
	8277	$default reduce using rule 3 (exp)
	8278	@end group
	8279
	8280	@group
	8281	state 11
	8282
	8283	1 exp: exp . '+' exp
	8284	2 | exp . '-' exp
	8285	3 | exp . '*' exp
	8286	4 | exp . '/' exp
	8287	4 | exp '/' exp .
	8288
	8289	'+' shift, and go to state 4
	8290	'-' shift, and go to state 5
	8291	'*' shift, and go to state 6
	8292	'/' shift, and go to state 7
	8293
	8294	'+' [reduce using rule 4 (exp)]
	8295	'-' [reduce using rule 4 (exp)]
	8296	'*' [reduce using rule 4 (exp)]
	8297	'/' [reduce using rule 4 (exp)]
	8298	$default reduce using rule 4 (exp)
	8299	@end group
	8300	@end example
	8301
	8302	@noindent
	8303	Observe that state 11 contains conflicts not only due to the lack of
	8304	precedence of @samp{/} with respect to @samp{+}, @samp{-}, and
	8305	@samp{*}, but also because the
	8306	associativity of @samp{/} is not specified.
	8307
	8308
	8309	@node Tracing
	8310	@section Tracing Your Parser
	8311	@findex yydebug
	8312	@cindex debugging
	8313	@cindex tracing the parser
	8314
	8315	If a Bison grammar compiles properly but doesn't do what you want when it
	8316	runs, the @code{yydebug} parser-trace feature can help you figure out why.
	8317
	8318	There are several means to enable compilation of trace facilities:
	8319
	8320	@table @asis
	8321	@item the macro @code{YYDEBUG}
	8322	@findex YYDEBUG
	8323	Define the macro @code{YYDEBUG} to a nonzero value when you compile the
	8324	parser. This is compliant with POSIX Yacc. You could use
	8325	@samp{-DYYDEBUG=1} as a compiler option or you could put @samp{#define
	8326	YYDEBUG 1} in the prologue of the grammar file (@pxref{Prologue, , The
	8327	Prologue}).
	8328
	8329	@item the option @option{-t}, @option{--debug}
	8330	Use the @samp{-t} option when you run Bison (@pxref{Invocation,
	8331	,Invoking Bison}). This is POSIX compliant too.
	8332
	8333	@item the directive @samp{%debug}
	8334	@findex %debug
	8335	Add the @code{%debug} directive (@pxref{Decl Summary, ,Bison
	8336	Declaration Summary}). This is a Bison extension, which will prove
	8337	useful when Bison outputs parsers for languages that don't use a
	8338	preprocessor. Unless POSIX and Yacc portability matter to
	8339	you, this is
	8340	the preferred solution.
	8341	@end table
	8342
	8343	We suggest that you always enable the debug option so that debugging is
	8344	always possible.
	8345
	8346	The trace facility outputs messages with macro calls of the form
	8347	@code{YYFPRINTF (stderr, @var{format}, @var{args})} where
	8348	@var{format} and @var{args} are the usual @code{printf} format and variadic
	8349	arguments. If you define @code{YYDEBUG} to a nonzero value but do not
	8350	define @code{YYFPRINTF}, @code{<stdio.h>} is automatically included
	8351	and @code{YYFPRINTF} is defined to @code{fprintf}.
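
As a rough illustration, a program could supply its own
@code{YYFPRINTF} to send trace messages somewhere other than
@code{stderr}. The sketch below keeps the most recent message in a
buffer; @code{trace_printf} and @code{last_trace} are hypothetical
names, and the definitions are assumed to appear before the parser
code that uses the macro (in the prologue, for instance).

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical sink: holds the most recent trace message.  */
static char last_trace[256];

/* Takes the same arguments as fprintf, but ignores STREAM and
   formats the message into last_trace instead.  Returns what
   vsnprintf returns.  */
static int
trace_printf (FILE *stream, const char *format, ...)
{
  va_list args;
  int n;

  (void) stream;
  va_start (args, format);
  n = vsnprintf (last_trace, sizeof last_trace, format, args);
  va_end (args);
  return n;
}

/* The trace facility now calls trace_printf instead of fprintf.  */
#define YYFPRINTF trace_printf
```

Any replacement has to accept the same arguments as @code{fprintf},
since the parser passes @code{stderr} as the first argument in every
call.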
	8352
	8353	Once you have compiled the program with trace facilities, the way to
	8354	request a trace is to store a nonzero value in the variable @code{yydebug}.
	8355	You can do this by making the C code do it (in @code{main}, perhaps), or
	8356	you can alter the value with a C debugger.
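
One possible way to set @code{yydebug} from C code is sketched below.
The helper names and the environment variable @code{PARSER_TRACE} are
made up for this example, and @code{yydebug} is defined locally only
so that the fragment compiles on its own; a real program would use the
variable provided by the generated parser.

```c
#include <stdlib.h>

/* Stand-in for the variable defined by the generated parser; a real
   program would write `extern int yydebug;' instead.  */
int yydebug = 0;

/* Hypothetical helper: return 1 if VALUE asks for a trace, that is,
   if it is a non-empty string other than "0"; return 0 otherwise.  */
static int
trace_requested (const char *value)
{
  if (value == NULL || value[0] == '\0')
    return 0;
  return !(value[0] == '0' && value[1] == '\0');
}

/* Hypothetical helper: enable tracing when the made-up environment
   variable PARSER_TRACE requests it.  */
static void
enable_trace_from_env (void)
{
  yydebug = trace_requested (getenv ("PARSER_TRACE"));
}
```

Calling @code{enable_trace_from_env ()} in @code{main} before
@code{yyparse ()} would then let you switch tracing on and off without
recompiling.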
	8357
	8358	Each step taken by the parser when @code{yydebug} is nonzero produces a
	8359	line or two of trace information, written on @code{stderr}. The trace
	8360	messages tell you these things:
	8361
	8362	@itemize @bullet
	8363	@item
	8364	Each time the parser calls @code{yylex}, what kind of token was read.
	8365
	8366	@item
	8367	Each time a token is shifted, the depth and complete contents of the
	8368	state stack (@pxref{Parser States}).
	8369
	8370	@item
	8371	Each time a rule is reduced, which rule it is, and the complete contents
	8372	of the state stack afterward.
	8373	@end itemize
	8374
	8375	To make sense of this information, it helps to refer to the listing file
	8376	produced by the Bison @samp{-v} option (@pxref{Invocation, ,Invoking
	8377	Bison}). This file shows the meaning of each state in terms of
	8378	positions in various rules, and also what each state will do with each
	8379	possible input token. As you read the successive trace messages, you
	8380	can see that the parser is functioning according to its specification in
	8381	the listing file. Eventually you will arrive at the place where
	8382	something undesirable happens, and you will see which parts of the
	8383	grammar are to blame.
	8384
	8385	The parser implementation file is a C program and you can use C
	8386	debuggers on it, but it's not easy to interpret what it is doing. The
	8387	parser function is a finite-state machine interpreter, and aside from
	8388	the actions it executes the same code over and over. Only the values
	8389	of variables show where in the grammar it is working.
	8390
	8391	@findex YYPRINT
	8392	The debugging information normally gives the token type of each token
	8393	read, but not its semantic value. You can optionally define a macro
	8394	named @code{YYPRINT} to provide a way to print the value. If you define
	8395	@code{YYPRINT}, it should take three arguments. The parser will pass a
	8396	standard I/O stream, the numeric code for the token type, and the token
	8397	value (from @code{yylval}).
	8398
	8399	Here is an example of @code{YYPRINT} suitable for the multi-function
	8400	calculator (@pxref{Mfcalc Declarations, ,Declarations for @code{mfcalc}}):
	8401
	8402	@example
	8403	%@{
	8404	static void print_token_value (FILE *, int, YYSTYPE);
	8405	#define YYPRINT(file, type, value) \
	8406	print_token_value (file, type, value)
	8407	%@}
	8408
	8409	@dots{} %% @dots{} %% @dots{}
	8410
	8411	static void
	8412	print_token_value (FILE *file, int type, YYSTYPE value)
	8413	@{
	8414	if (type == VAR)
	8415	fprintf (file, "%s", value.tptr->name);
	8416	else if (type == NUM)
	8417	fprintf (file, "%g", value.val);
	8418	@}
	8419	@end example
	8420
	8421	@c ================================================= Invoking Bison
	8422
	8423	@node Invocation
	8424	@chapter Invoking Bison
	8425	@cindex invoking Bison
	8426	@cindex Bison invocation
	8427	@cindex options for invoking Bison
	8428
	8429	The usual way to invoke Bison is as follows:
	8430
	8431	@example
	8432	bison @var{infile}
	8433	@end example
	8434
	8435	Here @var{infile} is the grammar file name, which usually ends in
	8436	@samp{.y}. The parser implementation file's name is made by replacing
	8437	the @samp{.y} with @samp{.tab.c} and removing any leading directory.
	8438	Thus, @samp{bison foo.y} yields @file{foo.tab.c}, and
	8439	@samp{bison hack/foo.y} also yields @file{foo.tab.c}. If you are
	8440	writing C++ code instead of C in your grammar file, you can also
	8441	name it @file{foo.ypp} or @file{foo.y++}. The output files then
	8442	take an extension modeled on that of the input file
	8443	(respectively @file{foo.tab.cpp} and @file{foo.tab.c++}). This
	8444	feature takes effect with all options that manipulate file names like
	8445	@samp{-o} or @samp{-d}.
	8446
	8447	For example:
	8448
	8449	@example
	8450	bison -d @var{infile.yxx}
	8451	@end example
	8452	@noindent
	8453	will produce @file{infile.tab.cxx} and @file{infile.tab.hxx}, and
	8454
	8455	@example
	8456	bison -d -o @var{output.c++} @var{infile.y}
	8457	@end example
	8458	@noindent
	8459	will produce @file{output.c++} and @file{output.h++}.
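
The default naming rule for the parser implementation file can be
sketched in C as follows. This is an illustration of the rule as
described in this section, not code taken from Bison;
@code{default_parser_name} is a hypothetical name, and the sketch only
handles input names whose extension starts with @samp{.y}.

```c
#include <stdio.h>
#include <string.h>

/* Derive the default parser implementation file name from INFILE and
   store it in OUT (of size OUTSIZE).  Return 0 on success, or -1 if
   INFILE has no extension starting with ".y" or OUT is too small.  */
static int
default_parser_name (const char *infile, char *out, size_t outsize)
{
  const char *base = strrchr (infile, '/');
  const char *dot;
  int n;

  base = base ? base + 1 : infile;    /* remove any leading directory */
  dot = strrchr (base, '.');
  if (dot == NULL || dot[1] != 'y')
    return -1;

  /* ".y" becomes ".tab.c", ".ypp" becomes ".tab.cpp",
     ".y++" becomes ".tab.c++", and so on.  */
  n = snprintf (out, outsize, "%.*s.tab.c%s",
                (int) (dot - base), base, dot + 2);
  return (n < 0 || (size_t) n >= outsize) ? -1 : 0;
}
```

For instance, both @file{foo.y} and @file{hack/foo.y} map to
@file{foo.tab.c}, and @file{infile.yxx} maps to @file{infile.tab.cxx}.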
	8460
	8461	For compatibility with POSIX, the standard Bison
	8462	distribution also contains a shell script called @command{yacc} that
	8463	invokes Bison with the @option{-y} option.
	8464
	8465	@menu
	8466	* Bison Options:: All the options described in detail,
	8467	in alphabetical order by short options.
	8468	* Option Cross Key:: Alphabetical list of long options.
	8469	* Yacc Library:: Yacc-compatible @code{yyerror} and @code{main}.
	8470	@end menu
	8471
	8472	@node Bison Options
	8473	@section Bison Options
	8474
	8475	Bison supports both traditional single-letter options and mnemonic long
	8476	option names. Long option names are indicated with @samp{--} instead of
	8477	@samp{-}. Abbreviations for option names are allowed as long as they
	8478	are unique. When a long option takes an argument, like
	8479	@samp{--file-prefix}, connect the option name and the argument with
	8480	@samp{=}.
	8481
	8482	Here is a list of options that can be used with Bison, alphabetized by
	8483	short option. It is followed by a cross key alphabetized by long
	8484	option.
	8485
	8486	@c Please, keep this ordered as in `bison --help'.
	8487	@noindent
	8488	Operation modes:
	8489	@table @option
	8490	@item -h
	8491	@itemx --help
	8492	Print a summary of the command-line options to Bison and exit.
	8493
	8494	@item -V
	8495	@itemx --version
	8496	Print the version number of Bison and exit.
	8497
	8498	@item --print-localedir
	8499	Print the name of the directory containing locale-dependent data.
	8500
	8501	@item --print-datadir
	8502	Print the name of the directory containing skeletons and XSLT.
	8503
	8504	@item -y
	8505	@itemx --yacc
	8506	Act more like the traditional Yacc command. This can cause different
	8507	diagnostics to be generated, and may change behavior in other minor
	8508	ways. Most importantly, imitate Yacc's output file name conventions,
	8509	so that the parser implementation file is called @file{y.tab.c}, and
	8510	the other outputs are called @file{y.output} and @file{y.tab.h}.
	8511	Also, if generating a deterministic parser in C, generate
	8512	@code{#define} statements in addition to an @code{enum} to associate
	8513	token numbers with token names. Thus, the following shell script can
	8514	substitute for Yacc, and the Bison distribution contains such a script
	8515	for compatibility with POSIX:
	8516
	8517	@example
	8518	#! /bin/sh
	8519	bison -y "$@@"
	8520	@end example
	8521
	8522	The @option{-y}/@option{--yacc} option is intended for use with
	8523	traditional Yacc grammars. If your grammar uses a Bison extension
	8524	like @samp{%glr-parser}, Bison might not be Yacc-compatible even if
	8525	this option is specified.
	8526
	8527	@item -W [@var{category}]
	8528	@itemx --warnings[=@var{category}]
	8529	Output warnings falling in @var{category}. @var{category} can be one
	8530	of:
	8531	@table @code
	8532	@item midrule-values
	8533	Warn about mid-rule values that are set but not used within any of the actions
	8534	of the parent rule.
	8535	For example, warn about unused @code{$2} in:
	8536
	8537	@example
	8538	exp: '1' @{ $$ = 1; @} '+' exp @{ $$ = $1 + $4; @};
	8539	@end example
	8540
	8541	Also warn about mid-rule values that are used but not set.
	8542	For example, warn about unset @code{$$} in the mid-rule action in:
	8543
	8544	@example
	8545	exp: '1' @{ $1 = 1; @} '+' exp @{ $$ = $2 + $4; @};
	8546	@end example
	8547
	8548	These warnings are not enabled by default since they sometimes prove to
	8549	be false alarms in existing grammars employing the Yacc constructs
	8550	@code{$0} or @code{$-@var{n}} (where @var{n} is some positive integer).
	8551
	8552	@item yacc
	8553	Incompatibilities with POSIX Yacc.
	8554
	8555	@item conflicts-sr
	8556	@itemx conflicts-rr
	8557	S/R and R/R conflicts. These warnings are enabled by default. However, if
	8558	the @code{%expect} or @code{%expect-rr} directive is specified, an
	8559	unexpected number of conflicts is an error, and an expected number of
	8560	conflicts is not reported, so @option{-W} and @option{--warnings} then have
	8561	no effect on the conflict report.
	8562
	8563	@item other
	8564	All warnings not categorized above. These warnings are enabled by default.
	8565
	8566	This category is provided merely for the sake of completeness. Future
	8567	releases of Bison may move warnings from this category to new, more specific
	8568	categories.
	8569
	8570	@item all
	8571	All the warnings.
	8572	@item none
	8573	Turn off all the warnings.
	8574	@item error
	8575	Treat warnings as errors.
	8576	@end table
	8577
	8578	A category can be turned off by prefixing its name with @samp{no-}. For
	8579	instance, @option{-Wno-yacc} will hide the warnings about
	8580	POSIX Yacc incompatibilities.
	8581	@end table
	8582
	8583	@noindent
	8584	Tuning the parser:
	8585
	8586	@table @option
	8587	@item -t
	8588	@itemx --debug
	8589	In the parser implementation file, define the macro @code{YYDEBUG} to
	8590	1 if it is not already defined, so that the debugging facilities are
	8591	compiled. @xref{Tracing, ,Tracing Your Parser}.
	8592
	8593	@item -D @var{name}[=@var{value}]
	8594	@itemx --define=@var{name}[=@var{value}]
	8595	@itemx -F @var{name}[=@var{value}]
	8596	@itemx --force-define=@var{name}[=@var{value}]
	8597	Each of these is equivalent to @samp{%define @var{name} "@var{value}"}
	8598	(@pxref{%define Summary}) except that Bison processes multiple
	8599	definitions for the same @var{name} as follows:
	8600
	8601	@itemize
	8602	@item
	8603	Bison quietly ignores all command-line definitions for @var{name} except
	8604	the last.
	8605	@item
	8606	If that command-line definition is specified by a @code{-D} or
	8607	@code{--define}, Bison reports an error for any @code{%define}
	8608	definition for @var{name}.
	8609	@item
	8610	If that command-line definition is specified by a @code{-F} or
	8611	@code{--force-define} instead, Bison quietly ignores all @code{%define}
	8612	definitions for @var{name}.
	8613	@item
	8614	Otherwise, Bison reports an error if there are multiple @code{%define}
	8615	definitions for @var{name}.
	8616	@end itemize
	8617
	8618	You should avoid using @code{-F} and @code{--force-define} in your
	8619	make files unless you are confident that it is safe to quietly ignore
	8620	any conflicting @code{%define} that may be added to the grammar file.
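
The rules above can be summarized as a small decision function. The
following C sketch is a model for illustration only, not Bison's
implementation; all of its names are hypothetical, and the
@code{UNDEFINED} outcome for a variable that is defined nowhere is an
assumption added to make the function total.

```c
/* Kind of the last command-line definition for a given name;
   earlier command-line definitions are quietly ignored.  */
enum cmdline_kind { CMD_NONE, CMD_DEFINE, CMD_FORCE_DEFINE };

enum outcome { USE_COMMAND_LINE, USE_GRAMMAR, UNDEFINED, REPORT_ERROR };

/* GRAMMAR_DEFS is the number of %define definitions for the same
   name in the grammar file.  */
static enum outcome
resolve_define (enum cmdline_kind last_cmdline, int grammar_defs)
{
  switch (last_cmdline)
    {
    case CMD_DEFINE:
      /* -D/--define: any %define for the name is an error.  */
      return grammar_defs > 0 ? REPORT_ERROR : USE_COMMAND_LINE;
    case CMD_FORCE_DEFINE:
      /* -F/--force-define: %define definitions are ignored.  */
      return USE_COMMAND_LINE;
    default:
      /* No command-line definition: multiple %define is an error.  */
      if (grammar_defs > 1)
        return REPORT_ERROR;
      return grammar_defs == 1 ? USE_GRAMMAR : UNDEFINED;
    }
}
```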
	8621
	8622	@item -L @var{language}
	8623	@itemx --language=@var{language}
	8624	Specify the programming language for the generated parser, as if
	8625	@code{%language} was specified (@pxref{Decl Summary, , Bison Declaration
	8626	Summary}). Currently supported languages include C, C++, and Java.
	8627	@var{language} is case-insensitive.
	8628
	8629	This option is experimental and its effect may be modified in future
	8630	releases.
	8631
	8632	@item --locations
	8633	Pretend that @code{%locations} was specified. @xref{Decl Summary}.
	8634
	8635	@item -p @var{prefix}
	8636	@itemx --name-prefix=@var{prefix}
	8637	Pretend that @code{%name-prefix "@var{prefix}"} was specified.
	8638	@xref{Decl Summary}.
	8639
	8640	@item -l
	8641	@itemx --no-lines
	8642	Don't put any @code{#line} preprocessor commands in the parser
	8643	implementation file. Ordinarily Bison puts them in the parser
	8644	implementation file so that the C compiler and debuggers will
	8645	associate errors with your source file, the grammar file. This option
	8646	causes them to associate errors with the parser implementation file,
	8647	treating it as an independent source file in its own right.
	8648
	8649	@item -S @var{file}
	8650	@itemx --skeleton=@var{file}
	8651	Specify the skeleton to use, similar to @code{%skeleton}
	8652	(@pxref{Decl Summary, , Bison Declaration Summary}).
	8653
	8654	@c You probably don't need this option unless you are developing Bison.
	8655	@c You should use @option{--language} if you want to specify the skeleton for a
	8656	@c different language, because it is clearer and because it will always
	8657	@c choose the correct skeleton for non-deterministic or push parsers.
	8658
	8659	If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton
	8660	file in the Bison installation directory.
	8661	If it does, @var{file} is an absolute file name or a file name relative to the
	8662	current working directory.
	8663	This is similar to how most shells resolve commands.
	8664
	8665	@item -k
	8666	@itemx --token-table
	8667	Pretend that @code{%token-table} was specified. @xref{Decl Summary}.
	8668	@end table
	8669
	8670	@noindent
	8671	Adjust the output:
	8672
	8673	@table @option
	8674	@item --defines[=@var{file}]
	8675	Pretend that @code{%defines} was specified, i.e., write an extra output
	8676	file containing macro definitions for the token type names defined in
	8677	the grammar, as well as a few other declarations. @xref{Decl Summary}.
	8678
	8679	@item -d
	8680	This is the same as @code{--defines} except @code{-d} does not accept a
	8681	@var{file} argument since POSIX Yacc requires that @code{-d} can be bundled
	8682	with other short options.
	8683
	8684	@item -b @var{file-prefix}
	8685	@itemx --file-prefix=@var{prefix}
	8686	Pretend that @code{%file-prefix} was specified, i.e., specify the prefix to use
	8687	for all Bison output file names. @xref{Decl Summary}.
	8688
	8689	@item -r @var{things}
	8690	@itemx --report=@var{things}
	8691	Write an extra output file containing a verbose description of the
	8692	comma-separated list of @var{things} among:
	8693
	8694	@table @code
	8695	@item state
	8696	Description of the grammar, conflicts (resolved and unresolved), and
	8697	the parser's automaton.
	8698
	8699	@item lookahead
	8700	Implies @code{state} and augments the description of the automaton with
	8701	each rule's lookahead set.
	8702
	8703	@item itemset
	8704	Implies @code{state} and augments the description of the automaton with
	8705	the full set of items for each state, instead of its core only.
	8706	@end table
	8707
	8708	@item --report-file=@var{file}
	8709	Specify the @var{file} for the verbose description.
	8710
	8711	@item -v
	8712	@itemx --verbose
	8713	Pretend that @code{%verbose} was specified, i.e., write an extra output
	8714	file containing verbose descriptions of the grammar and
	8715	parser. @xref{Decl Summary}.
	8716
	8717	@item -o @var{file}
	8718	@itemx --output=@var{file}
	8719	Specify the @var{file} for the parser implementation file.
	8720
	8721	The other output files' names are constructed from @var{file} as
	8722	described under the @samp{-v} and @samp{-d} options.
	8723
	8724	@item -g [@var{file}]
	8725	@itemx --graph[=@var{file}]
	8726	Output a graphical representation of the parser's
	8727	automaton computed by Bison, in @uref{http://www.graphviz.org/, Graphviz}
	8728	@uref{http://www.graphviz.org/doc/info/lang.html, DOT} format.
	8729	@code{@var{file}} is optional.
	8730	If omitted and the grammar file is @file{foo.y}, the output file will be
	8731	@file{foo.dot}.
	8732
	8733	@item -x [@var{file}]
	8734	@itemx --xml[=@var{file}]
	8735	Output an XML report of the parser's automaton computed by Bison.
	8736	@code{@var{file}} is optional.
	8737	If omitted and the grammar file is @file{foo.y}, the output file will be
	8738	@file{foo.xml}.
	8739	(The current XML schema is experimental and may evolve.
	8740	More user feedback will help to stabilize it.)
	8741	@end table
	8742
	8743	@node Option Cross Key
	8744	@section Option Cross Key
	8745
	8746	Here is a list of options, alphabetized by long option, to help you find
	8747	the corresponding short option and directive.
	8748
	8749	@multitable {@option{--force-define=@var{name}[=@var{value}]}} {@option{-F @var{name}[=@var{value}]}} {@code{%nondeterministic-parser}}
	8750	@headitem Long Option @tab Short Option @tab Bison Directive
	8751	@include cross-options.texi
	8752	@end multitable
	8753
	8754	@node Yacc Library
	8755	@section Yacc Library
	8756
	8757	The Yacc library contains default implementations of the
	8758	@code{yyerror} and @code{main} functions. These default
	8759	implementations are normally not useful, but POSIX requires
	8760	them. To use the Yacc library, link your program with the
	8761	@option{-ly} option. Note that Bison's implementation of the Yacc
	8762	library is distributed under the terms of the GNU General
	8763	Public License (@pxref{Copying}).
	8764
	8765	If you use the Yacc library's @code{yyerror} function, you should
	8766	declare @code{yyerror} as follows:
	8767
	8768	@example
	8769	int yyerror (char const *);
	8770	@end example
	8771
	8772	Bison ignores the @code{int} value returned by this @code{yyerror}.
	8773	If you use the Yacc library's @code{main} function, your
	8774	@code{yyparse} function should have the following type signature:
	8775
	8776	@example
	8777	int yyparse (void);
	8778	@end example
	8779
	8780	@c ================================================= C++ Bison
	8781
	8782	@node Other Languages
	8783	@chapter Parsers Written In Other Languages
	8784
	8785	@menu
	8786	* C++ Parsers:: The interface to generate C++ parser classes
	8787	* Java Parsers:: The interface to generate Java parser classes
	8788	@end menu
	8789
	8790	@node C++ Parsers
	8791	@section C++ Parsers
	8792
	8793	@menu
	8794	* C++ Bison Interface:: Asking for C++ parser generation
	8795	* C++ Semantic Values:: %union vs. C++
	8796	* C++ Location Values:: The position and location classes
	8797	* C++ Parser Interface:: Instantiating and running the parser
	8798	* C++ Scanner Interface:: Exchanges between yylex and parse
	8799	* A Complete C++ Example:: Demonstrating their use
	8800	@end menu
	8801
	8802	@node C++ Bison Interface
	8803	@subsection C++ Bison Interface
	8804	@c - %skeleton "lalr1.cc"
	8805	@c - Always pure
	8806	@c - initial action
	8807
	8808	The C++ deterministic parser is selected using the skeleton directive,
	8809	@samp{%skeleton "lalr1.cc"}, or the synonymous command-line option
	8810	@option{--skeleton=lalr1.cc}.
	8811	@xref{Decl Summary}.
	8812
	8813	When run, @command{bison} will create several entities in the @samp{yy}
	8814	namespace.
	8815	@findex %define namespace
	8816	Use the @samp{%define namespace} directive to change the namespace
	8817	name, see @ref{%define Summary,,namespace}. The various classes are
	8818	generated in the following files:
	8819
	8820	@table @file
	8821	@item position.hh
	8822	@itemx location.hh
	8823	The definition of the classes @code{position} and @code{location},
	8824	used for location tracking. @xref{C++ Location Values}.
	8825
	8826	@item stack.hh
	8827	An auxiliary class @code{stack} used by the parser.
	8828
	8829	@item @var{file}.hh
	8830	@itemx @var{file}.cc
	8831	(Assuming the extension of the grammar file was @samp{.yy}.) The
	8832	declaration and implementation of the C++ parser class. The basename
	8833	and extension of these two files follow the same rules as with regular C
	8834	parsers (@pxref{Invocation}).
	8835
	8836	The header is @emph{mandatory}; you must either pass
	8837	@option{-d}/@option{--defines} to @command{bison}, or use the
	8838	@samp{%defines} directive.
	8839	@end table
	8840
	8841	All these files are documented using Doxygen; run @command{doxygen}
	8842	to generate complete and accurate documentation.
	8843
	8844	@node C++ Semantic Values
	8845	@subsection C++ Semantic Values
	8846	@c - No objects in unions
	8847	@c - YYSTYPE
	8848	@c - Printer and destructor
	8849
	8850	The @code{%union} directive works as for C, see @ref{Union Decl, ,The
	8851	Collection of Value Types}. In particular it produces a genuine
	8852	@code{union}@footnote{In the future, techniques to allow complex types
	8853	within pseudo-unions (similar to Boost variants) might be implemented to
	8854	alleviate these issues.}, which has a few specific features in C++.
	8855	@itemize @minus
	8856	@item
	8857	The type @code{YYSTYPE} is defined but its use is discouraged: rather
	8858	you should refer to the parser's encapsulated type
	8859	@code{yy::parser::semantic_type}.
	8860	@item
	8861	Non-POD (Plain Old Data) types cannot be used. C++ forbids any
	8862	instance of classes with constructors in unions: only @emph{pointers}
	8863	to such objects are allowed.
	8864	@end itemize
	8865
	8866	Because objects have to be stored via pointers, memory is not
	8867	reclaimed automatically: using the @code{%destructor} directive is the
	8868	only means to avoid leaks. @xref{Destructor Decl, , Freeing Discarded
	8869	Symbols}.
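
The constraint can be illustrated outside of Bison with an ordinary C++
union. The following is only a sketch: the names @code{value},
@code{make_identifier} and @code{destroy_identifier} are hypothetical and
are not part of any generated parser. It shows why a string must be stored
through a pointer, and the kind of cleanup that a @code{%destructor} has to
perform.

```cpp
#include <string>

// Hypothetical stand-in for the union produced by %union: it holds an
// int or a pointer to std::string, never a std::string object itself.
union value
{
  int ival;
  std::string *sval;
};

// Allocate a string on the heap and store its address in the union.
value
make_identifier (const char *text)
{
  value v;
  v.sval = new std::string (text);
  return v;
}

// What a '%destructor { delete $$; }' would do for a discarded symbol.
void
destroy_identifier (value &v)
{
  delete v.sval;
  v.sval = nullptr;
}
```

Nothing releases the string unless @code{destroy_identifier} (or an
equivalent @code{delete}) is called, which is why the manual insists on
@code{%destructor}.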
	8870
	8871
	8872	@node C++ Location Values
	8873	@subsection C++ Location Values
	8874	@c - %locations
	8875	@c - class Position
	8876	@c - class Location
	8877	@c - %define filename_type "const symbol::Symbol"
	8878
	8879	When the directive @code{%locations} is used, the C++ parser supports
	8880	location tracking, see @ref{Tracking Locations}. Two auxiliary classes
	8881	define a @code{position}, a single point in a file, and a @code{location}, a
	8882	range composed of a pair of @code{position}s (possibly spanning several
	8883	files).
	8884
	8885	@deftypemethod {position} {std::string*} filename
	8886	The name of the file. It is always handled as a pointer: the
	8887	parser never duplicates nor deallocates it. As an experimental
	8888	feature you may change it to @samp{@var{type}*} using @samp{%define
	8889	filename_type "@var{type}"}.
	8890	@end deftypemethod
	8891
	8892	@deftypemethod {position} {unsigned int} line
	8893	The line, starting at 1.
	8894	@end deftypemethod
	8895
	8896	@deftypemethod {position} {unsigned int} lines (int @var{height} = 1)
	8897	Advance by @var{height} lines, resetting the column number.
	8898	@end deftypemethod
	8899
	8900	@deftypemethod {position} {unsigned int} column
	8901	The column, starting at 0.
	8902	@end deftypemethod
	8903
	8904	@deftypemethod {position} {unsigned int} columns (int @var{width} = 1)
	8905	Advance by @var{width} columns, without changing the line number.
	8906	@end deftypemethod
	8907
	8908	@deftypemethod {position} {position&} operator+= (position& @var{pos}, int @var{width})
	8909	@deftypemethodx {position} {position} operator+ (const position& @var{pos}, int @var{width})
	8910	@deftypemethodx {position} {position&} operator-= (position& @var{pos}, int @var{width})
	8911	@deftypemethodx {position} {position} operator- (const position& @var{pos}, int @var{width})
	8912	Various forms of syntactic sugar for @code{columns}.
	8913	@end deftypemethod
	8914
	8915	@deftypemethod {position} {std::ostream&} operator<< (std::ostream& @var{o}, const position& @var{p})
	8916	Report @var{p} on @var{o} like this:
	8917	@samp{@var{file}:@var{line}.@var{column}}, or
	8918	@samp{@var{line}.@var{column}} if @var{file} is null.
	8919	@end deftypemethod
	8920
	8921	@deftypemethod {location} {position} begin
	8922	@deftypemethodx {location} {position} end
	8923	The first, inclusive, position of the range, and the first beyond.
	8924	@end deftypemethod
	8925
	8926	@deftypemethod {location} {unsigned int} columns (int @var{width} = 1)
	8927	@deftypemethodx {location} {unsigned int} lines (int @var{height} = 1)
	8928	Advance the @code{end} position.
	8929	@end deftypemethod
	8930
	8931	@deftypemethod {location} {location} operator+ (const location& @var{begin}, const location& @var{end})
	8932	@deftypemethodx {location} {location} operator+ (const location& @var{begin}, int @var{width})
	8933	@deftypemethodx {location} {location&} operator+= (location& @var{loc}, int @var{width})
	8934	Various forms of syntactic sugar.
	8935	@end deftypemethod
	8936
	8937	@deftypemethod {location} {void} step ()
	8938	Move @code{begin} onto @code{end}.
	8939	@end deftypemethod
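
The interplay of @code{columns}, @code{lines} and @code{step} can be
sketched with a simplified mock. The classes @code{mock_position} and
@code{mock_location} below are assumptions made for illustration; they only
imitate the members documented above and are not the classes that Bison
generates.

```cpp
#include <string>

// Simplified, hypothetical mock of the classes described above; the
// real ones are generated by Bison in position.hh and location.hh.
struct mock_position
{
  const std::string *filename = nullptr;
  unsigned line = 1;    // lines start at 1
  unsigned column = 0;  // columns start at 0, as documented above

  // Advance by HEIGHT lines, resetting the column number.
  void lines (int height = 1) { line += height; column = 0; }
  // Advance by WIDTH columns, without changing the line number.
  void columns (int width = 1) { column += width; }
};

struct mock_location
{
  mock_position begin;  // first position of the range (inclusive)
  mock_position end;    // first position beyond the range

  void columns (int width = 1) { end.columns (width); }
  void lines (int height = 1) { end.lines (height); }
  // Move begin onto end: start a new, empty range.
  void step () { begin = end; }
};

// Track the location of the token "two" in the input "one two".
mock_location
locate_second_word ()
{
  mock_location loc;
  loc.columns (3);  // "one"
  loc.step ();
  loc.columns (1);  // the blank
  loc.step ();      // ignore it
  loc.columns (3);  // "two"
  return loc;
}
```

After the call, the range covers columns 4 (inclusive) to 7 (exclusive) of
line 1, which is the pattern a scanner follows: advance @code{end} for each
match, then call @code{step} before the next token.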
	8940
	8941
	8942	@node C++ Parser Interface
	8943	@subsection C++ Parser Interface
	8944	@c - define parser_class_name
	8945	@c - Ctor
	8946	@c - parse, error, set_debug_level, debug_level, set_debug_stream,
	8947	@c debug_stream.
	8948	@c - Reporting errors
	8949
	8950	The output files @file{@var{output}.hh} and @file{@var{output}.cc}
	8951	declare and define the parser class in the namespace @code{yy}. The
	8952	class name defaults to @code{parser}, but may be changed using
	8953	@samp{%define parser_class_name "@var{name}"}. The interface of
	8954	this class is detailed below. It can be extended using the
	8955	@code{%parse-param} feature: its semantics is slightly changed since
	8956	it describes an additional member of the parser class, and an
	8957	additional argument for its constructor.
	8958
	8959	@defcv {Type} {parser} {semantic_type}
	8960	@defcvx {Type} {parser} {location_type}
	8961	The types for semantic values and locations.
	8962	@end defcv
	8963
	8964	@defcv {Type} {parser} {token}
	8965	A structure that contains (only) the @code{yytokentype} enumeration, which
	8966	defines the tokens. To refer to the token @code{FOO},
	8967	use @code{yy::parser::token::FOO}. The scanner can use
	8968	@samp{typedef yy::parser::token token;} to ``import'' the token enumeration
	8969	(@pxref{Calc++ Scanner}).
	8970	@end defcv
	8971
	8972	@deftypemethod {parser} {} parser (@var{type1} @var{arg1}, ...)
	8973	Build a new parser object. There are no arguments by default, unless
	8974	@samp{%parse-param @{@var{type1} @var{arg1}@}} was used.
	8975	@end deftypemethod
	8976
	8977	@deftypemethod {parser} {int} parse ()
	8978	Run the syntactic analysis, and return 0 on success, 1 otherwise.
	8979	@end deftypemethod
	8980
	8981	@deftypemethod {parser} {std::ostream&} debug_stream ()
	8982	@deftypemethodx {parser} {void} set_debug_stream (std::ostream& @var{o})
	8983	Get or set the stream used for tracing the parsing. It defaults to
	8984	@code{std::cerr}.
	8985	@end deftypemethod
	8986
	8987	@deftypemethod {parser} {debug_level_type} debug_level ()
	8988	@deftypemethodx {parser} {void} set_debug_level (debug_level_type @var{l})
	8989	Get or set the tracing level. Currently its value is either 0, no trace,
	8990	or nonzero, full tracing.
	8991	@end deftypemethod
	8992
	8993	@deftypemethod {parser} {void} error (const location_type& @var{l}, const std::string& @var{m})
	8994	The definition for this member function must be supplied by the user:
	8995	the parser uses it to report a parser error occurring at @var{l},
	8996	described by @var{m}.
	8997	@end deftypemethod
	8998
	8999
	9000	@node C++ Scanner Interface
	9001	@subsection C++ Scanner Interface
	9002	@c - prefix for yylex.
	9003	@c - Pure interface to yylex
	9004	@c - %lex-param
	9005
	9006	The parser invokes the scanner by calling @code{yylex}. Contrary to C
	9007	parsers, C++ parsers are always pure: there is no point in using the
	9008	@code{%define api.pure} directive. Therefore the interface is as follows.
	9009
	9010	@deftypemethod {parser} {int} yylex (semantic_type* @var{yylval}, location_type* @var{yylloc}, @var{type1} @var{arg1}, ...)
	9011	Return the next token. Its type is the return value, its semantic
	9012	value and location being @var{yylval} and @var{yylloc}. Invocations of
	9013	@samp{%lex-param @{@var{type1} @var{arg1}@}} yield additional arguments.
	9014	@end deftypemethod
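
As an illustration of this interface, here is a sketch of a hand-written
scanner. All the types (@code{semantic_type}, @code{location_type},
@code{token_type}, @code{input_state}) are hypothetical stand-ins defined
locally so that the example is self-contained; with a real parser, the first
three come from the generated parser class, and the last argument
corresponds to whatever @code{%lex-param} declares.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-ins for the types a generated parser provides.
union semantic_type { int ival; };
struct location_type { unsigned begin_column = 0; unsigned end_column = 0; };
enum token_type { END = 0, NUMBER = 258 };

// Hypothetical extra argument, as '%lex-param { input_state& in }' would add.
struct input_state
{
  std::string text;
  std::string::size_type pos = 0;
};

// Return the next token: its kind is the return value, its semantic
// value and location are stored through yylval and yylloc.
int
yylex (semantic_type *yylval, location_type *yylloc, input_state &in)
{
  // Skip blanks.
  while (in.pos < in.text.size ()
         && std::isspace (static_cast<unsigned char> (in.text[in.pos])))
    ++in.pos;
  yylloc->begin_column = yylloc->end_column = static_cast<unsigned> (in.pos);
  if (in.pos == in.text.size ())
    return END;

  unsigned char c = in.text[in.pos];
  if (std::isdigit (c))
    {
      int n = 0;
      while (in.pos < in.text.size ()
             && std::isdigit (static_cast<unsigned char> (in.text[in.pos])))
        n = n * 10 + (in.text[in.pos++] - '0');
      yylval->ival = n;
      yylloc->end_column = static_cast<unsigned> (in.pos);
      return NUMBER;
    }

  // Any other character is returned as its own token, e.g. '+'.
  ++in.pos;
  yylloc->end_column = static_cast<unsigned> (in.pos);
  return c;
}
```

No global variable is involved: everything the scanner reads or writes is
reachable from its arguments, which is what ``pure'' means here.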
	9015
	9016
	9017	@node A Complete C++ Example
	9018	@subsection A Complete C++ Example
	9019
	9020	This section demonstrates the use of a C++ parser with a simple but
	9021	complete example. This example should be available on your system,
	9022	ready to compile, in the directory @file{../bison/examples/calc++}. It
	9023	focuses on the use of Bison, therefore the design of the various C++
	9024	classes is very naive: no accessors, no encapsulation of members, etc.
	9025	We will use a Lex scanner, and more precisely, a Flex scanner, to
	9026	demonstrate the various interactions. A hand-written scanner is
	9027	actually easier to interface with.
	9028
	9029	@menu
	9030	* Calc++ --- C++ Calculator:: The specifications
	9031	* Calc++ Parsing Driver:: An active parsing context
	9032	* Calc++ Parser:: A parser class
	9033	* Calc++ Scanner:: A pure C++ Flex scanner
	9034	* Calc++ Top Level:: Conducting the band
	9035	@end menu
	9036
	9037	@node Calc++ --- C++ Calculator
	9038	@subsubsection Calc++ --- C++ Calculator
	9039
	9040	Of course the grammar is dedicated to arithmetic: a single
	9041	expression, possibly preceded by variable assignments. An
	9042	environment, possibly containing predefined variables such as
	9043	@code{one} and @code{two}, is exchanged with the parser. An example
	9044	of valid input follows.
	9045
	9046	@example
	9047	three := 3
	9048	seven := one + two * three
	9049	seven * seven
	9050	@end example
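
Assuming the usual precedence of @samp{*} over @samp{+} (which the grammar
establishes later with @code{%left}), the input above evaluates to 49. The
computation can be replayed by hand; the function name
@code{evaluate_example} is hypothetical.

```cpp
#include <map>
#include <string>

// Replay the three input lines by hand, using the same kind of
// environment as the driver: a map from variable names to ints.
int
evaluate_example ()
{
  std::map<std::string, int> variables;
  variables["one"] = 1;    // predefined
  variables["two"] = 2;    // predefined
  variables["three"] = 3;  // three := 3
  // seven := one + two * three
  variables["seven"]
    = variables["one"] + variables["two"] * variables["three"];
  // seven * seven
  return variables["seven"] * variables["seven"];
}
```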
	9051
	9052	@node Calc++ Parsing Driver
	9053	@subsubsection Calc++ Parsing Driver
	9054	@c - An env
	9055	@c - A place to store error messages
	9056	@c - A place for the result
	9057
	9058	To support a pure interface with the parser (and the scanner) the
	9059	technique of the ``parsing context'' is convenient: a structure
	9060	containing all the data to exchange. Since, in addition to simply
	9061	launching the parsing, there are several auxiliary tasks to execute (open
	9062	the file for parsing, instantiate the parser, etc.), we recommend
	9063	transforming the simple parsing context structure into a full-blown
	9064	@dfn{parsing driver} class.
	9065
	9066	The declaration of this driver class, @file{calc++-driver.hh}, is as
	9067	follows. The first part includes the CPP guard and imports the
	9068	required standard library components, and the declaration of the parser
	9069	class.
	9070
	9071	@comment file: calc++-driver.hh
	9072	@example
	9073	#ifndef CALCXX_DRIVER_HH
	9074	# define CALCXX_DRIVER_HH
	9075	# include <string>
	9076	# include <map>
	9077	# include "calc++-parser.hh"
	9078	@end example
	9079
	9080
	9081	@noindent
	9082	Then comes the declaration of the scanning function. Flex expects
	9083	the signature of @code{yylex} to be defined in the macro
	9084	@code{YY_DECL}, and the C++ parser expects it to be declared. We can
	9085	factor both as follows.
	9086
	9087	@comment file: calc++-driver.hh
	9088	@example
	9089	// Tell Flex the lexer's prototype ...
	9090	# define YY_DECL                                        \
	9091	  yy::calcxx_parser::token_type                         \
	9092	  yylex (yy::calcxx_parser::semantic_type* yylval,      \
	9093	         yy::calcxx_parser::location_type* yylloc,      \
	9094	         calcxx_driver& driver)
	9095	// ... and declare it for the parser's sake.
	9096	YY_DECL;
	9097	@end example
	9098
	9099	@noindent
	9100	The @code{calcxx_driver} class is then declared with its most obvious
	9101	members.
	9102
	9103	@comment file: calc++-driver.hh
	9104	@example
	9105	// Conducting the whole scanning and parsing of Calc++.
	9106	class calcxx_driver
	9107	@{
	9108	public:
	9109	  calcxx_driver ();
	9110	  virtual ~calcxx_driver ();
	9111
	9112	  std::map<std::string, int> variables;
	9113
	9114	  int result;
	9115	@end example
	9116
	9117	@noindent
	9118	To encapsulate the coordination with the Flex scanner, it is useful to
	9119	have two member functions to open and close the scanning phase.
	9120
	9121	@comment file: calc++-driver.hh
	9122	@example
	9123	  // Handling the scanner.
	9124	  void scan_begin ();
	9125	  void scan_end ();
	9126	  bool trace_scanning;
	9127	@end example
	9128
	9129	@noindent
	9130	Similarly for the parser itself.
	9131
	9132	@comment file: calc++-driver.hh
	9133	@example
	9134	  // Run the parser. Return 0 on success.
	9135	  int parse (const std::string& f);
	9136	  std::string file;
	9137	  bool trace_parsing;
	9138	@end example
	9139
	9140	@noindent
	9141	To demonstrate pure handling of parse errors, instead of simply
	9142	dumping them on the standard error output, we will pass them to the
	9143	compiler driver using the following two member functions. Finally, we
	9144	close the class declaration and CPP guard.
	9145
	9146	@comment file: calc++-driver.hh
	9147	@example
	9148	  // Error handling.
	9149	  void error (const yy::location& l, const std::string& m);
	9150	  void error (const std::string& m);
	9151	@};
	9152	#endif // ! CALCXX_DRIVER_HH
	9153	@end example
	9154
	9155	The implementation of the driver is straightforward. The @code{parse}
	9156	member function deserves some attention. The @code{error} functions
	9157	are simple stubs; in a real driver they should record the located error
	9158	messages and set an error state.
	9159
	9160	@comment file: calc++-driver.cc
	9161	@example
	9162	#include "calc++-driver.hh"
	9163	#include "calc++-parser.hh"
	9164
	9165	calcxx_driver::calcxx_driver ()
	9166	  : trace_scanning (false), trace_parsing (false)
	9167	@{
	9168	  variables["one"] = 1;
	9169	  variables["two"] = 2;
	9170	@}
	9171
	9172	calcxx_driver::~calcxx_driver ()
	9173	@{
	9174	@}
	9175
	9176	int
	9177	calcxx_driver::parse (const std::string &f)
	9178	@{
	9179	  file = f;
	9180	  scan_begin ();
	9181	  yy::calcxx_parser parser (*this);
	9182	  parser.set_debug_level (trace_parsing);
	9183	  int res = parser.parse ();
	9184	  scan_end ();
	9185	  return res;
	9186	@}
	9187
	9188	void
	9189	calcxx_driver::error (const yy::location& l, const std::string& m)
	9190	@{
	9191	  std::cerr << l << ": " << m << std::endl;
	9192	@}
	9193
	9194	void
	9195	calcxx_driver::error (const std::string& m)
	9196	@{
	9197	  std::cerr << m << std::endl;
	9198	@}
	9199	@end example
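
A driver that records errors instead of printing them might look like the
following sketch. The class @code{recording_driver} and the type
@code{simple_location} are hypothetical; a real driver would keep using
@code{yy::location}.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified location; a real driver would use yy::location.
struct simple_location
{
  std::string file;
  unsigned line;
  unsigned column;
};

// Sketch of a driver that records errors instead of printing them.
class recording_driver
{
public:
  // Record a located error message, formatted as file:line.column.
  void error (const simple_location &l, const std::string &m)
  {
    std::ostringstream o;
    o << l.file << ':' << l.line << '.' << l.column << ": " << m;
    messages.push_back (o.str ());
  }

  // Record an error message that has no location.
  void error (const std::string &m) { messages.push_back (m); }

  // The error state: true if at least one error was reported.
  bool failed () const { return !messages.empty (); }

  std::vector<std::string> messages;
};
```

The caller can then decide what to do with @code{messages} once parsing is
over, for instance print them all or discard the result when
@code{failed} returns true.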
	9200
	9201	@node Calc++ Parser
	9202	@subsubsection Calc++ Parser
	9203
	9204	The grammar file @file{calc++-parser.yy} starts by asking for the C++
	9205	deterministic parser skeleton, the creation of the parser header file,
	9206	and specifies the name of the parser class. Because the C++ skeleton
	9207	changed several times, it is safer to require the version you designed
	9208	the grammar for.
	9209
	9210	@comment file: calc++-parser.yy
	9211	@example
	9212	%skeleton "lalr1.cc" /* -*- C++ -*- */
	9213	%require "@value{VERSION}"
	9214	%defines
	9215	%define parser_class_name "calcxx_parser"
	9216	@end example
	9217
	9218	@noindent
	9219	@findex %code requires
	9220	Then come the declarations/inclusions needed to define the
	9221	@code{%union}. Because the parser uses the parsing driver and
	9222	vice versa, each header cannot include the other. Because the
	9223	driver's header needs detailed knowledge about the parser class (in
	9224	particular its inner types), it is the parser's header which will simply
	9225	use a forward declaration of the driver.
	9226	@xref{%code Summary}.
	9227
	9228	@comment file: calc++-parser.yy
	9229	@example
	9230	%code requires @{
	9231	# include <string>
	9232	class calcxx_driver;
	9233	@}
	9234	@end example
	9235
	9236	@noindent
	9237	The driver is passed by reference to the parser and to the scanner.
	9238	This provides a simple but effective pure interface, not relying on
	9239	global variables.
	9240
	9241	@comment file: calc++-parser.yy
	9242	@example
	9243	// The parsing context.
	9244	%parse-param @{ calcxx_driver& driver @}
	9245	%lex-param @{ calcxx_driver& driver @}
	9246	@end example
	9247
	9248	@noindent
	9249	Then we request the location tracking feature, and initialize the
	9250	first location's file name. Afterward, new locations are computed
	9251	relative to the previous locations: the file name will be
	9252	automatically propagated.
	9253
	9254	@comment file: calc++-parser.yy
	9255	@example
	9256	%locations
	9257	%initial-action
	9258	@{
	9259	  // Initialize the initial location.
	9260	  @@$.begin.filename = @@$.end.filename = &driver.file;
	9261	@};
	9262	@end example
	9263
	9264	@noindent
	9265	Use the two following directives to enable parser tracing and verbose error
	9266	messages. However, verbose error messages can contain incorrect information
	9267	(@pxref{LAC}).
	9268
	9269	@comment file: calc++-parser.yy
	9270	@example
	9271	%debug
	9272	%error-verbose
	9273	@end example
	9274
	9275	@noindent
	9276	Semantic values cannot use ``real'' objects, but only pointers to
	9277	them.
	9278
	9279	@comment file: calc++-parser.yy
	9280	@example
	9281	// Symbols.
	9282	%union
	9283	@{
	9284	  int ival;
	9285	  std::string *sval;
	9286	@};
	9287	@end example
	9288
	9289	@noindent
	9290	@findex %code
	9291	The code between @samp{%code @{} and @samp{@}} is output in the
	9292	@file{*.cc} file; it needs detailed knowledge about the driver.
	9293
	9294	@comment file: calc++-parser.yy
	9295	@example
	9296	%code @{
	9297	# include "calc++-driver.hh"
	9298	@}
	9299	@end example
	9300
	9301
	9302	@noindent
	9303	The token numbered as 0 corresponds to end of file; the following line
	9304	allows for nicer error messages referring to ``end of file'' instead
	9305	of ``$end''. Similarly, user-friendly names are provided for each
	9306	symbol. Note that the token names are prefixed by @code{TOKEN_} to
	9307	avoid name clashes.
	9308
	9309	@comment file: calc++-parser.yy
	9310	@example
	9311	%token END 0 "end of file"
	9312	%token ASSIGN ":="
	9313	%token <sval> IDENTIFIER "identifier"
	9314	%token <ival> NUMBER "number"
	9315	%type <ival> exp
	9316	@end example
	9317
	9318	@noindent
	9319	To enable memory deallocation during error recovery, use
	9320	@code{%destructor}.
	9321
	9322	@c FIXME: Document %printer, and mention that it takes a braced-code operand.
	9323	@comment file: calc++-parser.yy
	9324	@example
	9325	%printer @{ debug_stream () << *$$; @} "identifier"
	9326	%destructor @{ delete $$; @} "identifier"
	9327
	9328	%printer @{ debug_stream () << $$; @} <ival>
	9329	@end example
	9330
	9331	@noindent
	9332	The grammar itself is straightforward.
	9333
	9334	@comment file: calc++-parser.yy
	9335	@example
	9336	%%
	9337	%start unit;
	9338	unit: assignments exp @{ driver.result = $2; @};
	9339
	9340	assignments:
	9341	  /* Nothing. */         @{@}
	9342	| assignments assignment @{@};
	9343
	9344	assignment:
	9345	     "identifier" ":=" exp
	9346	       @{ driver.variables[*$1] = $3; delete $1; @};
	9347
	9348	%left '+' '-';
	9349	%left '*' '/';
	9350	exp: exp '+' exp   @{ $$ = $1 + $3; @}
	9351	   | exp '-' exp   @{ $$ = $1 - $3; @}
	9352	   | exp '*' exp   @{ $$ = $1 * $3; @}
	9353	   | exp '/' exp   @{ $$ = $1 / $3; @}
	9354	   | "identifier"  @{ $$ = driver.variables[*$1]; delete $1; @}
	9355	   | "number"      @{ $$ = $1; @};
	9356	%%
	9357	@end example
	9358
	9359	@noindent
	9360	Finally the @code{error} member function registers the errors to the
	9361	driver.
	9362
	9363	@comment file: calc++-parser.yy
	9364	@example
	9365	void
	9366	yy::calcxx_parser::error (const yy::calcxx_parser::location_type& l,
	9367	                          const std::string& m)
	9368	@{
	9369	  driver.error (l, m);
	9370	@}
	9371	@end example
	9372
	9373	@node Calc++ Scanner
	9374	@subsubsection Calc++ Scanner
	9375
	9376	The Flex scanner first includes the driver declaration, then the
	9377	parser's to get the set of defined tokens.
	9378
	9379	@comment file: calc++-scanner.ll
	9380	@example
	9381	%@{ /* -*- C++ -*- */
	9382	# include <cstdlib>
	9383	# include <cerrno>
	9384	# include <climits>
		# include <cstring> // strerror
	9385	# include <string>
	9386	# include "calc++-driver.hh"
	9387	# include "calc++-parser.hh"
	9388
	9389	/* Work around an incompatibility in flex (at least versions
	9390	2.5.31 through 2.5.33): it generates code that does
	9391	not conform to C89. See Debian bug 333231
	9392	<http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=333231>. */
	9393	# undef yywrap
	9394	# define yywrap() 1
	9395
	9396	/* By default yylex returns int, we use token_type.
	9397	Unfortunately yyterminate by default returns 0, which is
	9398	not of token_type. */
	9399	#define yyterminate() return token::END
	9400	%@}
	9401	@end example
	9402
	9403	@noindent
	9404	Because there is no @code{#include}-like feature, we don't need
	9405	@code{yywrap}. We don't need @code{unput} either, and since we parse an
	9406	actual file rather than run an interactive session with the user, we
	9407	ask for a batch scanner. Finally, we enable the scanner tracing features.
	9408
	9409	@comment file: calc++-scanner.ll
	9410	@example
	9411	%option noyywrap nounput batch debug
	9412	@end example
	9413
	9414	@noindent
	9415	Abbreviations allow for more readable rules.
	9416
	9417	@comment file: calc++-scanner.ll
	9418	@example
	9419	id [a-zA-Z][a-zA-Z_0-9]*
	9420	int [0-9]+
	9421	blank [ \t]
	9422	@end example
	9423
	9424	@noindent
	9425	The following paragraph suffices to track locations accurately. Each
	9426	time @code{yylex} is invoked, the begin position is moved onto the end
	9427	position. Then when a pattern is matched, the end position is
	9428	advanced by its width. In case it matched ends of lines, the end
	9429	cursor is adjusted, and each time blanks are matched, the begin cursor
	9430	is moved onto the end cursor to effectively ignore the blanks
	9431	preceding tokens. Comments would be treated the same way.
	9432
	9433	@comment file: calc++-scanner.ll
	9434	@example
	9435	@group
	9436	%@{
	9437	# define YY_USER_ACTION yylloc->columns (yyleng);
	9438	%@}
	9439	@end group
	9440	%%
	9441	%@{
	9442	  yylloc->step ();
	9443	%@}
	9444	@{blank@}+   yylloc->step ();
	9445	[\n]+      yylloc->lines (yyleng); yylloc->step ();
	9446	@end example
	9447
	9448	@noindent
	9449	The rules are simple; just note the use of the driver to report errors.
	9450	It is convenient to use a typedef to shorten
	9451	@code{yy::calcxx_parser::token::IDENTIFIER} into
	9452	@code{token::IDENTIFIER} for instance.
	9453
	9454	@comment file: calc++-scanner.ll
	9455	@example
	9456	%@{
	9457	  typedef yy::calcxx_parser::token token;
	9458	%@}
	9459	           /* Convert ints to the actual type of tokens. */
	9460	[-+*/]     return yy::calcxx_parser::token_type (yytext[0]);
	9461	":="       return token::ASSIGN;
	9462	@{int@}      @{
	9463	  errno = 0;
	9464	  long n = strtol (yytext, NULL, 10);
	9465	  if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE))
	9466	    driver.error (*yylloc, "integer is out of range");
	9467	  yylval->ival = n;
	9468	  return token::NUMBER;
	9469	@}
	9470	@{id@}       yylval->sval = new std::string (yytext); return token::IDENTIFIER;
	9471	.          driver.error (*yylloc, "invalid character");
	9472	%%
	9473	@end example
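
The range check in the @code{@{int@}} rule can be isolated in a helper. The
function @code{to_int} below is a hypothetical name, not part of the
example's sources. Unlike the rule above, which reports the error and still
stores the converted value, this sketch leaves the result untouched on
failure.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Convert TEXT (decimal digits) to an int.  Return false if the value
// does not fit in an int; RESULT is left untouched in that case.
bool
to_int (const char *text, int &result)
{
  errno = 0;  // strtol only sets errno on failure, so clear it first
  long n = std::strtol (text, nullptr, 10);
  if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE))
    return false;
  result = static_cast<int> (n);
  return true;
}
```

Both tests are needed: @code{errno} catches values that do not even fit in
a @code{long}, while the comparison with @code{INT_MIN} and @code{INT_MAX}
catches values that fit in a @code{long} but not in an @code{int}.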
	9474
	9475	@noindent
	9476	Finally, because the driver's scanner-related member functions depend
	9477	on the scanner's data, it is simpler to implement them in this file.
	9478
	9479	@comment file: calc++-scanner.ll
	9480	@example
	9481	@group
	9482	void
	9483	calcxx_driver::scan_begin ()
	9484	@{
	9485	  yy_flex_debug = trace_scanning;
	9486	  if (file == "-")
	9487	    yyin = stdin;
	9488	  else if (!(yyin = fopen (file.c_str (), "r")))
	9489	    @{
	9490	      error ("cannot open " + file + ": " + strerror(errno));
	9491	      exit (EXIT_FAILURE);
	9492	    @}
	9493	@}
	9494	@end group
	9495
	9496	@group
	9497	void
	9498	calcxx_driver::scan_end ()
	9499	@{
	9500	  fclose (yyin);
	9501	@}
	9502	@end group
	9503	@end example
	9504
	9505	@node Calc++ Top Level
	9506	@subsubsection Calc++ Top Level
	9507
	9508	The top level file, @file{calc++.cc}, poses no problem.
	9509
	9510	@comment file: calc++.cc
	9511	@example
	9512	#include <iostream>
	9513	#include "calc++-driver.hh"
	9514
	9515	@group
	9516	int
	9517	main (int argc, char *argv[])
	9518	@{
	9519	  calcxx_driver driver;
	9520	  for (++argv; argv[0]; ++argv)
	9521	    if (*argv == std::string ("-p"))
	9522	      driver.trace_parsing = true;
	9523	    else if (*argv == std::string ("-s"))
	9524	      driver.trace_scanning = true;
	9525	    else if (!driver.parse (*argv))
	9526	      std::cout << driver.result << std::endl;
	9527	@}
	9528	@end group
	9529	@end example
	9530
	9531	@node Java Parsers
	9532	@section Java Parsers
	9533
	9534	@menu
	9535	* Java Bison Interface:: Asking for Java parser generation
	9536	* Java Semantic Values:: %type and %token vs. Java
	9537	* Java Location Values:: The position and location classes
	9538	* Java Parser Interface:: Instantiating and running the parser
	9539	* Java Scanner Interface:: Specifying the scanner for the parser
	9540	* Java Action Features:: Special features for use in actions
	9541	* Java Differences:: Differences between C/C++ and Java Grammars
	9542	* Java Declarations Summary:: List of Bison declarations used with Java
	9543	@end menu
	9544
	9545	@node Java Bison Interface
	9546	@subsection Java Bison Interface
	9547	@c - %language "Java"
	9548
	9549	(The current Java interface is experimental and may evolve.
	9550	More user feedback will help to stabilize it.)
	9551
	9552	The Java parser skeletons are selected using the @code{%language "Java"}
	9553	directive or the @option{-L java}/@option{--language=java} option.
	9554
	9555	@c FIXME: Documented bug.
	9556	When generating a Java parser, @code{bison @var{basename}.y} will
	9557	create a single Java source file named @file{@var{basename}.java}
	9558	containing the parser implementation. Using a grammar file without a
	9559	@file{.y} suffix is currently broken. The basename of the parser
	9560	implementation file can be changed by the @code{%file-prefix}
	9561	directive or the @option{-b}/@option{--file-prefix} option. The
	9562	entire parser implementation file name can be changed by the
	9563	@code{%output} directive or the @option{-o}/@option{--output} option.
	9564	The parser implementation file contains a single class for the parser.
	9565
	9566	You can create documentation for generated parsers using Javadoc.
	9567
	9568	Contrary to C parsers, Java parsers do not use global variables; the
	9569	state of the parser is always local to an instance of the parser class.
	9570	Therefore, all Java parsers are ``pure'', and the @code{%pure-parser}
	9571	and @code{%define api.pure} directives do not do anything when used in
	9572	Java.
	9573
	9574	Push parsers are currently unsupported in Java and @code{%define
	9575	api.push-pull} has no effect.
	9576
	9577	GLR parsers are currently unsupported in Java. Do not use the
	9578	@code{%glr-parser} directive.
	9579
	9580	No header file can be generated for Java parsers. Do not use the
	9581	@code{%defines} directive or the @option{-d}/@option{--defines} option.
	9582
	9583	@c FIXME: Possible code change.
	9584	Currently, support for debugging and verbose errors is always compiled
	9585	in. Thus the @code{%debug} and @code{%token-table} directives and the
	9586	@option{-t}/@option{--debug} and @option{-k}/@option{--token-table}
	9587	options have no effect. This may change in the future to eliminate
	9588	unused code in the generated parser, so use @code{%debug} and
	9589	@code{%error-verbose} explicitly if needed. Also, in the future the
	9590	@code{%token-table} directive might enable a public interface to
	9591	access the token names and codes.
	9592
	9593	@node Java Semantic Values
	9594	@subsection Java Semantic Values
	9595	@c - No %union, specify type in %type/%token.
	9596	@c - YYSTYPE
	9597	@c - Printer and destructor
	9598
	9599	There is no @code{%union} directive in Java parsers. Instead, the
	9600	semantic values' types (class names) should be specified in the
	9601	@code{%type} or @code{%token} directive:
	9602
	9603	@example
	9604	%type <Expression> expr assignment_expr term factor
	9605	%type <Integer> number
	9606	@end example
	9607
	9608	By default, the semantic stack is declared to have @code{Object} members,
	9609	which means that the class types you specify can be of any class.
	9610	To improve the type safety of the parser, you can declare the common
	9611	superclass of all the semantic values using the @code{%define stype}
	9612	directive. For example, after the following declaration:
	9613
	9614	@example
	9615	%define stype "ASTNode"
	9616	@end example
	9617
	9618	@noindent
	9619	any @code{%type} or @code{%token} specifying a semantic type which
	9620	is not a subclass of @code{ASTNode} will cause a compile-time error.
	9621
	9622	@c FIXME: Documented bug.
	9623	Types used in the directives may be qualified with a package name.
	9624	Primitive data types are accepted for Java version 1.5 or later. Note
	9625	that in this case the autoboxing feature of Java 1.5 will be used.
	9626	Generic types may not be used; this is due to a limitation in the
	9627	implementation of Bison, and may change in future releases.
	9628
	9629	Java parsers do not support @code{%destructor}, since the language
	9630	adopts garbage collection. The parser will try to hold references
	9631	to semantic values for as little time as needed.
	9632
	9633	Java parsers do not support @code{%printer}, as @code{toString()}
	9634	can be used to print the semantic values. This however may change
	9635	(in a backwards-compatible way) in future versions of Bison.
	9636
	9637
	9638	@node Java Location Values
	9639	@subsection Java Location Values
	9640	@c - %locations
	9641	@c - class Position
	9642	@c - class Location
	9643
	9644	When the directive @code{%locations} is used, the Java parser supports
	9645	location tracking, see @ref{Tracking Locations}. An auxiliary user-defined
	9646	class defines a @dfn{position}, a single point in a file; Bison itself
	9647	defines a class representing a @dfn{location}, a range composed of a pair of
	9648	positions (possibly spanning several files). The location class is an inner
	9649	class of the parser; the name is @code{Location} by default, and may also be
	9650	renamed using @code{%define location_type "@var{class-name}"}.
	9651
	9652	The location class treats the position as a completely opaque value.
	9653	By default, the class name is @code{Position}, but this can be changed
	9654	with @code{%define position_type "@var{class-name}"}. This class must
	9655	be supplied by the user.
	9656
	9657
	9658	@deftypeivar {Location} {Position} begin
	9659	@deftypeivarx {Location} {Position} end
	9660	The first, inclusive, position of the range, and the first beyond.
	9661	@end deftypeivar
	9662
	9663	@deftypeop {Constructor} {Location} {} Location (Position @var{loc})
	9664	Create a @code{Location} denoting an empty range located at a given point.
	9665	@end deftypeop
	9666
	9667	@deftypeop {Constructor} {Location} {} Location (Position @var{begin}, Position @var{end})
	9668	Create a @code{Location} from the endpoints of the range.
	9669	@end deftypeop
	9670
	9671	@deftypemethod {Location} {String} toString ()
	9672	Prints the range represented by the location. For this to work
	9673	properly, the position class should override the @code{equals} and
	9674	@code{toString} methods appropriately.
	9675	@end deftypemethod
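
As a minimal sketch of a user-supplied position class, assume that a position is a line and column pair. The field names and the @code{Range} class below are hypothetical; @code{Range} is only a stand-in that imitates how a location might print itself, not the class that Bison generates.

```java
import java.util.Objects;

/** Hypothetical user-supplied position: a line/column pair. */
final class Position {
    final int line;
    final int column;

    Position(int line, int column) {
        this.line = line;
        this.column = column;
    }

    /** Two positions are equal when they denote the same point. */
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Position)) return false;
        Position p = (Position) other;
        return line == p.line && column == p.column;
    }

    /** Keep hashCode consistent with equals. */
    @Override
    public int hashCode() {
        return Objects.hash(line, column);
    }

    @Override
    public String toString() {
        return line + "." + column;
    }
}

/** Stand-in for a location class, for illustration only (an assumption). */
final class Range {
    final Position begin;
    final Position end;

    /** Empty range located at a single point. */
    Range(Position loc) {
        this(loc, loc);
    }

    Range(Position begin, Position end) {
        this.begin = begin;
        this.end = end;
    }

    /** Print one position for an empty range, both endpoints otherwise. */
    @Override
    public String toString() {
        return begin.equals(end) ? begin.toString() : begin + "-" + end;
    }
}
```

Without a value-based @code{equals}, two distinct objects denoting the same point would not be recognized as an empty range, which is why both overrides matter.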
	9676
	9677
	9678	@node Java Parser Interface
	9679	@subsection Java Parser Interface
	9680	@c - define parser_class_name
	9681	@c - Ctor
	9682	@c - parse, error, set_debug_level, debug_level, set_debug_stream,
	9683	@c debug_stream.
	9684	@c - Reporting errors
	9685
	9686	The name of the generated parser class defaults to @code{YYParser}. The
	9687	@code{YY} prefix may be changed using the @code{%name-prefix} directive
	9688	or the @option{-p}/@option{--name-prefix} option. Alternatively, use
	9689	@code{%define parser_class_name "@var{name}"} to give a custom name to
	9690	the class. The interface of this class is detailed below.
	9691
	9692	By default, the parser class has package visibility. A declaration
	9693	@code{%define public} will change it to public visibility. Remember that,
	9694	according to the Java language specification, the name of the @file{.java}
	9695	file should match the name of the class in this case. Similarly, you can
	9696	use @code{abstract}, @code{final} and @code{strictfp} with the
	9697	@code{%define} declaration to add other modifiers to the parser class.
	9698
	9699	The Java package name of the parser class can be specified using the
	9700	@code{%define package} directive. The superclass and the implemented
	9701	interfaces of the parser class can be specified with the @code{%define
	9702	extends} and @code{%define implements} directives.
	9703
	9704	The parser class defines an inner class, @code{Location}, that is used
	9705	for location tracking (see @ref{Java Location Values}), and an inner
	9706	interface, @code{Lexer} (see @ref{Java Scanner Interface}). Other than
	9707	these inner class/interface, and the members described in the interface
	9708	below, all the other members and fields are preceded with a @code{yy} or
	9709	@code{YY} prefix to avoid clashes with user code.
	9710
	9711	@c FIXME: The following constants and variables are still undocumented:
	9712	@c @code{bisonVersion}, @code{bisonSkeleton} and @code{errorVerbose}.
	9713
	9714	The parser class can be extended using the @code{%parse-param}
	9715	directive. Each occurrence of the directive will add a @code{protected
	9716	final} field to the parser class, and an argument to its constructor,
	9717	which initializes the field automatically.
	9718
	9719	Token names defined by @code{%token} and the predefined @code{EOF} token
	9720	name are added as constant fields to the parser class.
	9721
	9722	@deftypeop {Constructor} {YYParser} {} YYParser (@var{lex_param}, @dots{}, @var{parse_param}, @dots{})
	9723	Build a new parser object with embedded @code{%code lexer}. There are
	9724	no parameters, unless @code{%parse-param}s and/or @code{%lex-param}s are
	9725	used.
	9726	@end deftypeop
	9727
	9728	@deftypeop {Constructor} {YYParser} {} YYParser (Lexer @var{lexer}, @var{parse_param}, @dots{})
	9729	Build a new parser object using the specified scanner. There are no
	9730	additional parameters unless @code{%parse-param}s are used.
	9731
	9732	If the scanner is defined by @code{%code lexer}, this constructor is
	9733	declared @code{protected} and is called automatically with a scanner
	9734	created with the correct @code{%lex-param}s.
	9735	@end deftypeop
	9736
	9737	@deftypemethod {YYParser} {boolean} parse ()
	9738	Run the syntactic analysis, and return @code{true} on success,
	9739	@code{false} otherwise.
	9740	@end deftypemethod
	9741
	9742	@deftypemethod {YYParser} {boolean} recovering ()
	9743	During the syntactic analysis, return @code{true} if recovering
	9744	from a syntax error.
	9745	@xref{Error Recovery}.
	9746	@end deftypemethod
	9747
	9748	@deftypemethod {YYParser} {java.io.PrintStream} getDebugStream ()
	9749	@deftypemethodx {YYParser} {void} setDebugStream (java.io.PrintStream @var{o})
	9750	Get or set the stream used for tracing the parsing. It defaults to
	9751	@code{System.err}.
	9752	@end deftypemethod
	9753
	9754	@deftypemethod {YYParser} {int} getDebugLevel ()
	9755	@deftypemethodx {YYParser} {void} setDebugLevel (int @var{l})
	9756	Get or set the tracing level. Currently its value is either 0, no trace,
	9757	or nonzero, full tracing.
	9758	@end deftypemethod
	9759
	9760
	9761	@node Java Scanner Interface
	9762	@subsection Java Scanner Interface
	9763	@c - %code lexer
	9764	@c - %lex-param
	9765	@c - Lexer interface
	9766
	9767	There are two possible ways to interface a Bison-generated Java parser
	9768	with a scanner: the scanner may be defined by @code{%code lexer}, or
	9769	defined elsewhere. In either case, the scanner has to implement the
	9770	@code{Lexer} inner interface of the parser class.
	9771
	9772	In the first case, the body of the scanner class is placed in
	9773	@code{%code lexer} blocks. If you want to pass parameters from the
	9774	parser constructor to the scanner constructor, specify them with
	9775	@code{%lex-param}; they are passed before @code{%parse-param}s to the
	9776	constructor.
	9777
	9778	In the second case, the scanner has to implement the @code{Lexer} interface,
	9779	which is defined within the parser class (e.g., @code{YYParser.Lexer}).
	9780	The constructor of the parser object will then accept an object
	9781	implementing the interface; @code{%lex-param} is not used in this
	9782	case.
	9783
	9784	In both cases, the scanner has to implement the following methods.
	9785
	9786	@deftypemethod {Lexer} {void} yyerror (Location @var{loc}, String @var{msg})
	9787	This method is defined by the user to emit an error message. The first
	9788	parameter is omitted if location tracking is not active. Its type can be
	9789	changed using @code{%define location_type "@var{class-name}"}.
	9790	@end deftypemethod
	9791
	9792	@deftypemethod {Lexer} {int} yylex ()
	9793	Return the next token. Its type is the return value; its semantic
	9794	value and location are saved and returned by their respective methods in the
	9795	interface.
	9796
	9797	Use @code{%define lex_throws} to specify any uncaught exceptions.
	9798	Default is @code{java.io.IOException}.
	9799	@end deftypemethod
	9800
	9801	@deftypemethod {Lexer} {Position} getStartPos ()
	9802	@deftypemethodx {Lexer} {Position} getEndPos ()
	9803	Return respectively the first position of the last token that
	9804	@code{yylex} returned, and the first position beyond it. These
	9805	methods are not needed unless location tracking is active.
	9806
	9807	The return type can be changed using @code{%define position_type
	9808	"@var{class-name}"}.
	9809	@end deftypemethod
	9810
	9811	@deftypemethod {Lexer} {Object} getLVal ()
	9812	Return the semantic value of the last token that @code{yylex} returned.
	9813
	9814	The return type can be changed using @code{%define stype
	9815	"@var{class-name}"}.
	9816	@end deftypemethod
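
A hand-written scanner for the second case might look like the sketch below. The @code{Lexer} interface shown here is a stand-in declared only to keep the example self-contained; in a real project the class would implement the interface generated inside the parser class. The token code @code{NUM}, the value chosen for @code{EOF}, and the class name @code{CalcLexer} are assumptions. Location tracking is taken to be inactive, so @code{yyerror} has no location parameter and the position methods are omitted.

```java
import java.io.IOException;
import java.io.Reader;

/** Stand-in for the generated Lexer interface (assumption, no locations). */
interface Lexer {
    int EOF = 0;    // assumed end-of-input token code
    int NUM = 258;  // hypothetical token code for integer literals

    int yylex() throws IOException;
    Object getLVal();
    void yyerror(String msg);
}

/** Hand-written scanner for integers and single-character operators. */
final class CalcLexer implements Lexer {
    private final Reader in;
    private int lookahead;  // one character of lookahead, -1 at end of input
    private Object lval;    // semantic value of the last token returned

    CalcLexer(Reader in) throws IOException {
        this.in = in;
        this.lookahead = in.read();
    }

    @Override
    public int yylex() throws IOException {
        // Skip blanks between tokens.
        while (lookahead == ' ' || lookahead == '\t' || lookahead == '\n') {
            lookahead = in.read();
        }
        if (lookahead == -1) {
            lval = null;
            return EOF;
        }
        if (Character.isDigit(lookahead)) {
            int value = 0;
            while (lookahead != -1 && Character.isDigit(lookahead)) {
                value = 10 * value + Character.digit(lookahead, 10);
                lookahead = in.read();
            }
            lval = Integer.valueOf(value);  // saved for getLVal()
            return NUM;
        }
        // Any other character stands for itself as a token.
        int c = lookahead;
        lookahead = in.read();
        lval = null;
        return c;
    }

    @Override
    public Object getLVal() {
        return lval;
    }

    @Override
    public void yyerror(String msg) {
        System.err.println(msg);
    }
}
```

An instance of such a class can then be passed to the parser constructor that accepts a @code{Lexer} argument.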
	9817
	9818
	9819	@node Java Action Features
	9820	@subsection Special Features for Use in Java Actions
	9821
	9822	The following special constructs can be used in Java actions.
	9823	Other analogous C action features are currently unavailable for Java.
	9824
	9825	Use @code{%define throws} to specify any uncaught exceptions from parser
	9826	actions, and initial actions specified by @code{%initial-action}.
	9827
	9828	@defvar $@var{n}
	9829	The semantic value for the @var{n}th component of the current rule.
	9830	This may not be assigned to.
	9831	@xref{Java Semantic Values}.
	9832	@end defvar
	9833
	9834	@defvar $<@var{typealt}>@var{n}
	9835	Like @code{$@var{n}} but specifies an alternative type @var{typealt}.
	9836	@xref{Java Semantic Values}.
	9837	@end defvar
	9838
	9839	@defvar $$
	9840	The semantic value for the grouping made by the current rule. As a
	9841	value, this is in the base type (@code{Object} or as specified by
	9842	@code{%define stype}); it is not cast to the declared subtype because
	9843	casts are not allowed on the left-hand side of Java assignments.
	9844	Use an explicit Java cast if the correct subtype is needed.
	9845	@xref{Java Semantic Values}.
	9846	@end defvar
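
The Java rule behind this restriction can be seen in isolation. In the sketch below the local variable @code{yyval} is a hypothetical stand-in for @code{$$} with the default base type @code{Object}; it is not the name used by the generated parser.

```java
/** Illustrates why reading the value as a subtype needs an explicit cast. */
final class BaseTypeDemo {
    static int roundTrip(int n) {
        Object yyval;                     // stand-in for $$ (base type Object)
        yyval = Integer.valueOf(n);       // assigning any subtype needs no cast
        // ((Integer) yyval) = ...        // not valid Java: a cast expression
        //                                // cannot be the target of an assignment
        Integer typed = (Integer) yyval;  // reading as the subtype: explicit cast
        return typed + 1;
    }
}
```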
	9847
	9848	@defvar $<@var{typealt}>$
	9849	Same as @code{$$} since Java always allows assigning to the base type.
	9850	Perhaps we should use this and @code{$<>$} for the value and @code{$$}
	9851	for setting the value but there is currently no easy way to distinguish
	9852	these constructs.
	9853	@xref{Java Semantic Values}.
	9854	@end defvar
	9855
	9856	@defvar @@@var{n}
	9857	The location information of the @var{n}th component of the current rule.
	9858	This may not be assigned to.
	9859	@xref{Java Location Values}.
	9860	@end defvar
	9861
	9862	@defvar @@$
	9863	The location information of the grouping made by the current rule.
	9864	@xref{Java Location Values}.
	9865	@end defvar
	9866
	9867	@deffn {Statement} {return YYABORT;}
	9868	Return immediately from the parser, indicating failure.
	9869	@xref{Java Parser Interface}.
	9870	@end deffn
	9871
	9872	@deffn {Statement} {return YYACCEPT;}
	9873	Return immediately from the parser, indicating success.
	9874	@xref{Java Parser Interface}.
	9875	@end deffn
	9876
	9877	@deffn {Statement} {return YYERROR;}
	9878	Start error recovery without printing an error message.
	9879	@xref{Error Recovery}.
	9880	@end deffn
	9881
	9882	@deftypefn {Function} {boolean} recovering ()
	9883	Return whether error recovery is being done. In this state, the parser
	9884	reads tokens until it reaches a known state, and then restarts normal
	9885	operation.
	9886	@xref{Error Recovery}.
	9887	@end deftypefn
	9888
	9889	@deftypefn {Function} {protected void} yyerror (String msg)
	9890	@deftypefnx {Function} {protected void} yyerror (Position pos, String msg)
	9891	@deftypefnx {Function} {protected void} yyerror (Location loc, String msg)
	9892	Print an error message using the @code{yyerror} method of the scanner
	9893	instance in use.
	9894	@end deftypefn
	9895
	9896
	9897	@node Java Differences
	9898	@subsection Differences between C/C++ and Java Grammars
	9899
	9900	The different structure of the Java language forces several differences
	9901	between C/C++ grammars, and grammars designed for Java parsers. This
	9902	section summarizes these differences.
	9903
	9904	@itemize
	9905	@item
	9906	Java lacks a preprocessor, so the @code{YYERROR}, @code{YYACCEPT},
	9907	@code{YYABORT} symbols (@pxref{Table of Symbols}) obviously cannot be
	9908	macros. Instead, they should be preceded by @code{return} when they
	9909	appear in an action. The actual definition of these symbols is
	9910	opaque to the Bison grammar, and it might change in the future. The
	9911	only meaningful operation that you can do is to return them.
	9912	@xref{Java Action Features}.
	9913
	9914	Note that of these three symbols, only @code{YYACCEPT} and
	9915	@code{YYABORT} will cause a return from the @code{parse}
	9916	method@footnote{Java parsers place the actions in a method separate
	9917	from @code{parse} in order to have an intuitive syntax that
	9918	corresponds to these C macros.}.
	9919
	9920	@item
	9921	Java lacks unions, so @code{%union} has no effect. Instead, semantic
	9922	values have a common base type: @code{Object} or as specified by
	9923	@samp{%define stype}. Angle brackets on @code{%token}, @code{%type},
	9924	@code{$@var{n}} and @code{$$} specify subtypes rather than fields of
	9925	a union. The type of @code{$$}, even with angle brackets, is the base
	9926	type since Java casts are not allowed on the left-hand side of assignments.
	9927	Also, @code{$@var{n}} and @code{@@@var{n}} are not allowed on the
	9928	left-hand side of assignments. @xref{Java Semantic Values}, and
	9929	@ref{Java Action Features}.
	9930
	9931	@item
	9932	The prologue declarations have a different meaning than in C/C++ code.
	9933	@table @asis
	9934	@item @code{%code imports}
	9935	blocks are placed at the beginning of the Java source code. They may
	9936	include copyright notices. For @code{package} declarations, it is
	9937	suggested to use @code{%define package} instead.
	9938
	9939	@item unqualified @code{%code}
	9940	blocks are placed inside the parser class.
	9941
	9942	@item @code{%code lexer}
	9943	blocks, if specified, should include the implementation of the
	9944	scanner. If there is no such block, the scanner can be any class
	9945	that implements the appropriate interface (@pxref{Java Scanner
	9946	Interface}).
	9947	@end table
	9948
	9949	Other @code{%code} blocks are not supported in Java parsers.
	9950	In particular, @code{%@{ @dots{} %@}} blocks should not be used
	9951	and may give an error in future versions of Bison.
	9952
	9953	The epilogue has the same meaning as in C/C++ code and it can
	9954	be used to define other classes used by the parser @emph{outside}
	9955	the parser class.
	9956	@end itemize
	9957
	9958
	9959	@node Java Declarations Summary
	9960	@subsection Java Declarations Summary
	9961
	9962	This summary only includes declarations that are specific to Java or that
	9963	have a special meaning when used in a Java parser.
	9964
	9965	@deffn {Directive} {%language "Java"}
	9966	Generate a Java class for the parser.
	9967	@end deffn
	9968
	9969	@deffn {Directive} %lex-param @{@var{type} @var{name}@}
	9970	A parameter for the lexer class defined by @code{%code lexer}
	9971	@emph{only}, added as parameters to the lexer constructor and the parser
	9972	constructor that @emph{creates} a lexer. Default is none.
	9973	@xref{Java Scanner Interface}.
	9974	@end deffn
	9975
	9976	@deffn {Directive} %name-prefix "@var{prefix}"
	9977	The prefix of the parser class name @code{@var{prefix}Parser} if
	9978	@code{%define parser_class_name} is not used. Default is @code{YY}.
	9979	@xref{Java Bison Interface}.
	9980	@end deffn
	9981
	9982	@deffn {Directive} %parse-param @{@var{type} @var{name}@}
	9983	A parameter for the parser class added as parameters to constructor(s)
	9984	and as fields initialized by the constructor(s). Default is none.
	9985	@xref{Java Parser Interface}.
	9986	@end deffn
	9987
	9988	@deffn {Directive} %token <@var{type}> @var{token} @dots{}
	9989	Declare tokens. Note that the angle brackets enclose a Java @emph{type}.
	9990	@xref{Java Semantic Values}.
	9991	@end deffn
	9992
	9993	@deffn {Directive} %type <@var{type}> @var{nonterminal} @dots{}
	9994	Declare the type of nonterminals. Note that the angle brackets enclose
	9995	a Java @emph{type}.
	9996	@xref{Java Semantic Values}.
	9997	@end deffn
	9998
	9999	@deffn {Directive} %code @{ @var{code} @dots{} @}
	10000	Code appended to the inside of the parser class.
	10001	@xref{Java Differences}.
	10002	@end deffn
	10003
	10004	@deffn {Directive} {%code imports} @{ @var{code} @dots{} @}
	10005	Code inserted just after the @code{package} declaration.
	10006	@xref{Java Differences}.
	10007	@end deffn
	10008
	10009	@deffn {Directive} {%code lexer} @{ @var{code} @dots{} @}
	10010	Code added to the body of an inner lexer class within the parser class.
	10011	@xref{Java Scanner Interface}.
	10012	@end deffn
	10013
	10014	@deffn {Directive} %% @var{code} @dots{}
	10015	Code (after the second @code{%%}) appended to the end of the file,
	10016	@emph{outside} the parser class.
	10017	@xref{Java Differences}.
	10018	@end deffn
	10019
	10020	@deffn {Directive} %@{ @var{code} @dots{} %@}
	10021	Not supported. Use @code{%code imports} instead.
	10022	@xref{Java Differences}.
	10023	@end deffn
	10024
	10025	@deffn {Directive} {%define abstract}
	10026	Whether the parser class is declared @code{abstract}. Default is false.
	10027	@xref{Java Bison Interface}.
	10028	@end deffn
	10029
	10030	@deffn {Directive} {%define extends} "@var{superclass}"
	10031	The superclass of the parser class. Default is none.
	10032	@xref{Java Bison Interface}.
	10033	@end deffn
	10034
	10035	@deffn {Directive} {%define final}
	10036	Whether the parser class is declared @code{final}. Default is false.
	10037	@xref{Java Bison Interface}.
	10038	@end deffn
	10039
	10040	@deffn {Directive} {%define implements} "@var{interfaces}"
	10041	The implemented interfaces of the parser class, a comma-separated list.
	10042	Default is none.
	10043	@xref{Java Bison Interface}.
	10044	@end deffn
	10045
	10046	@deffn {Directive} {%define lex_throws} "@var{exceptions}"
	10047	The exceptions thrown by the @code{yylex} method of the lexer, a
	10048	comma-separated list. Default is @code{java.io.IOException}.
	10049	@xref{Java Scanner Interface}.
	10050	@end deffn
	10051
	10052	@deffn {Directive} {%define location_type} "@var{class}"
	10053	The name of the class used for locations (a range between two
	10054	positions). This class is generated as an inner class of the parser
	10055	class by @command{bison}. Default is @code{Location}.
	10056	@xref{Java Location Values}.
	10057	@end deffn
	10058
	10059	@deffn {Directive} {%define package} "@var{package}"
	10060	The package to put the parser class in. Default is none.
	10061	@xref{Java Bison Interface}.
	10062	@end deffn
	10063
	10064	@deffn {Directive} {%define parser_class_name} "@var{name}"
	10065	The name of the parser class. Default is @code{YYParser} or
	10066	@code{@var{name-prefix}Parser}.
	10067	@xref{Java Bison Interface}.
	10068	@end deffn
	10069
	10070	@deffn {Directive} {%define position_type} "@var{class}"
	10071	The name of the class used for positions. This class must be supplied by
	10072	the user. Default is @code{Position}.
	10073	@xref{Java Location Values}.
	10074	@end deffn
	10075
	10076	@deffn {Directive} {%define public}
	10077	Whether the parser class is declared @code{public}. Default is false.
	10078	@xref{Java Bison Interface}.
	10079	@end deffn
	10080
	10081	@deffn {Directive} {%define stype} "@var{class}"
	10082	The base type of semantic values. Default is @code{Object}.
	10083	@xref{Java Semantic Values}.
	10084	@end deffn
	10085
	10086	@deffn {Directive} {%define strictfp}
	10087	Whether the parser class is declared @code{strictfp}. Default is false.
	10088	@xref{Java Bison Interface}.
	10089	@end deffn
	10090
	10091	@deffn {Directive} {%define throws} "@var{exceptions}"
	10092	The exceptions thrown by user-supplied parser actions and
	10093	@code{%initial-action}, a comma-separated list. Default is none.
	10094	@xref{Java Parser Interface}.
	10095	@end deffn
	10096
	10097
	10098	@c ================================================= FAQ
	10099
	10100	@node FAQ
	10101	@chapter Frequently Asked Questions
	10102	@cindex frequently asked questions
	10103	@cindex questions
	10104
	10105	Several questions about Bison come up occasionally. Here some of them
	10106	are addressed.
	10107
	10108	@menu
	10109	* Memory Exhausted:: Breaking the Stack Limits
	10110	* How Can I Reset the Parser:: @code{yyparse} Keeps some State
	10111	* Strings are Destroyed:: @code{yylval} Loses Track of Strings
	10112	* Implementing Gotos/Loops:: Control Flow in the Calculator
	10113	* Multiple start-symbols:: Factoring closely related grammars
	10114	* Secure? Conform?:: Is Bison POSIX safe?
	10115	* I can't build Bison:: Troubleshooting
	10116	* Where can I find help?:: Troubleshouting
	10117	* Bug Reports:: Troublereporting
	10118	* More Languages:: Parsers in C++, Java, and so on
	10119	* Beta Testing:: Experimenting with development versions
	10120	* Mailing Lists:: Meeting other Bison users
	10121	@end menu
	10122
	10123	@node Memory Exhausted
	10124	@section Memory Exhausted
	10125
	10126	@quotation
	10127	My parser returns with error with a @samp{memory exhausted}
	10128	message. What can I do?
	10129	@end quotation
	10130
	10131	This question is already addressed elsewhere; see @ref{Recursion,
	10132	,Recursive Rules}.
	10133
	10134	@node How Can I Reset the Parser
	10135	@section How Can I Reset the Parser
	10136
	10137	The following phenomenon has several symptoms, resulting in the
	10138	following typical questions:
	10139
	10140	@quotation
	10141	I invoke @code{yyparse} several times, and on correct input it works
	10142	properly; but when a parse error is found, all the other calls fail
	10143	too. How can I reset the error flag of @code{yyparse}?
	10144	@end quotation
	10145
	10146	@noindent
	10147	or
	10148
	10149	@quotation
	10150	My parser includes support for an @samp{#include}-like feature, in
	10151	which case I run @code{yyparse} from @code{yyparse}. This fails
	10152	although I did specify @samp{%define api.pure}.
	10153	@end quotation
	10154
	10155	These problems typically come not from Bison itself, but from
	10156	Lex-generated scanners. Because these scanners use large buffers for
	10157	speed, they might not notice a change of input file. As a
	10158	demonstration, consider the following source file,
	10159	@file{first-line.l}:
	10160
	10161	@example
	10162	@group
	10163	%@{
	10164	#include <stdio.h>
	10165	#include <stdlib.h>
	10166	%@}
	10167	@end group
	10168	%%
	10169	.*\n ECHO; return 1;
	10170	%%
	10171	@group
	10172	int
	10173	yyparse (char const *file)
	10174	@{
	10175	  yyin = fopen (file, "r");
	10176	  if (!yyin)
	10177	    @{
	10178	      perror ("fopen");
	10179	      exit (EXIT_FAILURE);
	10180	    @}
	10181	@end group
	10182	@group
	10183	  /* One token only. */
	10184	  yylex ();
	10185	  if (fclose (yyin) != 0)
	10186	    @{
	10187	      perror ("fclose");
	10188	      exit (EXIT_FAILURE);
	10189	    @}
	10190	  return 0;
	10191	@}
	10192	@end group
	10193
	10194	@group
	10195	int
	10196	main (void)
	10197	@{
	10198	  yyparse ("input");
	10199	  yyparse ("input");
	10200	  return 0;
	10201	@}
	10202	@end group
	10203	@end example
	10204
	10205	@noindent
	10206	If the file @file{input} contains
	10207
	10208	@example
	10209	input:1: Hello,
	10210	input:2: World!
	10211	@end example
	10212
	10213	@noindent
	10214	then instead of getting the first line twice, you get:
	10215
	10216	@example
	10217	$ @kbd{flex -ofirst-line.c first-line.l}
	10218	$ @kbd{gcc -ofirst-line first-line.c -ll}
	10219	$ @kbd{./first-line}
	10220	input:1: Hello,
	10221	input:2: World!
	10222	@end example
	10223
	10224	Therefore, whenever you change @code{yyin}, you must tell the
	10225	Lex-generated scanner to discard its current buffer and switch to the
	10226	new one. This depends upon your implementation of Lex; see its
	10227	documentation for more. For Flex, it suffices to call
	10228	@samp{YY_FLUSH_BUFFER} after each change to @code{yyin}. If your
	10229	Flex-generated scanner needs to read from several input streams to
	10230	handle features like include files, you might consider using Flex
	10231	functions like @samp{yy_switch_to_buffer} that manipulate multiple
	10232	input buffers.
	10233
	10234	If your Flex-generated scanner uses start conditions (@pxref{Start
	10235	conditions, , Start conditions, flex, The Flex Manual}), you might
	10236	also want to reset the scanner's state, i.e., go back to the initial
	10237	start condition, through a call to @samp{BEGIN (0)}.
	10238
	10239	@node Strings are Destroyed
	10240	@section Strings are Destroyed
	10241
	10242	@quotation
	10243	My parser seems to destroy old strings, or maybe it loses track of
	10244	them. Instead of reporting @samp{"foo", "bar"}, it reports
	10245	@samp{"bar", "bar"}, or even @samp{"foo\nbar", "bar"}.
	10246	@end quotation
	10247
	10248	This error is probably the single most frequent ``bug report'' sent to
	10249	Bison lists, but it stems only from a misunderstanding of the role
	10250	of the scanner. Consider the following Lex code:
	10251
	10252	@example
	10253	@group
	10254	%@{
	10255	#include <stdio.h>
	10256	char *yylval = NULL;
	10257	%@}
	10258	@end group
	10259	@group
	10260	%%
	10261	.* yylval = yytext; return 1;
	10262	\n /* IGNORE */
	10263	%%
	10264	@end group
	10265	@group
	10266	int
	10267	main ()
	10268	@{
	10269	  /* Similar to using $1, $2 in a Bison action. */
	10270	  char *fst = (yylex (), yylval);
	10271	  char *snd = (yylex (), yylval);
	10272	  printf ("\"%s\", \"%s\"\n", fst, snd);
	10273	  return 0;
	10274	@}
	10275	@end group
	10276	@end example
	10277
	10278	If you compile and run this code, you get:
	10279
	10280	@example
	10281	$ @kbd{flex -osplit-lines.c split-lines.l}
	10282	$ @kbd{gcc -osplit-lines split-lines.c -ll}
	10283	$ @kbd{printf 'one\ntwo\n' | ./split-lines}
	10284	"one
	10285	two", "two"
	10286	@end example
	10287
	10288	@noindent
	10289	This is because @code{yytext} is a buffer provided for @emph{reading}
	10290	in the action, but if you want to keep it, you have to duplicate it
	10291	(e.g., using @code{strdup}). Note that the output may depend on how
	10292	your implementation of Lex handles @code{yytext}. For instance, when
	10293	given the Lex compatibility option @option{-l} (which triggers the
	10294	option @samp{%array}) Flex generates a different behavior:
	10295
	10296	@example
	10297	$ @kbd{flex -l -osplit-lines.c split-lines.l}
	10298	$ @kbd{gcc -osplit-lines split-lines.c -ll}
	10299	$ @kbd{printf 'one\ntwo\n' | ./split-lines}
	10300	"two", "two"
	10301	@end example
	10302
	10303
	10304	@node Implementing Gotos/Loops
	10305	@section Implementing Gotos/Loops
	10306
	10307	@quotation
	10308	My simple calculator supports variables, assignments, and functions,
	10309	but how can I implement gotos, or loops?
	10310	@end quotation
	10311
	10312	Although very pedagogical, the examples included in the document blur
	10313	the distinction between the parser---whose job is to recover
	10314	the structure of a text and to transmit it to subsequent modules of
	10315	the program---and the processing (such as the execution) of this
	10316	structure. This works well with so-called straight-line programs,
	10317	i.e., precisely those that have a straightforward execution model:
	10318	execute simple instructions one after the other.
	10319
	10320	@cindex abstract syntax tree
	10321	@cindex AST
	10322	If you want a richer model, you will probably need to use the parser
	10323	to construct a tree that does represent the structure it has
	10324	recovered; this tree is usually called the @dfn{abstract syntax tree},
	10325	or @dfn{AST} for short. Then, walking through this tree,
	10326	traversing it in various ways, will enable treatments such as its
	10327	execution or its translation, which will result in an interpreter or a
	10328	compiler.
	10329
	10330	This topic is way beyond the scope of this manual, and the reader is
	10331	invited to consult the dedicated literature.
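
To make the idea concrete, here is a minimal sketch in Java of an AST with a loop construct and a tree-walking evaluator. All class names are illustrative and not part of Bison; in a real parser the actions would build such nodes instead of computing values directly.

```java
import java.util.HashMap;
import java.util.Map;

/** A node of a small, illustrative abstract syntax tree. */
interface Node {
    int eval(Map<String, Integer> env);
}

final class Num implements Node {
    private final int value;
    Num(int value) { this.value = value; }
    public int eval(Map<String, Integer> env) { return value; }
}

final class Var implements Node {
    private final String name;
    Var(String name) { this.name = name; }
    // Unset variables read as 0 in this sketch.
    public int eval(Map<String, Integer> env) { return env.getOrDefault(name, 0); }
}

final class Assign implements Node {
    private final String name;
    private final Node value;
    Assign(String name, Node value) { this.name = name; this.value = value; }
    public int eval(Map<String, Integer> env) {
        int v = value.eval(env);
        env.put(name, v);
        return v;
    }
}

final class BinOp implements Node {
    private final char op;
    private final Node left, right;
    BinOp(char op, Node left, Node right) {
        this.op = op; this.left = left; this.right = right;
    }
    public int eval(Map<String, Integer> env) {
        int l = left.eval(env), r = right.eval(env);
        switch (op) {
            case '+': return l + r;
            case '<': return l < r ? 1 : 0;
            default: throw new IllegalArgumentException("unknown operator " + op);
        }
    }
}

final class Seq implements Node {
    private final Node[] body;
    Seq(Node... body) { this.body = body; }
    public int eval(Map<String, Integer> env) {
        int last = 0;
        for (Node n : body) last = n.eval(env);
        return last;
    }
}

/** The loop re-evaluates the same subtrees, which needs a stored tree. */
final class While implements Node {
    private final Node cond, body;
    While(Node cond, Node body) { this.cond = cond; this.body = body; }
    public int eval(Map<String, Integer> env) {
        while (cond.eval(env) != 0) body.eval(env);
        return 0;
    }
}

final class LoopDemo {
    /** Runs: i = 1; sum = 0; while (i < n + 1) { sum = sum + i; i = i + 1; } */
    static int sumTo(int n) {
        Map<String, Integer> env = new HashMap<>();
        Node program = new Seq(
            new Assign("i", new Num(1)),
            new Assign("sum", new Num(0)),
            new While(new BinOp('<', new Var("i"), new Num(n + 1)),
                new Seq(
                    new Assign("sum", new BinOp('+', new Var("sum"), new Var("i"))),
                    new Assign("i", new BinOp('+', new Var("i"), new Num(1))))));
        program.eval(env);
        return env.get("sum");
    }
}
```

The point of the design is that @code{While} can evaluate its condition and body as many times as needed, something an action that runs once per reduction cannot do.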
	10332
	10333
	10334	@node Multiple start-symbols
	10335	@section Multiple start-symbols
	10336
	10337	@quotation
	10338	I have several closely related grammars, and I would like to share their
	10339	implementations. In fact, I could use a single grammar but with
	10340	multiple entry points.
	10341	@end quotation
	10342
	10343	Bison does not support multiple start-symbols, but there is a very
	10344	simple means to simulate them. If @code{foo} and @code{bar} are the two
	10345	pseudo start-symbols, then introduce two new tokens, say
	10346	@code{START_FOO} and @code{START_BAR}, and use them as switches from the
	10347	real start-symbol:
	10348
	10349	@example
	10350	%token START_FOO START_BAR;
	10351	%start start;
	10352	start:
	10353	  START_FOO foo
	10354	| START_BAR bar;
	10355	@end example
	10356
	10357	These tokens prevent the introduction of new conflicts. As far as the
	10358	parser goes, that is all that is needed.
	10359
	10360	Now the difficult part is ensuring that the scanner will send these
	10361	tokens first. If your scanner is hand-written, that should be
	10362	straightforward. If your scanner is generated by Lex, then there is a
	10363	simple means to do it: recall that anything between @samp{%@{ ... %@}}
	10364	after the first @code{%%} is copied verbatim to the top of the generated
	10365	@code{yylex} function. Make sure a variable @code{start_token} is
	10366	available in the scanner (e.g., a global variable or using
	10367	@code{%lex-param} etc.), and use the following:
	10368
	10369	@example
	10370	/* @r{Prologue.} */
	10371	%%
	10372	%@{
	10373	  if (start_token)
	10374	    @{
	10375	      int t = start_token;
	10376	      start_token = 0;
	10377	      return t;
	10378	    @}
	10379	%@}
	10380	/* @r{The rules.} */
	10381	@end example
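
For a hand-written scanner the same trick takes only a few lines. The sketch below is in Java; the token codes and the class name @code{StartTokenScanner} are hypothetical, and the ordinary tokens come from an iterator only to keep the example self-contained.

```java
import java.util.Iterator;

/** Hypothetical token codes; a real scanner would use the generated ones. */
final class Tokens {
    static final int EOF = 0;
    static final int START_FOO = 258;
    static final int START_BAR = 259;

    private Tokens() {}
}

/** Scanner that sends a pending start token once, before ordinary tokens. */
final class StartTokenScanner {
    private int startToken;                // 0 once the switch token was sent
    private final Iterator<Integer> rest;  // ordinary tokens of the input

    StartTokenScanner(int startToken, Iterator<Integer> rest) {
        this.startToken = startToken;
        this.rest = rest;
    }

    int yylex() {
        if (startToken != 0) {
            int t = startToken;
            startToken = 0;  // make sure it is sent only once
            return t;
        }
        return rest.hasNext() ? rest.next() : Tokens.EOF;
    }
}
```

Constructing the scanner with @code{START_FOO} or @code{START_BAR} selects the entry point for that parse.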
	10382
	10383
	10384	@node Secure? Conform?
	10385	@section Secure? Conform?
	10386
	10387	@quotation
	10388	Is Bison secure? Does it conform to POSIX?
	10389	@end quotation
	10390
	10391	If you're looking for a guarantee or certification, we don't provide it.
	10392	However, Bison is intended to be a reliable program that conforms to the
	10393	POSIX specification for Yacc. If you run into problems,
	10394	please send us a bug report.
	10395
	10396	@node I can't build Bison
	10397	@section I can't build Bison
	10398
	10399	@quotation
	10400	I can't build Bison because @command{make} complains that
	10401	@code{msgfmt} is not found.
	10402	What should I do?
	10403	@end quotation
	10404
	10405	Like most GNU packages with internationalization support, that feature
	10406	is turned on by default. If you have problems building in the @file{po}
	10407	subdirectory, it indicates that your system's internationalization
	10408	support is lacking. You can re-configure Bison with
	10409	@option{--disable-nls} to turn off this support, or you can install GNU
	10410	gettext from @url{ftp://ftp.gnu.org/gnu/gettext/} and re-configure
	10411	Bison. See the file @file{ABOUT-NLS} for more information.
	10412
	10413
	10414	@node Where can I find help?
	10415	@section Where can I find help?
	10416
	10417	@quotation
	10418	I'm having trouble using Bison. Where can I find help?
	10419	@end quotation
	10420
	10421	First, read this fine manual. Beyond that, you can send mail to
	10422	@email{help-bison@@gnu.org}. This mailing list is intended to be
	10423	populated with people who are willing to answer questions about using
	10424	and installing Bison. Please keep in mind that (most of) the people on
	10425	the list have aspects of their lives which are not related to Bison (!),
	10426	so you may not receive an answer to your question right away. This can
	10427	be frustrating, but please try not to honk them off; remember that any
	10428	help they provide is purely voluntary and out of the kindness of their
	10429	hearts.
	10430
	10431	@node Bug Reports
	10432	@section Bug Reports
	10433
	10434	@quotation
	10435	I found a bug. What should I include in the bug report?
	10436	@end quotation
	10437
	10438	Before you send a bug report, make sure you are using the latest
	10439	version. Check @url{ftp://ftp.gnu.org/pub/gnu/bison/} or one of its
	10440	mirrors. Be sure to include the version number in your bug report. If
	10441	the bug is present in the latest version but not in a previous version,
	10442	try to determine the most recent version which did not contain the bug.
	10443
	10444	If the bug is parser-related, you should include the smallest grammar
	10445	you can which demonstrates the bug. The grammar file should also be
	10446	complete (i.e., we should be able to run it through Bison without having
	10447	to edit or add anything). The smaller and simpler the grammar, the
	10448	easier it will be to fix the bug.
	10449
	10450	Include information about your compilation environment, including your
	10451	operating system's name and version and your compiler's name and
	10452	version. If you have trouble compiling, you should also include a
	10453	transcript of the build session, starting with the invocation of
	10454	@command{configure}. Depending on the nature of the bug, you may be asked to
	10455	send additional files as well (such as @file{config.h} or @file{config.cache}).
	10456
	10457	Patches are most welcome, but not required. That is, do not hesitate to
	10458	send a bug report just because you cannot provide a fix.
	10459
	10460	Send bug reports to @email{bug-bison@@gnu.org}.
	10461
	10462	@node More Languages
	10463	@section More Languages
	10464
	10465	@quotation
	10466	Will Bison ever have C++ and Java support? How about @var{insert your
	10467	favorite language here}?
	10468	@end quotation
	10469
	10470	C++ and Java support is there now, and is documented. We'd love to add other
	10471	languages; contributions are welcome.
	10472
	10473	@node Beta Testing
	10474	@section Beta Testing
	10475
	10476	@quotation
	10477	What is involved in being a beta tester?
	10478	@end quotation
	10479
	10480	It's not terribly involved. Basically, you would download a test
	10481	release, compile it, and use it to build and run a parser or two. After
	10482	that, you would submit either a bug report or a message saying that
	10483	everything is okay. It is important to report successes as well as
	10484	failures because test releases eventually become mainstream releases,
	10485	but only if they are adequately tested. If no one tests, development is
	10486	essentially halted.
	10487
	10488	Beta testers are particularly needed for operating systems to which the
	10489	developers do not have easy access. They currently have easy access to
	10490	recent GNU/Linux and Solaris versions. Reports about other operating
	10491	systems are especially welcome.
	10492
	10493	@node Mailing Lists
	10494	@section Mailing Lists
	10495
	10496	@quotation
	10497	How do I join the help-bison and bug-bison mailing lists?
	10498	@end quotation
	10499
	10500	See @url{http://lists.gnu.org/}.
	10501
	10502	@c ================================================= Table of Symbols
	10503
	10504	@node Table of Symbols
	10505	@appendix Bison Symbols
	10506	@cindex Bison symbols, table of
	10507	@cindex symbols in Bison, table of
	10508
	10509	@deffn {Variable} @@$
	10510	In an action, the location of the left-hand side of the rule.
	10511	@xref{Tracking Locations}.
	10512	@end deffn
	10513
	10514	@deffn {Variable} @@@var{n}
	10515	In an action, the location of the @var{n}-th symbol of the right-hand side
	10516	of the rule. @xref{Tracking Locations}.
	10517	@end deffn
	10518
	10519	@deffn {Variable} @@@var{name}
	10520	In an action, the location of a symbol addressed by name. @xref{Tracking
	10521	Locations}.
	10522	@end deffn
	10523
	10524	@deffn {Variable} @@[@var{name}]
	10525	In an action, the location of a symbol addressed by name. @xref{Tracking
	10526	Locations}.
	10527	@end deffn
	10528
	10529	@deffn {Variable} $$
	10530	In an action, the semantic value of the left-hand side of the rule.
	10531	@xref{Actions}.
	10532	@end deffn
	10533
	10534	@deffn {Variable} $@var{n}
	10535	In an action, the semantic value of the @var{n}-th symbol of the
	10536	right-hand side of the rule. @xref{Actions}.
	10537	@end deffn
	10538
	10539	@deffn {Variable} $@var{name}
	10540	In an action, the semantic value of a symbol addressed by name.
	10541	@xref{Actions}.
	10542	@end deffn
	10543
	10544	@deffn {Variable} $[@var{name}]
	10545	In an action, the semantic value of a symbol addressed by name.
	10546	@xref{Actions}.
	10547	@end deffn
	10548
	10549	@deffn {Delimiter} %%
	10550	Delimiter used to separate the grammar rule section from the
	10551	Bison declarations section or the epilogue.
	10552	@xref{Grammar Layout, ,The Overall Layout of a Bison Grammar}.
	10553	@end deffn
	10554
	10555	@c Don't insert spaces, or check the DVI output.
	10556	@deffn {Delimiter} %@{@var{code}%@}
	10557	All code listed between @samp{%@{} and @samp{%@}} is copied verbatim
	10558	to the parser implementation file. Such code forms the prologue of
	10559	the grammar file. @xref{Grammar Outline, ,Outline of a Bison
	10560	Grammar}.
	10561	@end deffn
	10562
	10563	@deffn {Construct} /*@dots{}*/
	10564	Comment delimiters, as in C.
	10565	@end deffn
	10566
	10567	@deffn {Delimiter} :
	10568	Separates a rule's result from its components. @xref{Rules, ,Syntax of
	10569	Grammar Rules}.
	10570	@end deffn
	10571
	10572	@deffn {Delimiter} ;
	10573	Terminates a rule. @xref{Rules, ,Syntax of Grammar Rules}.
	10574	@end deffn
	10575
	10576	@deffn {Delimiter} |
	10577	Separates alternate rules for the same result nonterminal.
	10578	@xref{Rules, ,Syntax of Grammar Rules}.
	10579	@end deffn
	10580
	10581	@deffn {Directive} <*>
	10582	Used to define a default tagged @code{%destructor} or default tagged
	10583	@code{%printer}.
	10584
	10585	This feature is experimental.
	10586	More user feedback will help to determine whether it should become a permanent
	10587	feature.
	10588
	10589	@xref{Destructor Decl, , Freeing Discarded Symbols}.
	10590	@end deffn
	10591
	10592	@deffn {Directive} <>
	10593	Used to define a default tagless @code{%destructor} or default tagless
	10594	@code{%printer}.
	10595
	10596	This feature is experimental.
	10597	More user feedback will help to determine whether it should become a permanent
	10598	feature.
	10599
	10600	@xref{Destructor Decl, , Freeing Discarded Symbols}.
	10601	@end deffn
	10602
	10603	@deffn {Symbol} $accept
	10604	The predefined nonterminal whose only rule is @samp{$accept: @var{start}
	10605	$end}, where @var{start} is the start symbol. @xref{Start Decl, , The
	10606	Start-Symbol}. It cannot be used in the grammar.
	10607	@end deffn
	10608
	10609	@deffn {Directive} %code @{@var{code}@}
	10610	@deffnx {Directive} %code @var{qualifier} @{@var{code}@}
	10611	Insert @var{code} verbatim into the output parser source at the
	10612	default location or at the location specified by @var{qualifier}.
	10613	@xref{%code Summary}.
	10614	@end deffn
	10615
	10616	@deffn {Directive} %debug
	10617	Equip the parser for debugging. @xref{Decl Summary}.
	10618	@end deffn
	10619
	10620	@ifset defaultprec
	10621	@deffn {Directive} %default-prec
	10622	Assign a precedence to rules that lack an explicit @samp{%prec}
	10623	modifier. @xref{Contextual Precedence, ,Context-Dependent
	10624	Precedence}.
	10625	@end deffn
	10626	@end ifset
	10627
	10628	@deffn {Directive} %define @var{variable}
	10629	@deffnx {Directive} %define @var{variable} @var{value}
	10630	@deffnx {Directive} %define @var{variable} "@var{value}"
	10631	Define a variable to adjust Bison's behavior. @xref{%define Summary}.
	10632	@end deffn
	10633
	10634	@deffn {Directive} %defines
	10635	Bison declaration to create a parser header file, which is usually
	10636	meant for the scanner. @xref{Decl Summary}.
	10637	@end deffn
	10638
	10639	@deffn {Directive} %defines @var{defines-file}
	10640	Same as above, but save in the file @var{defines-file}.
	10641	@xref{Decl Summary}.
	10642	@end deffn
	10643
	10644	@deffn {Directive} %destructor
	10645	Specify how the parser should reclaim the memory associated with
	10646	discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
	10647	@end deffn
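		
		As a minimal sketch of the @code{%destructor} directive, the following
		declarations free string values that the parser discards. The type tag
		@code{sval} and the token @code{STRING} are hypothetical names chosen for
		illustration, and the sketch assumes that the scanner allocates each
		string with @code{malloc}.
		
		@example
		%@{
		#include <stdlib.h>   /* for free */
		%@}
		
		%union @{ char *sval; @}
		%token <sval> STRING
		
		/* Applied to every discarded symbol whose type tag is sval.  */
		%destructor @{ free ($$); @} <sval>
		@end example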
	10648
	10649	@deffn {Directive} %dprec
	10650	Bison declaration to assign a precedence to a rule that is used at parse
	10651	time to resolve reduce/reduce conflicts. @xref{GLR Parsers, ,Writing
	10652	GLR Parsers}.
	10653	@end deffn
	10654
	10655	@deffn {Symbol} $end
	10656	The predefined token marking the end of the token stream. It cannot be
	10657	used in the grammar.
	10658	@end deffn
	10659
	10660	@deffn {Symbol} error
	10661	A token name reserved for error recovery. This token may be used in
	10662	grammar rules so as to allow the Bison parser to recognize an error in
	10663	the grammar without halting the process. In effect, a sentence
	10664	containing an error may be recognized as valid. On a syntax error, the
	10665	token @code{error} becomes the current lookahead token. Actions
	10666	corresponding to @code{error} are then executed, and the lookahead
	10667	token is reset to the token that originally caused the violation.
	10668	@xref{Error Recovery}.
	10669	@end deffn
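		
		As an illustration of the @code{error} token, the sketch below lets a
		parser skip a malformed statement and resume at the next semicolon. The
		nonterminals @code{stmts} and @code{stmt} are hypothetical; only
		@code{error} and @code{yyerrok} come from Bison itself.
		
		@example
		stmts:
		  /* empty */
		| stmts stmt ';'
		| stmts error ';'   @{ yyerrok; /* leave error-recovery mode at once */ @}
		;
		@end example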
	10670
	10671	@deffn {Directive} %error-verbose
	10672	Bison declaration to request verbose, specific error message strings
	10673	when @code{yyerror} is called. @xref{Error Reporting}.
	10674	@end deffn
	10675
	10676	@deffn {Directive} %file-prefix "@var{prefix}"
	10677	Bison declaration to set the prefix of the output files. @xref{Decl
	10678	Summary}.
	10679	@end deffn
	10680
	10681	@deffn {Directive} %glr-parser
	10682	Bison declaration to produce a GLR parser. @xref{GLR
	10683	Parsers, ,Writing GLR Parsers}.
	10684	@end deffn
	10685
	10686	@deffn {Directive} %initial-action
	10687	Run user code before parsing. @xref{Initial Action Decl, , Performing Actions before Parsing}.
	10688	@end deffn
	10689
	10690	@deffn {Directive} %language
	10691	Specify the programming language for the generated parser.
	10692	@xref{Decl Summary}.
	10693	@end deffn
	10694
	10695	@deffn {Directive} %left
	10696	Bison declaration to assign left associativity to token(s).
	10697	@xref{Precedence Decl, ,Operator Precedence}.
	10698	@end deffn
	10699
	10700	@deffn {Directive} %lex-param @{@var{argument-declaration}@}
	10701	Bison declaration to specify an additional parameter that
	10702	@code{yylex} should accept. @xref{Pure Calling,, Calling Conventions
	10703	for Pure Parsers}.
	10704	@end deffn
	10705
	10706	@deffn {Directive} %merge
	10707	Bison declaration to assign a merging function to a rule. If there is a
	10708	reduce/reduce conflict with a rule having the same merging function, the
	10709	function is applied to the two semantic values to get a single result.
	10710	@xref{GLR Parsers, ,Writing GLR Parsers}.
	10711	@end deffn
	10712
	10713	@deffn {Directive} %name-prefix "@var{prefix}"
	10714	Bison declaration to rename the external symbols. @xref{Decl Summary}.
	10715	@end deffn
	10716
	10717	@ifset defaultprec
	10718	@deffn {Directive} %no-default-prec
	10719	Do not assign a precedence to rules that lack an explicit @samp{%prec}
	10720	modifier. @xref{Contextual Precedence, ,Context-Dependent
	10721	Precedence}.
	10722	@end deffn
	10723	@end ifset
	10724
	10725	@deffn {Directive} %no-lines
	10726	Bison declaration to avoid generating @code{#line} directives in the
	10727	parser implementation file. @xref{Decl Summary}.
	10728	@end deffn
	10729
	10730	@deffn {Directive} %nonassoc
	10731	Bison declaration to assign nonassociativity to token(s).
	10732	@xref{Precedence Decl, ,Operator Precedence}.
	10733	@end deffn
	10734
	10735	@deffn {Directive} %output "@var{file}"
	10736	Bison declaration to set the name of the parser implementation file.
	10737	@xref{Decl Summary}.
	10738	@end deffn
	10739
	10740	@deffn {Directive} %parse-param @{@var{argument-declaration}@}
	10741	Bison declaration to specify an additional parameter that
	10742	@code{yyparse} should accept. @xref{Parser Function,, The Parser
	10743	Function @code{yyparse}}.
	10744	@end deffn
	10745
	10746	@deffn {Directive} %prec
	10747	Bison declaration to assign a precedence to a specific rule.
	10748	@xref{Contextual Precedence, ,Context-Dependent Precedence}.
	10749	@end deffn
	10750
	10751	@deffn {Directive} %pure-parser
	10752	Deprecated version of @code{%define api.pure} (@pxref{%define
	10753	Summary,,api.pure}), for which Bison is more careful to warn about
	10754	unreasonable usage.
	10755	@end deffn
	10756
	10757	@deffn {Directive} %require "@var{version}"
	10758	Require version @var{version} or higher of Bison. @xref{Require Decl, ,
	10759	Require a Version of Bison}.
	10760	@end deffn
	10761
	10762	@deffn {Directive} %right
	10763	Bison declaration to assign right associativity to token(s).
	10764	@xref{Precedence Decl, ,Operator Precedence}.
	10765	@end deffn
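		
		The precedence directives (@code{%left}, @code{%right}, and @code{%prec})
		are typically used together. The fragment below is a sketch modeled on a
		small calculator grammar; the token names @code{NUM} and @code{NEG} and
		the nonterminal @code{exp} are assumptions made for illustration.
		Directives that appear later declare higher precedence.
		
		@example
		%token NUM
		%left '+' '-'
		%left '*' '/'
		%left NEG       /* unary minus; used only through %prec */
		%right '^'      /* exponentiation */
		%%
		exp:
		  NUM
		| exp '+' exp
		| exp '-' exp
		| exp '*' exp
		| exp '/' exp
		| '-' exp  %prec NEG
		| exp '^' exp
		;
		@end example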
	10766
	10767	@deffn {Directive} %skeleton
	10768	Specify the skeleton to use; usually for development.
	10769	@xref{Decl Summary}.
	10770	@end deffn
	10771
	10772	@deffn {Directive} %start
	10773	Bison declaration to specify the start symbol. @xref{Start Decl, ,The
	10774	Start-Symbol}.
	10775	@end deffn
	10776
	10777	@deffn {Directive} %token
	10778	Bison declaration to declare token(s) without specifying precedence.
	10779	@xref{Token Decl, ,Token Type Names}.
	10780	@end deffn
	10781
	10782	@deffn {Directive} %token-table
	10783	Bison declaration to include a token name table in the parser
	10784	implementation file. @xref{Decl Summary}.
	10785	@end deffn
	10786
	10787	@deffn {Directive} %type
	10788	Bison declaration to declare nonterminals. @xref{Type Decl,
	10789	,Nonterminal Symbols}.
	10790	@end deffn
	10791
	10792	@deffn {Symbol} $undefined
	10793	The predefined token onto which all undefined values returned by
	10794	@code{yylex} are mapped. It cannot be used in the grammar; rather, use
	10795	@code{error}.
	10796	@end deffn
	10797
	10798	@deffn {Directive} %union
	10799	Bison declaration to specify several possible data types for semantic
	10800	values. @xref{Union Decl, ,The Collection of Value Types}.
	10801	@end deffn
	10802
	10803	@deffn {Macro} YYABORT
	10804	Macro to pretend that an unrecoverable syntax error has occurred, by
	10805	making @code{yyparse} return 1 immediately. The error reporting
	10806	function @code{yyerror} is not called. @xref{Parser Function, ,The
	10807	Parser Function @code{yyparse}}.
	10808
	10809	For Java parsers, this functionality is invoked using @code{return YYABORT;}
	10810	instead.
	10811	@end deffn
	10812
	10813	@deffn {Macro} YYACCEPT
	10814	Macro to pretend that a complete utterance of the language has been
	10815	read, by making @code{yyparse} return 0 immediately.
	10816	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	10817
	10818	For Java parsers, this functionality is invoked using @code{return YYACCEPT;}
	10819	instead.
	10820	@end deffn
	10821
	10822	@deffn {Macro} YYBACKUP
	10823	Macro to discard a value from the parser stack and fake a lookahead
	10824	token. @xref{Action Features, ,Special Features for Use in Actions}.
	10825	@end deffn
	10826
	10827	@deffn {Variable} yychar
	10828	External integer variable that contains the integer value of the
	10829	lookahead token. (In a pure parser, it is a local variable within
	10830	@code{yyparse}.) Error-recovery rule actions may examine this variable.
	10831	@xref{Action Features, ,Special Features for Use in Actions}.
	10832	@end deffn
	10833
	10834	@deffn {Macro} yyclearin
	10835	Macro used in error-recovery rule actions. It clears the previous
	10836	lookahead token. @xref{Error Recovery}.
	10837	@end deffn
	10838
	10839	@deffn {Macro} YYDEBUG
	10840	Macro to define to equip the parser with tracing code. @xref{Tracing,
	10841	,Tracing Your Parser}.
	10842	@end deffn
	10843
	10844	@deffn {Variable} yydebug
	10845	External integer variable set to zero by default. If @code{yydebug}
	10846	is given a nonzero value, the parser will output information on input
	10847	symbols and parser action. @xref{Tracing, ,Tracing Your Parser}.
	10848	@end deffn
	10849
	10850	@deffn {Macro} yyerrok
	10851	Macro to cause the parser to recover immediately to its normal mode
	10852	after a syntax error. @xref{Error Recovery}.
	10853	@end deffn
	10854
	10855	@deffn {Macro} YYERROR
	10856	Macro to pretend that a syntax error has just been detected: perform
	10857	normal error recovery if possible (@pxref{Error Recovery}), or (if
	10858	recovery is impossible) make @code{yyparse} return 1. @code{yyerror}
	10859	is not called automatically. @xref{Action Features, ,Special Features for Use in Actions}.
	10860
	10861	For Java parsers, this functionality is invoked using @code{return YYERROR;}
	10862	instead.
	10863	@end deffn
	10864
	10865	@deffn {Function} yyerror
	10866	User-supplied function to be called by @code{yyparse} on error.
	10867	@xref{Error Reporting, ,The Error
	10868	Reporting Function @code{yyerror}}.
	10869	@end deffn
	10870
	10871	@deffn {Macro} YYERROR_VERBOSE
	10872	An obsolete macro that you define with @code{#define} in the prologue
	10873	to request verbose, specific error message strings
	10874	when @code{yyerror} is called. It doesn't matter what definition you
	10875	use for @code{YYERROR_VERBOSE}, just whether you define it. Using
	10876	@code{%error-verbose} is preferred. @xref{Error Reporting}.
	10877	@end deffn
	10878
	10879	@deffn {Macro} YYINITDEPTH
	10880	Macro for specifying the initial size of the parser stack.
	10881	@xref{Memory Management}.
	10882	@end deffn
	10883
	10884	@deffn {Function} yylex
	10885	User-supplied lexical analyzer function, called with no arguments to get
	10886	the next token. @xref{Lexical, ,The Lexical Analyzer Function
	10887	@code{yylex}}.
	10888	@end deffn
	10889
	10890	@deffn {Macro} YYLEX_PARAM
	10891	An obsolete macro for specifying an extra argument (or list of extra
	10892	arguments) for @code{yyparse} to pass to @code{yylex}. The use of this
	10893	macro is deprecated, and is supported only for Yacc-like parsers.
	10894	@xref{Pure Calling,, Calling Conventions for Pure Parsers}.
	10895	@end deffn
	10896
	10897	@deffn {Variable} yylloc
	10898	External variable in which @code{yylex} should place the line and column
	10899	numbers associated with a token. (In a pure parser, it is a local
	10900	variable within @code{yyparse}, and its address is passed to
	10901	@code{yylex}.)
	10902	You can ignore this variable if you don't use the @samp{@@} feature in the
	10903	grammar actions.
	10904	@xref{Token Locations, ,Textual Locations of Tokens}.
	10905	In semantic actions, it stores the location of the lookahead token.
	10906	@xref{Actions and Locations, ,Actions and Locations}.
	10907	@end deffn
	10908
	10909	@deffn {Type} YYLTYPE
	10910	Data type of @code{yylloc}; by default, a structure with four
	10911	members. @xref{Location Type, , Data Types of Locations}.
	10912	@end deffn
	10913
	10914	@deffn {Variable} yylval
	10915	External variable in which @code{yylex} should place the semantic
	10916	value associated with a token. (In a pure parser, it is a local
	10917	variable within @code{yyparse}, and its address is passed to
	10918	@code{yylex}.)
	10919	@xref{Token Values, ,Semantic Values of Tokens}.
	10920	In semantic actions, it stores the semantic value of the lookahead token.
	10921	@xref{Actions, ,Actions}.
	10922	@end deffn
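		
		To show how @code{yylex}, @code{yylval}, and @code{yyerror} fit together
		in a non-pure parser, here is a small but complete grammar file. It is
		only a sketch: the token name @code{NUM} and the nonterminal @code{input}
		are hypothetical, and it relies on the default @code{YYSTYPE} of
		@code{int}.
		
		@example
		%@{
		#include <ctype.h>
		#include <stdio.h>
		int yylex (void);
		void yyerror (char const *);
		%@}
		
		%token NUM
		
		%%
		input:
		  /* empty */
		| input NUM   @{ printf ("%d\n", $2); @}
		;
		%%
		
		/* Return the next token; for NUM, store its value in yylval.  */
		int
		yylex (void)
		@{
		  int c = getchar ();
		  while (isspace (c))
		    c = getchar ();
		  if (c == EOF)
		    return 0;           /* 0 tells the parser the input has ended */
		  if (isdigit (c))
		    @{
		      ungetc (c, stdin);
		      if (scanf ("%d", &yylval) != 1)
		        return 0;
		      return NUM;
		    @}
		  return c;             /* any other character is its own token */
		@}
		
		void
		yyerror (char const *s)
		@{
		  fprintf (stderr, "%s\n", s);
		@}
		
		int
		main (void)
		@{
		  return yyparse ();
		@}
		@end example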
	10923
	10924	@deffn {Macro} YYMAXDEPTH
	10925	Macro for specifying the maximum size of the parser stack. @xref{Memory
	10926	Management}.
	10927	@end deffn
	10928
	10929	@deffn {Variable} yynerrs
	10930	Global variable which Bison increments each time it reports a syntax error.
	10931	(In a pure parser, it is a local variable within @code{yyparse}. In a
	10932	pure push parser, it is a member of @code{yypstate}.)
	10933	@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
	10934	@end deffn
	10935
	10936	@deffn {Function} yyparse
	10937	The parser function produced by Bison; call this function to start
	10938	parsing. @xref{Parser Function, ,The Parser Function @code{yyparse}}.
	10939	@end deffn
	10940
	10941	@deffn {Function} yypstate_delete
	10942	The function to delete a parser instance, produced by Bison in push mode;
	10943	call this function to delete the memory associated with a parser.
	10944	@xref{Parser Delete Function, ,The Parser Delete Function
	10945	@code{yypstate_delete}}.
	10946	(The current push parsing interface is experimental and may evolve.
	10947	More user feedback will help to stabilize it.)
	10948	@end deffn
	10949
	10950	@deffn {Function} yypstate_new
	10951	The function to create a parser instance, produced by Bison in push mode;
	10952	call this function to create a new parser.
	10953	@xref{Parser Create Function, ,The Parser Create Function
	10954	@code{yypstate_new}}.
	10955	(The current push parsing interface is experimental and may evolve.
	10956	More user feedback will help to stabilize it.)
	10957	@end deffn
	10958
	10959	@deffn {Function} yypull_parse
	10960	The parser function produced by Bison in push mode; call this function to
	10961	parse the rest of the input stream.
	10962	@xref{Pull Parser Function, ,The Pull Parser Function
	10963	@code{yypull_parse}}.
	10964	(The current push parsing interface is experimental and may evolve.
	10965	More user feedback will help to stabilize it.)
	10966	@end deffn
	10967
	10968	@deffn {Function} yypush_parse
	10969	The parser function produced by Bison in push mode; call this function to
	10970	parse a single token. @xref{Push Parser Function, ,The Push Parser Function
	10971	@code{yypush_parse}}.
	10972	(The current push parsing interface is experimental and may evolve.
	10973	More user feedback will help to stabilize it.)
	10974	@end deffn
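		
		The push-mode functions above are usually combined in a loop like the
		following sketch. It assumes that the grammar contains
		@samp{%define api.pure} and @samp{%define api.push-pull push}, that the
		user's @code{yylex} takes no arguments, and that tokens carry no semantic
		values (hence the null pointer). The function name @code{parse_all} is
		hypothetical, and because the push interface is experimental the exact
		signatures may differ in your version of Bison.
		
		@example
		/* Feed tokens to the push parser one at a time.  */
		int
		parse_all (void)
		@{
		  int status;
		  yypstate *ps = yypstate_new ();
		  if (!ps)
		    return 2;           /* no memory for a parser instance */
		  do
		    @{
		      status = yypush_parse (ps, yylex (), NULL);
		    @}
		  while (status == YYPUSH_MORE);
		  yypstate_delete (ps);
		  return status;
		@}
		@end example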
	10975
	10976	@deffn {Macro} YYPARSE_PARAM
	10977	An obsolete macro for specifying the name of a parameter that
	10978	@code{yyparse} should accept. The use of this macro is deprecated, and
	10979	is supported only for Yacc-like parsers. @xref{Pure Calling,, Calling
	10980	Conventions for Pure Parsers}.
	10981	@end deffn
	10982
	10983	@deffn {Macro} YYRECOVERING
	10984	The expression @code{YYRECOVERING ()} yields 1 when the parser
	10985	is recovering from a syntax error, and 0 otherwise.
	10986	@xref{Action Features, ,Special Features for Use in Actions}.
	10987	@end deffn
	10988
	10989	@deffn {Macro} YYSTACK_USE_ALLOCA
	10990	Macro used to control the use of @code{alloca} when the
	10991	deterministic parser in C needs to extend its stacks. If defined to 0,
	10992	the parser will use @code{malloc} to extend its stacks. If defined to
	10993	1, the parser will use @code{alloca}. Values other than 0 and 1 are
	10994	reserved for future Bison extensions. If not defined,
	10995	@code{YYSTACK_USE_ALLOCA} defaults to 0.
	10996
	10997	In the all-too-common case where your code may run on a host with a
	10998	limited stack and with unreliable stack-overflow checking, you should
	10999	set @code{YYMAXDEPTH} to a value that cannot possibly result in
	11000	unchecked stack overflow on any of your target hosts when
	11001	@code{alloca} is called. You can inspect the code that Bison
	11002	generates in order to determine the proper numeric values. This will
	11003	require some expertise in low-level implementation details.
	11004	@end deffn
	11005
	11006	@deffn {Type} YYSTYPE
	11007	Data type of semantic values; @code{int} by default.
	11008	@xref{Value Type, ,Data Types of Semantic Values}.
	11009	@end deffn
	11010
	11011	@node Glossary
	11012	@appendix Glossary
	11013	@cindex glossary
	11014
	11015	@table @asis
	11016	@item Accepting state
	11017	A state whose only action is the accept action.
	11018	The accepting state is thus a consistent state.
	11019	@xref{Understanding,,}.
	11020
	11021	@item Backus-Naur Form (BNF; also called ``Backus Normal Form'')
	11022	Formal method of specifying context-free grammars originally proposed
	11023	by John Backus, and slightly improved by Peter Naur in his 1960-01-02
	11024	committee document contributing to what became the Algol 60 report.
	11025	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	11026
	11027	@item Consistent state
	11028	A state containing only one possible action. @xref{Default Reductions}.
	11029
	11030	@item Context-free grammars
	11031	Grammars specified as rules that can be applied regardless of context.
	11032	Thus, if there is a rule which says that an integer can be used as an
	11033	expression, integers are allowed @emph{anywhere} an expression is
	11034	permitted. @xref{Language and Grammar, ,Languages and Context-Free
	11035	Grammars}.
	11036
	11037	@item Default reduction
	11038	The reduction that a parser should perform if the current parser state
	11039	contains no other action for the lookahead token. In permitted parser
	11040	states, Bison declares the reduction with the largest lookahead set to be
	11041	the default reduction and removes that lookahead set. @xref{Default
	11042	Reductions}.
	11043
	11044	@item Defaulted state
	11045	A consistent state with a default reduction. @xref{Default Reductions}.
	11046
	11047	@item Dynamic allocation
	11048	Allocation of memory that occurs during execution, rather than at
	11049	compile time or on entry to a function.
	11050
	11051	@item Empty string
	11052	Analogous to the empty set in set theory, the empty string is a
	11053	character string of length zero.
	11054
	11055	@item Finite-state stack machine
	11056	A ``machine'' that has discrete states in which it is said to exist at
	11057	each instant in time. As input to the machine is processed, the
	11058	machine moves from state to state as specified by the logic of the
	11059	machine. In the case of the parser, the input is the language being
	11060	parsed, and the states correspond to various stages in the grammar
	11061	rules. @xref{Algorithm, ,The Bison Parser Algorithm}.
	11062
	11063	@item Generalized LR (GLR)
	11064	A parsing algorithm that can handle all context-free grammars, including those
	11065	that are not LR(1). It resolves situations that Bison's
	11066	deterministic parsing
	11067	algorithm cannot by effectively splitting off multiple parsers, trying all
	11068	possible parsers, and discarding those that fail in the light of additional
	11069	right context. @xref{Generalized LR Parsing, ,Generalized
	11070	LR Parsing}.
	11071
	11072	@item Grouping
	11073	A language construct that is (in general) grammatically divisible;
	11074	for example, `expression' or `declaration' in C@.
	11075	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	11076
	11077	@item IELR(1) (Inadequacy Elimination LR(1))
	11078	A minimal LR(1) parser table construction algorithm. That is, given any
	11079	context-free grammar, IELR(1) generates parser tables with the full
	11080	language-recognition power of canonical LR(1) but with nearly the same
	11081	number of parser states as LALR(1). This reduction in parser states is
	11082	often an order of magnitude. More importantly, because canonical LR(1)'s
	11083	extra parser states may contain duplicate conflicts in the case of non-LR(1)
	11084	grammars, the number of conflicts for IELR(1) is often an order of magnitude
	11085	less as well. This can significantly reduce the complexity of developing a
	11086	grammar. @xref{LR Table Construction}.
	11087
	11088	@item Infix operator
	11089	An arithmetic operator that is placed between the operands on which it
	11090	performs some operation.
	11091
	11092	@item Input stream
	11093	A continuous flow of data between devices or programs.
	11094
	11095	@item LAC (Lookahead Correction)
	11096	A parsing mechanism that fixes the problem of delayed syntax error
	11097	detection, which is caused by LR state merging, default reductions, and the
	11098	use of @code{%nonassoc}. Delayed syntax error detection results in
	11099	unexpected semantic actions, initiation of error recovery in the wrong
	11100	syntactic context, and an incorrect list of expected tokens in a verbose
	11101	syntax error message. @xref{LAC}.
	11102
	11103	@item Language construct
	11104	One of the typical usage schemas of the language. For example, one of
	11105	the constructs of the C language is the @code{if} statement.
	11106	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	11107
	11108	@item Left associativity
	11109	Operators having left associativity are analyzed from left to right:
	11110	@samp{a+b+c} first computes @samp{a+b} and then combines with
	11111	@samp{c}. @xref{Precedence, ,Operator Precedence}.
	11112
	11113	@item Left recursion
	11114	A rule whose result symbol is also its first component symbol; for
	11115	example, @samp{expseq1 : expseq1 ',' exp;}. @xref{Recursion, ,Recursive
	11116	Rules}.
	11117
	11118	@item Left-to-right parsing
	11119	Parsing a sentence of a language by analyzing it token by token from
	11120	left to right. @xref{Algorithm, ,The Bison Parser Algorithm}.
	11121
	11122	@item Lexical analyzer (scanner)
	11123	A function that reads an input stream and returns tokens one by one.
	11124	@xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}.
	11125
	11126	@item Lexical tie-in
	11127	A flag, set by actions in the grammar rules, which alters the way
	11128	tokens are parsed. @xref{Lexical Tie-ins}.
	11129
	11130	@item Literal string token
	11131	A token which consists of two or more fixed characters. @xref{Symbols}.
	11132
	11133	@item Lookahead token
	11134	A token already read but not yet shifted. @xref{Lookahead, ,Lookahead
	11135	Tokens}.
	11136
	11137	@item LALR(1)
	11138	The class of context-free grammars that Bison (like most other parser
	11139	generators) can handle by default; a subset of LR(1).
	11140	@xref{Mysterious Conflicts}.
	11141
	11142	@item LR(1)
	11143	The class of context-free grammars in which at most one token of
	11144	lookahead is needed to disambiguate the parsing of any piece of input.
	11145
	11146	@item Nonterminal symbol
	11147	A grammar symbol standing for a grammatical construct that can
	11148	be expressed through rules in terms of smaller constructs; in other
	11149	words, a construct that is not a token. @xref{Symbols}.
	11150
	11151	@item Parser
	11152	A function that recognizes valid sentences of a language by analyzing
	11153	the syntax structure of a set of tokens passed to it from a lexical
	11154	analyzer.
	11155
	11156	@item Postfix operator
	11157	An arithmetic operator that is placed after the operands upon which it
	11158	performs some operation.
	11159
	11160	@item Reduction
	11161	Replacing a string of nonterminals and/or terminals with a single
	11162	nonterminal, according to a grammar rule. @xref{Algorithm, ,The Bison
	11163	Parser Algorithm}.
	11164
	11165	@item Reentrant
	11166	A reentrant subprogram is a subprogram which can be invoked any
	11167	number of times in parallel, without interference between the various
	11168	invocations. @xref{Pure Decl, ,A Pure (Reentrant) Parser}.
	11169
	11170	@item Reverse Polish notation
	11171	A language in which all operators are postfix operators.
	11172
	11173	@item Right recursion
	11174	A rule whose result symbol is also its last component symbol; for
	11175	example, @samp{expseq1: exp ',' expseq1;}. @xref{Recursion, ,Recursive
	11176	Rules}.
	11177
	11178	@item Semantics
	11179	In computer languages, the semantics are specified by the actions
	11180	taken for each instance of the language, i.e., the meaning of
	11181	each statement. @xref{Semantics, ,Defining Language Semantics}.
	11182
	11183	@item Shift
	11184	A parser is said to shift when it makes the choice of analyzing
	11185	further input from the stream rather than reducing immediately some
	11186	already-recognized rule. @xref{Algorithm, ,The Bison Parser Algorithm}.
	11187
	11188	@item Single-character literal
	11189	A single character that is recognized and interpreted as is.
	11190	@xref{Grammar in Bison, ,From Formal Rules to Bison Input}.
	11191
	11192	@item Start symbol
	11193	The nonterminal symbol that stands for a complete valid utterance in
	11194	the language being parsed. The start symbol is usually listed as the
	11195	first nonterminal symbol in a language specification.
	11196	@xref{Start Decl, ,The Start-Symbol}.
	11197
	11198	@item Symbol table
	11199	A data structure where symbol names and associated data are stored
	11200	during parsing to allow for recognition and use of existing
	11201	information in repeated uses of a symbol. @xref{Multi-function Calc}.
	11202
	11203	@item Syntax error
	11204	An error encountered during parsing of an input stream due to invalid
	11205	syntax. @xref{Error Recovery}.
	11206
	11207	@item Token
	11208	A basic, grammatically indivisible unit of a language. The symbol
	11209	that describes a token in the grammar is a terminal symbol.
	11210	The input of the Bison parser is a stream of tokens which comes from
	11211	the lexical analyzer. @xref{Symbols}.
	11212
	11213	@item Terminal symbol
	11214	A grammar symbol that has no rules in the grammar and therefore is
	11215	grammatically indivisible. The piece of text it represents is a token.
	11216	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	11217
	11218	@item Unreachable state
	11219	A parser state that cannot be reached by any sequence of transitions from
	11220	the parser's start state. A state can become unreachable during conflict
	11221	resolution. @xref{Unreachable States}.
	11222	@end table
	11223
	11224	@node Copying This Manual
	11225	@appendix Copying This Manual
	11226	@include fdl.texi
	11227
	11228	@node Bibliography
	11229	@unnumbered Bibliography
	11230
	11231	@table @asis
	11232	@item [Denny 2008]
	11233	Joel E. Denny and Brian A. Malloy, IELR(1): Practical LR(1) Parser Tables
	11234	for Non-LR(1) Grammars with Conflict Resolution, in @cite{Proceedings of the
	11235	2008 ACM Symposium on Applied Computing} (SAC'08), ACM, New York, NY, USA,
	11236	pp.@: 240--245. @uref{http://dx.doi.org/10.1145/1363686.1363747}
	11237
	11238	@item [Denny 2010 May]
	11239	Joel E. Denny, PSLR(1): Pseudo-Scannerless Minimal LR(1) for the
	11240	Deterministic Parsing of Composite Languages, Ph.D. Dissertation, Clemson
	11241	University, Clemson, SC, USA (May 2010).
	11242	@uref{http://proquest.umi.com/pqdlink?did=2041473591&Fmt=7&clientId=79356&RQT=309&VName=PQD}
	11243
	11244	@item [Denny 2010 November]
	11245	Joel E. Denny and Brian A. Malloy, The IELR(1) Algorithm for Generating
	11246	Minimal LR(1) Parser Tables for Non-LR(1) Grammars with Conflict Resolution,
	11247	in @cite{Science of Computer Programming}, Vol.@: 75, Issue 11 (November
	11248	2010), pp.@: 943--979. @uref{http://dx.doi.org/10.1016/j.scico.2009.08.001}
	11249
	11250	@item [DeRemer 1982]
	11251	Frank DeRemer and Thomas Pennello, Efficient Computation of LALR(1)
	11252	Look-Ahead Sets, in @cite{ACM Transactions on Programming Languages and
	11253	Systems}, Vol.@: 4, No.@: 4 (October 1982), pp.@:
	11254	615--649. @uref{http://dx.doi.org/10.1145/69622.357187}
	11255
	11256	@item [Knuth 1965]
	11257	Donald E. Knuth, On the Translation of Languages from Left to Right, in
	11258	@cite{Information and Control}, Vol.@: 8, Issue 6 (December 1965), pp.@:
	11259	607--639. @uref{http://dx.doi.org/10.1016/S0019-9958(65)90426-2}
	11260
	11261	@item [Scott 2000]
	11262	Elizabeth Scott, Adrian Johnstone, and Shamsa Sadaf Hussain,
	11263	@cite{Tomita-Style Generalised LR Parsers}, Royal Holloway, University of
	11264	London, Department of Computer Science, TR-00-12 (December 2000).
	11265	@uref{http://www.cs.rhul.ac.uk/research/languages/publications/tomita_style_1.ps}
	11266	@end table
	11267
	11268	@node Index
	11269	@unnumbered Index
	11270
	11271	@printindex cp
	11272
	11273	@bye
	11274
	11275	@c LocalWords: texinfo setfilename settitle setchapternewpage finalout texi FSF
	11276	@c LocalWords: ifinfo smallbook shorttitlepage titlepage GPL FIXME iftex FSF's
	11277	@c LocalWords: akim fn cp syncodeindex vr tp synindex dircategory direntry Naur
	11278	@c LocalWords: ifset vskip pt filll insertcopying sp ISBN Etienne Suvasa Multi
	11279	@c LocalWords: ifnottex yyparse detailmenu GLR RPN Calc var Decls Rpcalc multi
	11280	@c LocalWords: rpcalc Lexer Expr ltcalc mfcalc yylex defaultprec Donnelly Gotos
	11281	@c LocalWords: yyerror pxref LR yylval cindex dfn LALR samp gpl BNF xref yypush
	11282	@c LocalWords: const int paren ifnotinfo AC noindent emph expr stmt findex lr
	11283	@c LocalWords: glr YYSTYPE TYPENAME prog dprec printf decl init stmtMerge POSIX
	11284	@c LocalWords: pre STDC GNUC endif yy YY alloca lf stddef stdlib YYDEBUG yypull
	11285	@c LocalWords: NUM exp subsubsection kbd Ctrl ctype EOF getchar isdigit nonfree
	11286	@c LocalWords: ungetc stdin scanf sc calc ulator ls lm cc NEG prec yyerrok rr
	11287	@c LocalWords: longjmp fprintf stderr yylloc YYLTYPE cos ln Stallman Destructor
	11288	@c LocalWords: symrec val tptr FNCT fnctptr func struct sym enum IEC syntaxes
	11289	@c LocalWords: fnct putsym getsym fname arith fncts atan ptr malloc sizeof Lex
	11290	@c LocalWords: strlen strcpy fctn strcmp isalpha symbuf realloc isalnum DOTDOT
	11291	@c LocalWords: ptypes itype YYPRINT trigraphs yytname expseq vindex dtype Unary
	11292	@c LocalWords: Rhs YYRHSLOC LE nonassoc op deffn typeless yynerrs nonterminal
	11293	@c LocalWords: yychar yydebug msg YYNTOKENS YYNNTS YYNRULES YYNSTATES reentrant
	11294	@c LocalWords: cparse clex deftypefun NE defmac YYACCEPT YYABORT param yypstate
	11295	@c LocalWords: strncmp intval tindex lvalp locp llocp typealt YYBACKUP subrange
	11296	@c LocalWords: YYEMPTY YYEOF YYRECOVERING yyclearin GE def UMINUS maybeword loc
	11297	@c LocalWords: Johnstone Shamsa Sadaf Hussain Tomita TR uref YYMAXDEPTH inline
	11298	@c LocalWords: YYINITDEPTH stmts ref initdcl maybeasm notype Lookahead yyoutput
	11299	@c LocalWords: hexflag STR exdent itemset asis DYYDEBUG YYFPRINTF args Autoconf
	11300	@c LocalWords: infile ypp yxx outfile itemx tex leaderfill Troubleshouting sqrt
	11301	@c LocalWords: hbox hss hfill tt ly yyin fopen fclose ofirst gcc ll lookahead
	11302	@c LocalWords: nbar yytext fst snd osplit ntwo strdup AST Troublereporting th
	11303	@c LocalWords: YYSTACK DVI fdl printindex IELR nondeterministic nonterminals ps
	11304	@c LocalWords: subexpressions declarator nondeferred config libintl postfix LAC
	11305	@c LocalWords: preprocessor nonpositive unary nonnumeric typedef extern rhs sr
	11306	@c LocalWords: yytokentype destructor multicharacter nonnull EBCDIC nterm LR's
	11307	@c LocalWords: lvalue nonnegative XNUM CHR chr TAGLESS tagless stdout api TOK
	11308	@c LocalWords: destructors Reentrancy nonreentrant subgrammar nonassociative Ph
	11309	@c LocalWords: deffnx namespace xml goto lalr ielr runtime lex yacc yyps env
	11310	@c LocalWords: yystate variadic Unshift NLS gettext po UTF Automake LOCALEDIR
	11311	@c LocalWords: YYENABLE bindtextdomain Makefile DEFS CPPFLAGS DBISON DeRemer
	11312	@c LocalWords: autoreconf Pennello multisets nondeterminism Generalised baz ACM
	11313	@c LocalWords: redeclare automata Dparse localedir datadir XSLT midrule Wno
	11314	@c LocalWords: Graphviz multitable headitem hh basename Doxygen fno filename
	11315	@c LocalWords: doxygen ival sval deftypemethod deallocate pos deftypemethodx
	11316	@c LocalWords: Ctor defcv defcvx arg accessors arithmetics CPP ifndef CALCXX
	11317	@c LocalWords: lexer's calcxx bool LPAREN RPAREN deallocation cerrno climits
	11318	@c LocalWords: cstdlib Debian undef yywrap unput noyywrap nounput zA yyleng
	11319	@c LocalWords: errno strtol ERANGE str strerror iostream argc argv Javadoc PSLR
	11320	@c LocalWords: bytecode initializers superclass stype ASTNode autoboxing nls
	11321	@c LocalWords: toString deftypeivar deftypeivarx deftypeop YYParser strictfp
	11322	@c LocalWords: superclasses boolean getErrorVerbose setErrorVerbose deftypecv
	11323	@c LocalWords: getDebugStream setDebugStream getDebugLevel setDebugLevel url
	11324	@c LocalWords: bisonVersion deftypecvx bisonSkeleton getStartPos getEndPos
	11325	@c LocalWords: getLVal defvar deftypefn deftypefnx gotos msgfmt Corbett LALR's
	11326	@c LocalWords: subdirectory Solaris nonassociativity perror schemas Malloy
	11327	@c LocalWords: Scannerless ispell american
	11328
	11329	@c Local Variables:
	11330	@c ispell-dictionary: "american"
	11331	@c fill-column: 76
	11332	@c End: