Commit	Line	Data
	1	\input texinfo @c --texinfo--
	2	@comment %**start of header
	3	@setfilename bison.info
	4	@include version.texi
	5	@settitle Bison @value{VERSION}
	6	@setchapternewpage odd
	7
	8	@finalout
	9
	10	@c SMALL BOOK version
	11	@c This edition has been formatted so that you can format and print it in
	12	@c the smallbook format.
	13	@c @smallbook
	14
	15	@c Set following if you want to document %default-prec and %no-default-prec.
	16	@c This feature is experimental and may change in future Bison versions.
	17	@c @set defaultprec
	18
	19	@ifnotinfo
	20	@syncodeindex fn cp
	21	@syncodeindex vr cp
	22	@syncodeindex tp cp
	23	@end ifnotinfo
	24	@ifinfo
	25	@synindex fn cp
	26	@synindex vr cp
	27	@synindex tp cp
	28	@end ifinfo
	29	@comment %**end of header
	30
	31	@copying
	32
	33	This manual (@value{UPDATED}) is for GNU Bison (version
	34	@value{VERSION}), the GNU parser generator.
	35
	36	Copyright @copyright{} 1988-1993, 1995, 1998-2013 Free Software
	37	Foundation, Inc.
	38
	39	@quotation
	40	Permission is granted to copy, distribute and/or modify this document
	41	under the terms of the GNU Free Documentation License,
	42	Version 1.3 or any later version published by the Free Software
	43	Foundation; with no Invariant Sections, with the Front-Cover texts
	44	being ``A GNU Manual,'' and with the Back-Cover Texts as in
	45	(a) below. A copy of the license is included in the section entitled
	46	``GNU Free Documentation License.''
	47
	48	(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
	49	modify this GNU manual. Buying copies from the FSF
	50	supports it in developing GNU and promoting software
	51	freedom.''
	52	@end quotation
	53	@end copying
	54
	55	@dircategory Software development
	56	@direntry
	57	* bison: (bison). GNU parser generator (Yacc replacement).
	58	@end direntry
	59
	60	@titlepage
	61	@title Bison
	62	@subtitle The Yacc-compatible Parser Generator
	63	@subtitle @value{UPDATED}, Bison Version @value{VERSION}
	64
	65	@author by Charles Donnelly and Richard Stallman
	66
	67	@page
	68	@vskip 0pt plus 1filll
	69	@insertcopying
	70	@sp 2
	71	Published by the Free Software Foundation @*
	72	51 Franklin Street, Fifth Floor @*
	73	Boston, MA 02110-1301 USA @*
	74	Printed copies are available from the Free Software Foundation.@*
	75	ISBN 1-882114-44-2
	76	@sp 2
	77	Cover art by Etienne Suvasa.
	78	@end titlepage
	79
	80	@contents
	81
	82	@ifnottex
	83	@node Top
	84	@top Bison
	85	@insertcopying
	86	@end ifnottex
	87
	88	@menu
	89	* Introduction::
	90	* Conditions::
	91	* Copying:: The GNU General Public License says
	92	how you can copy and share Bison.
	93
	94	Tutorial sections:
	95	* Concepts:: Basic concepts for understanding Bison.
	96	* Examples:: Three simple explained examples of using Bison.
	97
	98	Reference sections:
	99	* Grammar File:: Writing Bison declarations and rules.
	100	* Interface:: C-language interface to the parser function @code{yyparse}.
	101	* Algorithm:: How the Bison parser works at run-time.
	102	* Error Recovery:: Writing rules for error recovery.
	103	* Context Dependency:: What to do if your language syntax is too
	104	messy for Bison to handle straightforwardly.
	105	* Debugging:: Understanding or debugging Bison parsers.
	106	* Invocation:: How to run Bison (to produce the parser implementation).
	107	* Other Languages:: Creating C++ and Java parsers.
	108	* FAQ:: Frequently Asked Questions
	109	* Table of Symbols:: All the keywords of the Bison language are explained.
	110	* Glossary:: Basic concepts are explained.
	111	* Copying This Manual:: License for copying this manual.
	112	* Bibliography:: Publications cited in this manual.
	113	* Index of Terms:: Cross-references to the text.
	114
	115	@detailmenu
	116	--- The Detailed Node Listing ---
	117
	118	The Concepts of Bison
	119
	120	* Language and Grammar:: Languages and context-free grammars,
	121	as mathematical ideas.
	122	* Grammar in Bison:: How we represent grammars for Bison's sake.
	123	* Semantic Values:: Each token or syntactic grouping can have
	124	a semantic value (the value of an integer,
	125	the name of an identifier, etc.).
	126	* Semantic Actions:: Each rule can have an action containing C code.
	127	* GLR Parsers:: Writing parsers for general context-free languages.
	128	* Locations:: Overview of location tracking.
	129	* Bison Parser:: What are Bison's input and output,
	130	how is the output used?
	131	* Stages:: Stages in writing and running Bison grammars.
	132	* Grammar Layout:: Overall structure of a Bison grammar file.
	133
	134	Writing GLR Parsers
	135
	136	* Simple GLR Parsers:: Using GLR parsers on unambiguous grammars.
	137	* Merging GLR Parses:: Using GLR parsers to resolve ambiguities.
	138	* GLR Semantic Actions:: Considerations for semantic values and deferred actions.
	139	* Semantic Predicates:: Controlling a parse with arbitrary computations.
	140	* Compiler Requirements:: GLR parsers require a modern C compiler.
	141
	142	Examples
	143
	144	* RPN Calc:: Reverse Polish notation calculator;
	145	a first example with no operator precedence.
	146	* Infix Calc:: Infix (algebraic) notation calculator.
	147	Operator precedence is introduced.
	148	* Simple Error Recovery:: Continuing after syntax errors.
	149	* Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$.
	150	* Multi-function Calc:: Calculator with memory and trig functions.
	151	It uses multiple data-types for semantic values.
	152	* Exercises:: Ideas for improving the multi-function calculator.
	153
	154	Reverse Polish Notation Calculator
	155
	156	* Rpcalc Declarations:: Prologue (declarations) for rpcalc.
	157	* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation.
	158	* Rpcalc Lexer:: The lexical analyzer.
	159	* Rpcalc Main:: The controlling function.
	160	* Rpcalc Error:: The error reporting function.
	161	* Rpcalc Generate:: Running Bison on the grammar file.
	162	* Rpcalc Compile:: Run the C compiler on the output code.
	163
	164	Grammar Rules for @code{rpcalc}
	165
	166	* Rpcalc Input:: Explanation of the @code{input} nonterminal
	167	* Rpcalc Line:: Explanation of the @code{line} nonterminal
	168	* Rpcalc Expr:: Explanation of the @code{expr} nonterminal
	169
	170	Location Tracking Calculator: @code{ltcalc}
	171
	172	* Ltcalc Declarations:: Bison and C declarations for ltcalc.
	173	* Ltcalc Rules:: Grammar rules for ltcalc, with explanations.
	174	* Ltcalc Lexer:: The lexical analyzer.
	175
	176	Multi-Function Calculator: @code{mfcalc}
	177
	178	* Mfcalc Declarations:: Bison declarations for multi-function calculator.
	179	* Mfcalc Rules:: Grammar rules for the calculator.
	180	* Mfcalc Symbol Table:: Symbol table management subroutines.
	181	* Mfcalc Lexer:: The lexical analyzer.
	182	* Mfcalc Main:: The controlling function.
	183
	184	Bison Grammar Files
	185
	186	* Grammar Outline:: Overall layout of the grammar file.
	187	* Symbols:: Terminal and nonterminal symbols.
	188	* Rules:: How to write grammar rules.
	189	* Semantics:: Semantic values and actions.
	190	* Tracking Locations:: Locations and actions.
	191	* Named References:: Using named references in actions.
	192	* Declarations:: All kinds of Bison declarations are described here.
	193	* Multiple Parsers:: Putting more than one Bison parser in one program.
	194
	195	Outline of a Bison Grammar
	196
	197	* Prologue:: Syntax and usage of the prologue.
	198	* Prologue Alternatives:: Syntax and usage of alternatives to the prologue.
	199	* Bison Declarations:: Syntax and usage of the Bison declarations section.
	200	* Grammar Rules:: Syntax and usage of the grammar rules section.
	201	* Epilogue:: Syntax and usage of the epilogue.
	202
	203	Grammar Rules
	204
	205	* Rules Syntax:: Syntax of the rules.
	206	* Empty Rules:: Symbols that can match the empty string.
	207	* Recursion:: Writing recursive rules.
	208
	209
	210	Defining Language Semantics
	211
	212	* Value Type:: Specifying one data type for all semantic values.
	213	* Multiple Types:: Specifying several alternative data types.
	214	* Type Generation:: Generating the semantic value type.
	215	* Union Decl:: Declaring the set of all semantic value types.
	216	* Structured Value Type:: Providing a structured semantic value type.
	217	* Actions:: An action is the semantic definition of a grammar rule.
	218	* Action Types:: Specifying data types for actions to operate on.
	219	* Mid-Rule Actions:: Most actions go at the end of a rule.
	220	This says when, why and how to use the exceptional
	221	action in the middle of a rule.
	222
	223	Actions in Mid-Rule
	224
	225	* Using Mid-Rule Actions:: Putting an action in the middle of a rule.
	226	* Mid-Rule Action Translation:: How mid-rule actions are actually processed.
	227	* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts.
	228
	229	Tracking Locations
	230
	231	* Location Type:: Specifying a data type for locations.
	232	* Actions and Locations:: Using locations in actions.
	233	* Location Default Action:: Defining a general way to compute locations.
	234
	235	Bison Declarations
	236
	237	* Require Decl:: Requiring a Bison version.
	238	* Token Decl:: Declaring terminal symbols.
	239	* Precedence Decl:: Declaring terminals with precedence and associativity.
	240	* Type Decl:: Declaring the choice of type for a nonterminal symbol.
	241	* Initial Action Decl:: Code run before parsing starts.
	242	* Destructor Decl:: Declaring how symbols are freed.
	243	* Printer Decl:: Declaring how symbol values are displayed.
	244	* Expect Decl:: Suppressing warnings about parsing conflicts.
	245	* Start Decl:: Specifying the start symbol.
	246	* Pure Decl:: Requesting a reentrant parser.
	247	* Push Decl:: Requesting a push parser.
	248	* Decl Summary:: Table of all Bison declarations.
	249	* %define Summary:: Defining variables to adjust Bison's behavior.
	250	* %code Summary:: Inserting code into the parser source.
	251
	252	Parser C-Language Interface
	253
	254	* Parser Function:: How to call @code{yyparse} and what it returns.
	255	* Push Parser Function:: How to call @code{yypush_parse} and what it returns.
	256	* Pull Parser Function:: How to call @code{yypull_parse} and what it returns.
	257	* Parser Create Function:: How to call @code{yypstate_new} and what it returns.
	258	* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns.
	259	* Lexical:: You must supply a function @code{yylex}
	260	which reads tokens.
	261	* Error Reporting:: You must supply a function @code{yyerror}.
	262	* Action Features:: Special features for use in actions.
	263	* Internationalization:: How to let the parser speak in the user's
	264	native language.
	265
	266	The Lexical Analyzer Function @code{yylex}
	267
	268	* Calling Convention:: How @code{yyparse} calls @code{yylex}.
	269	* Token Values:: How @code{yylex} must return the semantic value
	270	of the token it has read.
	271	* Token Locations:: How @code{yylex} must return the text location
	272	(line number, etc.) of the token, if the
	273	actions want that.
	274	* Pure Calling:: How the calling convention differs in a pure parser
	275	(@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
	276
	277	The Bison Parser Algorithm
	278
	279	* Lookahead:: Parser looks one token ahead when deciding what to do.
	280	* Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
	281	* Precedence:: Operator precedence works by resolving conflicts.
	282	* Contextual Precedence:: When an operator's precedence depends on context.
	283	* Parser States:: The parser is a finite-state-machine with stack.
	284	* Reduce/Reduce:: When two rules are applicable in the same situation.
	285	* Mysterious Conflicts:: Conflicts that look unjustified.
	286	* Tuning LR:: How to tune fundamental aspects of LR-based parsing.
	287	* Generalized LR Parsing:: Parsing arbitrary context-free grammars.
	288	* Memory Management:: What happens when memory is exhausted. How to avoid it.
	289
	290	Operator Precedence
	291
	292	* Why Precedence:: An example showing why precedence is needed.
	293	* Using Precedence:: How to specify precedence and associativity.
	294	* Precedence Only:: How to specify precedence only.
	295	* Precedence Examples:: How these features are used in the previous example.
	296	* How Precedence:: How they work.
	297	* Non Operators:: Using precedence for general conflicts.
	298
	299	Tuning LR
	300
	301	* LR Table Construction:: Choose a different construction algorithm.
	302	* Default Reductions:: Disable default reductions.
	303	* LAC:: Correct lookahead sets in the parser states.
	304	* Unreachable States:: Keep unreachable parser states for debugging.
	305
	306	Handling Context Dependencies
	307
	308	* Semantic Tokens:: Token parsing can depend on the semantic context.
	309	* Lexical Tie-ins:: Token parsing can depend on the syntactic context.
	310	* Tie-in Recovery:: Lexical tie-ins have implications for how
	311	error recovery rules must be written.
	312
	313	Debugging Your Parser
	314
	315	* Understanding:: Understanding the structure of your parser.
	316	* Graphviz:: Getting a visual representation of the parser.
	317	* Xml:: Getting a markup representation of the parser.
	318	* Tracing:: Tracing the execution of your parser.
	319
	320	Tracing Your Parser
	321
	322	* Enabling Traces:: Activating run-time trace support
	323	* Mfcalc Traces:: Extending @code{mfcalc} to support traces
	324	* The YYPRINT Macro:: Obsolete interface for semantic value reports
	325
	326	Invoking Bison
	327
	328	* Bison Options:: All the options described in detail,
	329	in alphabetical order by short options.
	330	* Option Cross Key:: Alphabetical list of long options.
	331	* Yacc Library:: Yacc-compatible @code{yylex} and @code{main}.
	332
	333	Parsers Written In Other Languages
	334
	335	* C++ Parsers:: The interface to generate C++ parser classes
	336	* Java Parsers:: The interface to generate Java parser classes
	337
	338	C++ Parsers
	339
	340	* C++ Bison Interface:: Asking for C++ parser generation
	341	* C++ Semantic Values:: %union vs. C++
	342	* C++ Location Values:: The position and location classes
	343	* C++ Parser Interface:: Instantiating and running the parser
	344	* C++ Scanner Interface:: Exchanges between yylex and parse
	345	* A Complete C++ Example:: Demonstrating their use
	346
	347	C++ Location Values
	348
	349	* C++ position:: One point in the source file
	350	* C++ location:: Two points in the source file
	351	* User Defined Location Type:: Required interface for locations
	352
	353	A Complete C++ Example
	354
	355	* Calc++ --- C++ Calculator:: The specifications
	356	* Calc++ Parsing Driver:: An active parsing context
	357	* Calc++ Parser:: A parser class
	358	* Calc++ Scanner:: A pure C++ Flex scanner
	359	* Calc++ Top Level:: Conducting the band
	360
	361	Java Parsers
	362
	363	* Java Bison Interface:: Asking for Java parser generation
	364	* Java Semantic Values:: %type and %token vs. Java
	365	* Java Location Values:: The position and location classes
	366	* Java Parser Interface:: Instantiating and running the parser
	367	* Java Scanner Interface:: Specifying the scanner for the parser
	368	* Java Action Features:: Special features for use in actions
	369	* Java Differences:: Differences between C/C++ and Java Grammars
	370	* Java Declarations Summary:: List of Bison declarations used with Java
	371
	372	Frequently Asked Questions
	373
	374	* Memory Exhausted:: Breaking the Stack Limits
	375	* How Can I Reset the Parser:: @code{yyparse} Keeps some State
	376	* Strings are Destroyed:: @code{yylval} Loses Track of Strings
	377	* Implementing Gotos/Loops:: Control Flow in the Calculator
	378	* Multiple start-symbols:: Factoring closely related grammars
	379	* Secure? Conform?:: Is Bison POSIX safe?
	380	* I can't build Bison:: Troubleshooting
	381	* Where can I find help?:: Troubleshooting
	382	* Bug Reports:: Troublereporting
	383	* More Languages:: Parsers in C++, Java, and so on
	384	* Beta Testing:: Experimenting with development versions
	385	* Mailing Lists:: Meeting other Bison users
	386
	387	Copying This Manual
	388
	389	* Copying This Manual:: License for copying this manual.
	390
	391	@end detailmenu
	392	@end menu
	393
	394	@node Introduction
	395	@unnumbered Introduction
	396	@cindex introduction
	397
	398	@dfn{Bison} is a general-purpose parser generator that converts an
	399	annotated context-free grammar into a deterministic LR or generalized
	400	LR (GLR) parser employing LALR(1) parser tables. As an experimental
	401	feature, Bison can also generate IELR(1) or canonical LR(1) parser
	402	tables. Once you are proficient with Bison, you can use it to develop
	403	a wide range of language parsers, from those used in simple desk
	404	calculators to complex programming languages.
	405
	406	Bison is upward compatible with Yacc: all properly-written Yacc
	407	grammars ought to work with Bison with no change. Anyone familiar
	408	with Yacc should be able to use Bison with little trouble. You need
	409	to be fluent in C or C++ programming in order to use Bison or to
	410	understand this manual. Java is also supported as an experimental
	411	feature.
	412
	413	We begin with tutorial chapters that explain the basic concepts of
	414	using Bison and show three explained examples, each building on the
	415	last. If you don't know Bison or Yacc, start by reading these
	416	chapters. Reference chapters follow, which describe specific aspects
	417	of Bison in detail.
	418
	419	Bison was written originally by Robert Corbett. Richard Stallman made
	420	it Yacc-compatible. Wilfred Hansen of Carnegie Mellon University
	421	added multi-character string literals and other features. Since then,
	422	Bison has grown more robust and evolved many other new features thanks
	423	to the hard work of a long list of volunteers. For details, see the
	424	@file{THANKS} and @file{ChangeLog} files included in the Bison
	425	distribution.
	426
	427	This edition corresponds to version @value{VERSION} of Bison.
	428
	429	@node Conditions
	430	@unnumbered Conditions for Using Bison
	431
	432	The distribution terms for Bison-generated parsers permit using the
	433	parsers in nonfree programs. Before Bison version 2.2, these extra
	434	permissions applied only when Bison was generating LALR(1)
	435	parsers in C@. And before Bison version 1.24, Bison-generated
	436	parsers could be used only in programs that were free software.
	437
	438	The other GNU programming tools, such as the GNU C
	439	compiler, have never
	440	had such a requirement. They could always be used for nonfree
	441	software. The reason Bison was different was not due to a special
	442	policy decision; it resulted from applying the usual General Public
	443	License to all of the Bison source code.
	444
	445	The main output of the Bison utility---the Bison parser implementation
	446	file---contains a verbatim copy of a sizable piece of Bison, which is
	447	the code for the parser's implementation. (The actions from your
	448	grammar are inserted into this implementation at one point, but most
	449	of the rest of the implementation is not changed.) When we applied
	450	the GPL terms to the skeleton code for the parser's implementation,
	451	the effect was to restrict the use of Bison output to free software.
	452
	453	We didn't change the terms because of sympathy for people who want to
	454	make software proprietary. @strong{Software should be free.} But we
	455	concluded that limiting Bison's use to free software was doing little to
	456	encourage people to make other software free. So we decided to make the
	457	practical conditions for using Bison match the practical conditions for
	458	using the other GNU tools.
	459
	460	This exception applies when Bison is generating code for a parser.
	461	You can tell whether the exception applies to a Bison output file by
	462	inspecting the file for text beginning with ``As a special
	463	exception@dots{}''. The text spells out the exact terms of the
	464	exception.
	465
	466	@node Copying
	467	@unnumbered GNU GENERAL PUBLIC LICENSE
	468	@include gpl-3.0.texi
	469
	470	@node Concepts
	471	@chapter The Concepts of Bison
	472
	473	This chapter introduces many of the basic concepts without which the
	474	details of Bison will not make sense. If you do not already know how to
	475	use Bison or Yacc, we suggest you start by reading this chapter carefully.
	476
	477	@menu
	478	* Language and Grammar:: Languages and context-free grammars,
	479	as mathematical ideas.
	480	* Grammar in Bison:: How we represent grammars for Bison's sake.
	481	* Semantic Values:: Each token or syntactic grouping can have
	482	a semantic value (the value of an integer,
	483	the name of an identifier, etc.).
	484	* Semantic Actions:: Each rule can have an action containing C code.
	485	* GLR Parsers:: Writing parsers for general context-free languages.
	486	* Locations:: Overview of location tracking.
	487	* Bison Parser:: What are Bison's input and output,
	488	how is the output used?
	489	* Stages:: Stages in writing and running Bison grammars.
	490	* Grammar Layout:: Overall structure of a Bison grammar file.
	491	@end menu
	492
	493	@node Language and Grammar
	494	@section Languages and Context-Free Grammars
	495
	496	@cindex context-free grammar
	497	@cindex grammar, context-free
	498	In order for Bison to parse a language, it must be described by a
	499	@dfn{context-free grammar}. This means that you specify one or more
	500	@dfn{syntactic groupings} and give rules for constructing them from their
	501	parts. For example, in the C language, one kind of grouping is called an
	502	`expression'. One rule for making an expression might be, ``An expression
	503	can be made of a minus sign and another expression''. Another would be,
	504	``An expression can be an integer''. As you can see, rules are often
	505	recursive, but there must be at least one rule which leads out of the
	506	recursion.
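
The two informal rules above can be sketched as a small hand-written recognizer in C. This is only an illustration of how a recursive rule and a non-recursive rule work together; it is not how a Bison-generated parser operates, and the function name @code{match_expression} is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: recognize an `expression' at the start of S,
   using the two informal rules

     expression: '-' expression    (the recursive rule)
               | integer           (the rule that leads out of the recursion)

   Return a pointer just past the expression, or NULL if neither rule
   matches.  */
static const char *
match_expression (const char *s)
{
  if (*s == '-')
    return match_expression (s + 1);  /* a minus sign and another expression */
  if (isdigit ((unsigned char) *s))
    {
      while (isdigit ((unsigned char) *s))
        s++;                          /* an integer: one or more digits */
      return s;
    }
  return NULL;                        /* neither rule applies */
}
```

With this sketch, @code{match_expression ("--42")} consumes the whole string, while @code{match_expression ("-")} returns @code{NULL} because the recursion never reaches the integer rule.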
	507
	508	@cindex BNF
	509	@cindex Backus-Naur form
	510	The most common formal system for presenting such rules for humans to read
	511	is @dfn{Backus-Naur Form} or ``BNF'', which was developed in
	512	order to specify the language Algol 60. Any grammar expressed in
	513	BNF is a context-free grammar. The input to Bison is
	514	essentially machine-readable BNF.
	515
	516	@cindex LALR grammars
	517	@cindex IELR grammars
	518	@cindex LR grammars
	519	There are various important subclasses of context-free grammars. Although
	520	it can handle almost all context-free grammars, Bison is optimized for what
	521	are called LR(1) grammars. In brief, in these grammars, it must be possible
	522	to tell how to parse any portion of an input string with just a single token
	523	of lookahead. For historical reasons, Bison by default is limited by the
	524	additional restrictions of LALR(1), which is hard to explain simply.
	525	@xref{Mysterious Conflicts}, for more information on this. As an
	526	experimental feature, you can escape these additional restrictions by
	527	requesting IELR(1) or canonical LR(1) parser tables. @xref{LR Table
	528	Construction}, to learn how.
	529
	530	@cindex GLR parsing
	531	@cindex generalized LR (GLR) parsing
	532	@cindex ambiguous grammars
	533	@cindex nondeterministic parsing
	534
	535	Parsers for LR(1) grammars are @dfn{deterministic}, meaning
	536	roughly that the next grammar rule to apply at any point in the input is
	537	uniquely determined by the preceding input and a fixed, finite portion
	538	(called a @dfn{lookahead}) of the remaining input. A context-free
	539	grammar can be @dfn{ambiguous}, meaning that there are multiple ways to
	540	apply the grammar rules to derive the same input. Even unambiguous
	541	grammars can be @dfn{nondeterministic}, meaning that no fixed
	542	lookahead always suffices to determine the next grammar rule to apply.
	543	With the proper declarations, Bison is also able to parse these more
	544	general context-free grammars, using a technique known as GLR
	545	parsing (for Generalized LR). Bison's GLR parsers
	546	are able to handle any context-free grammar for which the number of
	547	possible parses of any given string is finite.
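
To make ambiguity concrete, consider a hypothetical grammar with the rules @code{expr: expr '-' expr | INTEGER} (an assumption chosen for illustration, not a grammar taken from this manual). An input such as @samp{1 - 2 - 3} can be grouped in more than one way. The following C sketch counts the distinct parse trees for a chain of @var{n} integers; the function name is hypothetical.

```c
/* Number of distinct parse trees for N integers joined by '-' under
   the ambiguous grammar

     expr: expr '-' expr | INTEGER

   (an illustrative grammar, assumed for this sketch).  */
static unsigned long
count_parses (unsigned n)
{
  unsigned long total = 0;
  unsigned left;

  if (n <= 1)
    return n;  /* one INTEGER has exactly one parse; empty input has none */

  /* Pick which '-' is the outermost one: LEFT integers go to its
     left-hand expr, the remaining N - LEFT to its right-hand expr.  */
  for (left = 1; left < n; left++)
    total += count_parses (left) * count_parses (n - left);
  return total;
}
```

For three integers the count is 2, corresponding to the groupings @samp{(1 - 2) - 3} and @samp{1 - (2 - 3)}. The count is finite for every input length, so a grammar like this one meets the finiteness condition stated above.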
	548
	549	@cindex symbols (abstract)
	550	@cindex token
	551	@cindex syntactic grouping
	552	@cindex grouping, syntactic
	553	In the formal grammatical rules for a language, each kind of syntactic
	554	unit or grouping is named by a @dfn{symbol}. Those which are built by
	555	grouping smaller constructs according to grammatical rules are called
	556	@dfn{nonterminal symbols}; those which can't be subdivided are called
	557	@dfn{terminal symbols} or @dfn{token types}. We call a piece of input
	558	corresponding to a single terminal symbol a @dfn{token}, and a piece
	559	corresponding to a single nonterminal symbol a @dfn{grouping}.
	560
	561	We can use the C language as an example of what symbols, terminal and
	562	nonterminal, mean. The tokens of C are identifiers, constants (numeric
	563	and string), and the various keywords, arithmetic operators and
	564	punctuation marks. So the terminal symbols of a grammar for C include
	565	`identifier', `number', `string', plus one symbol for each keyword,
	566	operator or punctuation mark: `if', `return', `const', `static', `int',
	567	`char', `plus-sign', `open-brace', `close-brace', `comma' and many more.
	568	(These tokens can be subdivided into characters, but that is a matter of
	569	lexicography, not grammar.)
	570
	571	Here is a simple C function subdivided into tokens:
	572
	573	@example
	574	int /* @r{keyword `int'} */
	575	square (int x) /* @r{identifier, open-paren, keyword `int',}
	576	@r{identifier, close-paren} */
	577	@{ /* @r{open-brace} */
	578	return x * x; /* @r{keyword `return', identifier, asterisk,}
	579	@r{identifier, semicolon} */
	580	@} /* @r{close-brace} */
	581	@end example
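
The subdivision shown above can be sketched as a small scanner in C. This is a minimal illustration under stated assumptions: the names @code{token_kind} and @code{next_token} are hypothetical, only the keywords @code{int} and @code{return} are recognized, and comments and multi-character operators are not handled.

```c
#include <ctype.h>
#include <string.h>

enum token_kind
{
  TOK_END, TOK_KEYWORD, TOK_IDENTIFIER, TOK_NUMBER, TOK_PUNCT
};

/* Hypothetical scanner: skip white space, read one token starting at
   *P, advance *P past it, and return the token's kind.  */
static enum token_kind
next_token (const char **p)
{
  const char *s = *p;
  const char *start;
  size_t len;
  enum token_kind kind;

  while (isspace ((unsigned char) *s))
    s++;
  start = s;

  if (*s == '\0')
    kind = TOK_END;
  else if (isalpha ((unsigned char) *s) || *s == '_')
    {
      /* A word: either one of the known keywords or an identifier.  */
      while (isalnum ((unsigned char) *s) || *s == '_')
        s++;
      len = (size_t) (s - start);
      if ((len == 3 && strncmp (start, "int", 3) == 0)
          || (len == 6 && strncmp (start, "return", 6) == 0))
        kind = TOK_KEYWORD;
      else
        kind = TOK_IDENTIFIER;
    }
  else if (isdigit ((unsigned char) *s))
    {
      while (isdigit ((unsigned char) *s))
        s++;
      kind = TOK_NUMBER;
    }
  else
    {
      s++;                      /* any other character is its own token */
      kind = TOK_PUNCT;
    }

  *p = s;
  return kind;
}
```

Calling @code{next_token} repeatedly on the text of the @code{square} function yields the same sequence of keywords, identifiers and punctuation marks as the comments in the example.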
	582
	583	The syntactic groupings of C include the expression, the statement, the
	584	declaration, and the function definition. These are represented in the
	585	grammar of C by nonterminal symbols `expression', `statement',
	586	`declaration' and `function definition'. The full grammar uses dozens of
	587	additional language constructs, each with its own nonterminal symbol, in
	588	order to express the meanings of these four. The example above is a
	589	function definition; it contains one declaration, and one statement. In
	590	the statement, each @samp{x} is an expression and so is @samp{x * x}.
	591
	592	Each nonterminal symbol must have grammatical rules showing how it is made
	593	out of simpler constructs. For example, one kind of C statement is the
	594	@code{return} statement; this would be described with a grammar rule which
	595	reads informally as follows:
	596
	597	@quotation
	598	A `statement' can be made of a `return' keyword, an `expression' and a
	599	`semicolon'.
	600	@end quotation
	601
	602	@noindent
	603	There would be many other rules for `statement', one for each kind of
	604	statement in C.
	605
	606	@cindex start symbol
	607	One nonterminal symbol must be distinguished as the special one which
	608	defines a complete utterance in the language. It is called the @dfn{start
	609	symbol}. In a compiler, this means a complete input program. In the C
	610	language, the nonterminal symbol `sequence of definitions and declarations'
	611	plays this role.
	612
	613	For example, @samp{1 + 2} is a valid C expression---a valid part of a C
	614	program---but it is not valid as an @emph{entire} C program. In the
	615	context-free grammar of C, this follows from the fact that `expression' is
	616	not the start symbol.
	617
	618	The Bison parser reads a sequence of tokens as its input, and groups the
	619	tokens using the grammar rules. If the input is valid, the end result is
	620	that the entire token sequence reduces to a single grouping whose symbol is
	621	the grammar's start symbol. If we use a grammar for C, the entire input
	622	must be a `sequence of definitions and declarations'. If not, the parser
	623	reports a syntax error.
	624
	625	@node Grammar in Bison
	626	@section From Formal Rules to Bison Input
	627	@cindex Bison grammar
	628	@cindex grammar, Bison
	629	@cindex formal grammar
	630
	631	A formal grammar is a mathematical construct. To define the language
	632	for Bison, you must write a file expressing the grammar in Bison syntax:
	633	a @dfn{Bison grammar} file. @xref{Grammar File, ,Bison Grammar Files}.
	634
	635	A nonterminal symbol in the formal grammar is represented in Bison input
	636	as an identifier, like an identifier in C@. By convention, it should be
	637	in lower case, such as @code{expr}, @code{stmt} or @code{declaration}.
	638
	639	The Bison representation for a terminal symbol is also called a @dfn{token
	640	type}. Token types can also be represented as C-like identifiers. By
	641	convention, these identifiers should be upper case to distinguish them from
	642	nonterminals: for example, @code{INTEGER}, @code{IDENTIFIER}, @code{IF} or
	643	@code{RETURN}. A terminal symbol that stands for a particular keyword in
	644	the language should be named after that keyword converted to upper case.
	645	The terminal symbol @code{error} is reserved for error recovery.
	646	@xref{Symbols}.
	647
	648	A terminal symbol can also be represented as a character literal, just like
	649	a C character constant. You should do this whenever a token is just a
	650	single character (parenthesis, plus-sign, etc.): use that same character in
	651	a literal as the terminal symbol for that token.
	652
	653	A third way to represent a terminal symbol is with a C string constant
	654	containing several characters. @xref{Symbols}, for more information.
	655
	656	The grammar rules also have an expression in Bison syntax. For example,
	657	here is the Bison rule for a C @code{return} statement. The semicolon in
	658	quotes is a literal character token, representing part of the C syntax for
	659	the statement; the naked semicolon, and the colon, are Bison punctuation
	660	used in every rule.
	661
	662	@example
	663	stmt: RETURN expr ';' ;
	664	@end example
	665
	666	@noindent
	667	@xref{Rules, ,Syntax of Grammar Rules}.
	668
	669	@node Semantic Values
	670	@section Semantic Values
	671	@cindex semantic value
	672	@cindex value, semantic
	673
	674	A formal grammar selects tokens only by their classifications: for example,
	675	if a rule mentions the terminal symbol `integer constant', it means that
	676	@emph{any} integer constant is grammatically valid in that position. The
	677	precise value of the constant is irrelevant to how to parse the input: if
	678	@samp{x+4} is grammatical then @samp{x+1} or @samp{x+3989} is equally
	679	grammatical.
	680
	681	But the precise value is very important for what the input means once it is
	682	parsed. A compiler is useless if it fails to distinguish between 4, 1 and
	683	3989 as constants in the program! Therefore, each token in a Bison grammar
	684	has both a token type and a @dfn{semantic value}. @xref{Semantics,
	685	,Defining Language Semantics},
	686	for details.
	687
	688	The token type is a terminal symbol defined in the grammar, such as
	689	@code{INTEGER}, @code{IDENTIFIER} or @code{','}. It tells everything
	690	you need to know to decide where the token may validly appear and how to
	691	group it with other tokens. The grammar rules know nothing about tokens
	692	except their types.
	693
	694	The semantic value has all the rest of the information about the
	695	meaning of the token, such as the value of an integer, or the name of an
	696	identifier. (A token such as @code{','} which is just punctuation doesn't
	697	need to have any semantic value.)
	698
	699	For example, an input token might be classified as token type
	700	@code{INTEGER} and have the semantic value 4. Another input token might
	701	have the same token type @code{INTEGER} but value 3989. When a grammar
	702	rule says that @code{INTEGER} is allowed, either of these tokens is
	703	acceptable because each is an @code{INTEGER}. When the parser accepts the
	704	token, it keeps track of the token's semantic value.
	705
	706	Each grouping can also have a semantic value as well as its nonterminal
	707	symbol. For example, in a calculator, an expression typically has a
	708	semantic value that is a number. In a compiler for a programming
	709	language, an expression typically has a semantic value that is a tree
	710	structure describing the meaning of the expression.
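
The split between token type and semantic value can be sketched in C as
follows. This is only an illustration: the token codes, the
@code{struct token} layout and the @code{make_token} helper are hypothetical
names chosen for this sketch and are not part of Bison's interface.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token type codes; a real parser gets these from Bison.  */
enum token_type { INTEGER = 258, IDENTIFIER = 259 };

/* A token pairs a type (all that the grammar rules see) with a
   semantic value (what the actions later use).  */
struct token
{
  enum token_type type;
  long value;                   /* meaningful only for INTEGER */
};

/* Classify TEXT, assumed to be a nonempty run of digits or a name.  */
struct token
make_token (const char *text)
{
  struct token tok;
  if (isdigit ((unsigned char) text[0]))
    {
      tok.type = INTEGER;
      tok.value = strtol (text, NULL, 10);
    }
  else
    {
      tok.type = IDENTIFIER;
      tok.value = 0;
    }
  return tok;
}
```

Under these assumptions, @code{make_token ("4")} and
@code{make_token ("3989")} yield the same type but different values.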
	711
	712	@node Semantic Actions
	713	@section Semantic Actions
	714	@cindex semantic actions
	715	@cindex actions, semantic
	716
	717	In order to be useful, a program must do more than parse input; it must
	718	also produce some output based on the input. In a Bison grammar, a grammar
	719	rule can have an @dfn{action} made up of C statements. Each time the
	720	parser recognizes a match for that rule, the action is executed.
	721	@xref{Actions}.
	722
	723	Most of the time, the purpose of an action is to compute the semantic value
	724	of the whole construct from the semantic values of its parts. For example,
	725	suppose we have a rule which says an expression can be the sum of two
	726	expressions. When the parser recognizes such a sum, each of the
	727	subexpressions has a semantic value which describes how it was built up.
	728	The action for this rule should create a similar sort of value for the
	729	newly recognized larger expression.
	730
	731	For example, here is a rule that says an expression can be the sum of
	732	two subexpressions:
	733
	734	@example
	735	expr: expr '+' expr @{ $$ = $1 + $3; @} ;
	736	@end example
	737
	738	@noindent
	739	The action says how to produce the semantic value of the sum expression
	740	from the values of the two subexpressions.
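
To see what such an action amounts to, here is a rough C sketch of the
reduction on a stack of semantic values. The @code{struct value_stack} type
and both functions are hypothetical and much simpler than the code Bison
actually generates. The sketch assumes integer values and that the caller
never pushes more than 16 of them.

```c
#include <stddef.h>

/* Hypothetical value stack; the caller must keep depth below 16.  */
struct value_stack
{
  int values[16];
  size_t depth;
};

void
push_value (struct value_stack *s, int v)
{
  s->values[s->depth++] = v;
}

/* Reduce by "expr: expr '+' expr".  The values of the three right-hand
   side symbols are the top three stack entries; they are replaced by
   the value of the new grouping.  */
void
reduce_sum (struct value_stack *s)
{
  int v1 = s->values[s->depth - 3];     /* $1 */
  int v3 = s->values[s->depth - 1];     /* $3 */
  s->depth -= 3;
  s->values[s->depth++] = v1 + v3;      /* $$ */
}
```

The middle entry corresponds to the @code{'+'} token, whose value the action
ignores.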
	741
	742	@node GLR Parsers
	743	@section Writing GLR Parsers
	744	@cindex GLR parsing
	745	@cindex generalized LR (GLR) parsing
	746	@findex %glr-parser
	747	@cindex conflicts
	748	@cindex shift/reduce conflicts
	749	@cindex reduce/reduce conflicts
	750
	751	In some grammars, Bison's deterministic
	752	LR(1) parsing algorithm cannot decide whether to apply a
	753	certain grammar rule at a given point. That is, it may not be able to
	754	decide (on the basis of the input read so far) which of two possible
	755	reductions (applications of a grammar rule) applies, or whether to apply
	756	a reduction or read more of the input and apply a reduction later in the
	757	input. These are known respectively as @dfn{reduce/reduce} conflicts
	758	(@pxref{Reduce/Reduce}), and @dfn{shift/reduce} conflicts
	759	(@pxref{Shift/Reduce}).
	760
	761	To use a grammar that is not easily modified to be LR(1), a
	762	more general parsing algorithm is sometimes necessary. If you include
	763	@code{%glr-parser} among the Bison declarations in your file
	764	(@pxref{Grammar Outline}), the result is a Generalized LR
	765	(GLR) parser. These parsers handle Bison grammars that
	766	contain no unresolved conflicts (i.e., after applying precedence
	767	declarations) identically to deterministic parsers. However, when
	768	faced with unresolved shift/reduce and reduce/reduce conflicts,
	769	GLR parsers use the simple expedient of doing both,
	770	effectively cloning the parser to follow both possibilities. Each of
	771	the resulting parsers can again split, so that at any given time, there
	772	can be any number of possible parses being explored. The parsers
	773	proceed in lockstep; that is, all of them consume (shift) a given input
	774	symbol before any of them proceed to the next. Each of the cloned
	775	parsers eventually meets one of two possible fates: either it runs into
	776	a parsing error, in which case it simply vanishes, or it merges with
	777	another parser, because the two of them have reduced the input to an
	778	identical set of symbols.
	779
	780	During the time that there are multiple parsers, semantic actions are
	781	recorded, but not performed. When a parser disappears, its recorded
	782	semantic actions disappear as well, and are never performed. When a
	783	reduction makes two parsers identical, causing them to merge, Bison
	784	records both sets of semantic actions. Whenever the last two parsers
	785	merge, reverting to the single-parser case, Bison resolves all the
	786	outstanding actions either by precedences given to the grammar rules
	787	involved, or by performing both actions, and then calling a designated
	788	user-defined function on the resulting values to produce an arbitrary
	789	merged result.
	790
	791	@menu
	792	* Simple GLR Parsers:: Using GLR parsers on unambiguous grammars.
	793	* Merging GLR Parses:: Using GLR parsers to resolve ambiguities.
	794	* GLR Semantic Actions:: Considerations for semantic values and deferred actions.
	795	* Semantic Predicates:: Controlling a parse with arbitrary computations.
	796	* Compiler Requirements:: GLR parsers require a modern C compiler.
	797	@end menu
	798
	799	@node Simple GLR Parsers
	800	@subsection Using GLR on Unambiguous Grammars
	801	@cindex GLR parsing, unambiguous grammars
	802	@cindex generalized LR (GLR) parsing, unambiguous grammars
	803	@findex %glr-parser
	804	@findex %expect-rr
	805	@cindex conflicts
	806	@cindex reduce/reduce conflicts
	807	@cindex shift/reduce conflicts
	808
	809	In the simplest cases, you can use the GLR algorithm
	810	to parse grammars that are unambiguous but fail to be LR(1).
	811	Such grammars typically require more than one symbol of lookahead.
	812
	813	Consider a problem that
	814	arises in the declaration of enumerated and subrange types in the
	815	programming language Pascal. Here are some examples:
	816
	817	@example
	818	type subrange = lo .. hi;
	819	type enum = (a, b, c);
	820	@end example
	821
	822	@noindent
	823	The original language standard allows only numeric
	824	literals and constant identifiers for the subrange bounds (@samp{lo}
	825	and @samp{hi}), but Extended Pascal (ISO/IEC
	826	10206) and many other
	827	Pascal implementations allow arbitrary expressions there. This gives
	828	rise to the following situation, containing a superfluous pair of
	829	parentheses:
	830
	831	@example
	832	type subrange = (a) .. b;
	833	@end example
	834
	835	@noindent
	836	Compare this to the following declaration of an enumerated
	837	type with only one value:
	838
	839	@example
	840	type enum = (a);
	841	@end example
	842
	843	@noindent
	844	(These declarations are contrived, but they are syntactically
	845	valid, and more-complicated cases can come up in practical programs.)
	846
	847	These two declarations look identical until the @samp{..} token.
	848	With normal LR(1) one-token lookahead it is not
	849	possible to decide between the two forms when the identifier
	850	@samp{a} is parsed. It is, however, desirable
	851	for a parser to decide this, since in the latter case
	852	@samp{a} must become a new identifier to represent the enumeration
	853	value, while in the former case @samp{a} must be evaluated with its
	854	current meaning, which may be a constant or even a function call.
	855
	856	You could parse @samp{(a)} as an ``unspecified identifier in parentheses'',
	857	to be resolved later, but this typically requires substantial
	858	contortions in both semantic actions and large parts of the
	859	grammar, where the parentheses are nested in the recursive rules for
	860	expressions.
	861
	862	You might think of using the lexer to distinguish between the two
	863	forms by returning different tokens for currently defined and
	864	undefined identifiers. But if these declarations occur in a local
	865	scope, and @samp{a} is defined in an outer scope, then both forms
	866	are possible---either locally redefining @samp{a}, or using the
	867	value of @samp{a} from the outer scope. So this approach cannot
	868	work.
	869
	870	A simple solution to this problem is to declare the parser to
	871	use the GLR algorithm.
	872	When the GLR parser reaches the critical state, it
	873	merely splits into two branches and pursues both syntax rules
	874	simultaneously. Sooner or later, one of them runs into a parsing
	875	error. If there is a @samp{..} token before the next
	876	@samp{;}, the rule for enumerated types fails since it cannot
	877	accept @samp{..} anywhere; otherwise, the subrange type rule
	878	fails since it requires a @samp{..} token. So one of the branches
	879	fails silently, and the other one continues normally, performing
	880	all the intermediate actions that were postponed during the split.
	881
	882	If the input is syntactically incorrect, both branches fail and the parser
	883	reports a syntax error as usual.
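
The behavior described above can be imitated by the following C sketch.
It is not Bison's GLR implementation: it has no parser stacks and no
deferred actions. It merely runs two hand-written recognizers in lockstep
over the tokens that follow @samp{=} and drops whichever one hits an error.
All names are hypothetical, and a subrange bound is assumed to be either
an identifier or an identifier in parentheses.

```c
#include <stddef.h>

enum tok { LPAREN, RPAREN, ID, COMMA, DOTDOT, SEMI };
enum kind { SYNTAX_ERROR, ENUM_TYPE, SUBRANGE_TYPE };

#define DEAD (-1)
#define ACCEPT 100

/* Branch 1: '(' ID (',' ID)* ')' ';'  */
int
enum_step (int state, enum tok t)
{
  switch (state)
    {
    case 0: return t == LPAREN ? 1 : DEAD;
    case 1: return t == ID ? 2 : DEAD;
    case 2: return t == COMMA ? 1 : t == RPAREN ? 3 : DEAD;
    case 3: return t == SEMI ? ACCEPT : DEAD;
    default: return DEAD;
    }
}

/* Branch 2: bound DOTDOT bound ';', where bound is ID or '(' ID ')'.
   States 0-3 handle the first bound, states 4-7 the second.  */
int
subrange_step (int state, enum tok t)
{
  switch (state)
    {
    case 0: case 4:
      return t == LPAREN ? state + 1 : t == ID ? state + 3 : DEAD;
    case 1: case 5: return t == ID ? state + 1 : DEAD;
    case 2: case 6: return t == RPAREN ? state + 1 : DEAD;
    case 3: return t == DOTDOT ? 4 : DEAD;
    case 7: return t == SEMI ? ACCEPT : DEAD;
    default: return DEAD;
    }
}

/* Feed each token to every branch that is still alive.  */
enum kind
classify (const enum tok *toks, size_t n)
{
  int e = 0, s = 0;
  size_t i;
  for (i = 0; i < n; i++)
    {
      if (e != DEAD)
        e = enum_step (e, toks[i]);
      if (s != DEAD)
        s = subrange_step (s, toks[i]);
      if (e == DEAD && s == DEAD)
        return SYNTAX_ERROR;    /* both branches failed */
    }
  if (e == ACCEPT)
    return ENUM_TYPE;
  if (s == ACCEPT)
    return SUBRANGE_TYPE;
  return SYNTAX_ERROR;
}
```

For the token sequence of @samp{(a) .. b;} the enumeration branch dies at
@samp{..} and the subrange branch accepts; for @samp{(a);} the roles are
reversed.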
	884
	885	The effect of all this is that the parser seems to ``guess'' the
	886	correct branch to take, or in other words, it seems to use more
	887	lookahead than the underlying LR(1) algorithm actually allows
	888	for. In this example, LR(2) would suffice, but also some cases
	889	that are not LR(@math{k}) for any @math{k} can be handled this way.
	890
	891	In general, a GLR parser can take quadratic or cubic worst-case time,
	892	and the current Bison parser even takes exponential time and space
	893	for some grammars. In practice, this rarely happens, and for many
	894	grammars it is possible to prove that it cannot happen.
	895	The present example contains only one conflict between two
	896	rules, and the type-declaration context containing the conflict
	897	cannot be nested. So the number of
	898	branches that can exist at any time is limited by the constant 2,
	899	and the parsing time is still linear.
	900
	901	Here is a Bison grammar corresponding to the example above. It
	902	parses a vastly simplified form of Pascal type declarations.
	903
	904	@example
	905	%token TYPE DOTDOT ID
	906
	907	@group
	908	%left '+' '-'
	909	%left '*' '/'
	910	@end group
	911
	912	%%
	913	type_decl: TYPE ID '=' type ';' ;
	914
	915	@group
	916	type:
	917	'(' id_list ')'
	918	\| expr DOTDOT expr
	919	;
	920	@end group
	921
	922	@group
	923	id_list:
	924	ID
	925	\| id_list ',' ID
	926	;
	927	@end group
	928
	929	@group
	930	expr:
	931	'(' expr ')'
	932	\| expr '+' expr
	933	\| expr '-' expr
	934	\| expr '*' expr
	935	\| expr '/' expr
	936	\| ID
	937	;
	938	@end group
	939	@end example
	940
	941	When used as a normal LR(1) grammar, Bison correctly complains
	942	about one reduce/reduce conflict. In the conflicting situation the
	943	parser chooses one of the alternatives, arbitrarily the one
	944	declared first. Therefore the following correct input is not
	945	recognized:
	946
	947	@example
	948	type t = (a) .. b;
	949	@end example
	950
	951	The parser can be turned into a GLR parser, while also telling Bison
	952	to be silent about the one known reduce/reduce conflict, by adding
	953	these two declarations to the Bison grammar file (before the first
	954	@samp{%%}):
	955
	956	@example
	957	%glr-parser
	958	%expect-rr 1
	959	@end example
	960
	961	@noindent
	962	No change in the grammar itself is required. Now the
	963	parser recognizes all valid declarations, according to the
	964	limited syntax above, transparently. In fact, the user does not even
	965	notice when the parser splits.
	966
	967	So here we have a case where we can use the benefits of GLR,
	968	almost without disadvantages. Even in simple cases like this, however,
	969	there are at least two potential problems to beware. First, always
	970	analyze the conflicts reported by Bison to make sure that GLR
	971	splitting is only done where it is intended. A GLR parser
	972	splitting inadvertently may cause problems less obvious than an
	973	LR parser statically choosing the wrong alternative in a
	974	conflict. Second, consider interactions with the lexer (@pxref{Semantic
	975	Tokens}) with great care. Since a split parser consumes tokens without
	976	performing any actions during the split, the lexer cannot obtain
	977	information via parser actions. Some cases of lexer interactions can be
	978	eliminated by using GLR to shift the complications from the
	979	lexer to the parser. You must check the remaining cases for
	980	correctness.
	981
	982	In our example, it would be safe for the lexer to return tokens based on
	983	their current meanings in some symbol table, because no new symbols are
	984	defined in the middle of a type declaration. Though it is possible for
	985	a parser to define the enumeration constants as they are parsed, before
	986	the type declaration is completed, it actually makes no difference since
	987	they cannot be used within the same enumerated type declaration.
	988
	989	@node Merging GLR Parses
	990	@subsection Using GLR to Resolve Ambiguities
	991	@cindex GLR parsing, ambiguous grammars
	992	@cindex generalized LR (GLR) parsing, ambiguous grammars
	993	@findex %dprec
	994	@findex %merge
	995	@cindex conflicts
	996	@cindex reduce/reduce conflicts
	997
	998	Let's consider an example, vastly simplified from a C++ grammar.
	999
	1000	@example
	1001	%@{
	1002	#include <stdio.h>
	1003	#define YYSTYPE char const *
	1004	int yylex (void);
	1005	void yyerror (char const *);
	1006	%@}
	1007
	1008	%token TYPENAME ID
	1009
	1010	%right '='
	1011	%left '+'
	1012
	1013	%glr-parser
	1014
	1015	%%
	1016
	1017	prog:
	1018	%empty
	1019	\| prog stmt @{ printf ("\n"); @}
	1020	;
	1021
	1022	stmt:
	1023	expr ';' %dprec 1
	1024	\| decl %dprec 2
	1025	;
	1026
	1027	expr:
	1028	ID @{ printf ("%s ", $1); @}
	1029	\| TYPENAME '(' expr ')'
	1030	@{ printf ("%s <cast> ", $1); @}
	1031	\| expr '+' expr @{ printf ("+ "); @}
	1032	\| expr '=' expr @{ printf ("= "); @}
	1033	;
	1034
	1035	decl:
	1036	TYPENAME declarator ';'
	1037	@{ printf ("%s <declare> ", $1); @}
	1038	\| TYPENAME declarator '=' expr ';'
	1039	@{ printf ("%s <init-declare> ", $1); @}
	1040	;
	1041
	1042	declarator:
	1043	ID @{ printf ("\"%s\" ", $1); @}
	1044	\| '(' declarator ')'
	1045	;
	1046	@end example
	1047
	1048	@noindent
	1049	This models a problematic part of the C++ grammar---the ambiguity between
	1050	certain declarations and statements. For example,
	1051
	1052	@example
	1053	T (x) = y+z;
	1054	@end example
	1055
	1056	@noindent
	1057	parses as either an @code{expr} or a @code{decl}
	1058	(assuming that @samp{T} is recognized as a @code{TYPENAME} and
	1059	@samp{x} as an @code{ID}).
	1060	Bison detects this as a reduce/reduce conflict between the rules
	1061	@code{expr : ID} and @code{declarator : ID}, which it cannot resolve at the
	1062	time it encounters @code{x} in the example above. Since this is a
	1063	GLR parser, it therefore splits the problem into two parses, one for
	1064	each choice of resolving the reduce/reduce conflict.
	1065	Unlike the example from the previous section (@pxref{Simple GLR Parsers}),
	1066	however, neither of these parses ``dies,'' because the grammar as it stands is
	1067	ambiguous. One of the parsers eventually reduces @code{stmt : expr ';'} and
	1068	the other reduces @code{stmt : decl}, after which both parsers are in an
	1069	identical state: they've seen @samp{prog stmt} and have the same unprocessed
	1070	input remaining. We say that these parses have @dfn{merged}.
	1071
	1072	At this point, the GLR parser requires a specification in the
	1073	grammar of how to choose between the competing parses.
	1074	In the example above, the two @code{%dprec}
	1075	declarations specify that Bison is to give precedence
	1076	to the parse that interprets the example as a
	1077	@code{decl}, which implies that @code{x} is a declarator.
	1078	The parser therefore prints
	1079
	1080	@example
	1081	"x" y z + T <init-declare>
	1082	@end example
	1083
	1084	The @code{%dprec} declarations only come into play when more than one
	1085	parse survives. Consider a different input string for this parser:
	1086
	1087	@example
	1088	T (x) + y;
	1089	@end example
	1090
	1091	@noindent
	1092	This is another example of using GLR to parse an unambiguous
	1093	construct, as shown in the previous section (@pxref{Simple GLR Parsers}).
	1094	Here, there is no ambiguity (this cannot be parsed as a declaration).
	1095	However, at the time the Bison parser encounters @code{x}, it does not
	1096	have enough information to resolve the reduce/reduce conflict (again,
	1097	between @code{x} as an @code{expr} or a @code{declarator}). In this
	1098	case, no precedence declaration is used. Again, the parser splits
	1099	into two, one assuming that @code{x} is an @code{expr}, and the other
	1100	assuming @code{x} is a @code{declarator}. The second of these parsers
	1101	then vanishes when it sees @code{+}, and the parser prints
	1102
	1103	@example
	1104	x T <cast> y +
	1105	@end example
	1106
	1107	Suppose that instead of resolving the ambiguity, you wanted to see all
	1108	the possibilities. For this purpose, you must merge the semantic
	1109	actions of the two possible parsers, rather than choosing one over the
	1110	other. To do so, you could change the declaration of @code{stmt} as
	1111	follows:
	1112
	1113	@example
	1114	stmt:
	1115	expr ';' %merge <stmtMerge>
	1116	\| decl %merge <stmtMerge>
	1117	;
	1118	@end example
	1119
	1120	@noindent
	1121	and define the @code{stmtMerge} function as:
	1122
	1123	@example
	1124	static YYSTYPE
	1125	stmtMerge (YYSTYPE x0, YYSTYPE x1)
	1126	@{
	1127	printf ("<OR> ");
	1128	return "";
	1129	@}
	1130	@end example
	1131
	1132	@noindent
	1133	with an accompanying forward declaration
	1134	in the C declarations at the beginning of the file:
	1135
	1136	@example
	1137	%@{
	1138	#define YYSTYPE char const *
	1139	static YYSTYPE stmtMerge (YYSTYPE x0, YYSTYPE x1);
	1140	%@}
	1141	@end example
	1142
	1143	@noindent
	1144	With these declarations, the resulting parser parses the first example
	1145	as both an @code{expr} and a @code{decl}, and prints
	1146
	1147	@example
	1148	"x" y z + T <init-declare> x T <cast> y z + = <OR>
	1149	@end example
	1150
	1151	Bison requires that all of the
	1152	productions that participate in any particular merge have identical
	1153	@samp{%merge} clauses. Otherwise, the ambiguity is unresolvable,
	1154	and the parser reports an error during any parse that results in
	1155	the offending merge.
	1156
	1157	@node GLR Semantic Actions
	1158	@subsection GLR Semantic Actions
	1159
	1160	The nature of GLR parsing and the structure of the generated
	1161	parsers give rise to certain restrictions on semantic values and actions.
	1162
	1163	@subsubsection Deferred semantic actions
	1164	@cindex deferred semantic actions
	1165	By definition, a deferred semantic action is not performed at the same time as
	1166	the associated reduction.
	1167	This raises caveats for several Bison features you might use in a semantic
	1168	action in a GLR parser.
	1169
	1170	@vindex yychar
	1171	@cindex GLR parsers and @code{yychar}
	1172	@vindex yylval
	1173	@cindex GLR parsers and @code{yylval}
	1174	@vindex yylloc
	1175	@cindex GLR parsers and @code{yylloc}
	1176	In any semantic action, you can examine @code{yychar} to determine the type of
	1177	the lookahead token present at the time of the associated reduction.
	1178	After checking that @code{yychar} is not set to @code{YYEMPTY} or @code{YYEOF},
	1179	you can then examine @code{yylval} and @code{yylloc} to determine the
	1180	lookahead token's semantic value and location, if any.
	1181	In a nondeferred semantic action, you can also modify any of these variables to
	1182	influence syntax analysis.
	1183	@xref{Lookahead, ,Lookahead Tokens}.
	1184
	1185	@findex yyclearin
	1186	@cindex GLR parsers and @code{yyclearin}
	1187	In a deferred semantic action, it's too late to influence syntax analysis.
	1188	In this case, @code{yychar}, @code{yylval}, and @code{yylloc} are set to
	1189	shallow copies of the values they had at the time of the associated reduction.
	1190	For this reason alone, modifying them is dangerous.
	1191	Moreover, the result of modifying them is undefined and subject to change with
	1192	future versions of Bison.
	1193	For example, if a semantic action might be deferred, you should never write it
	1194	to invoke @code{yyclearin} (@pxref{Action Features}) or to attempt to free
	1195	memory referenced by @code{yylval}.
	1196
	1197	@subsubsection YYERROR
	1198	@findex YYERROR
	1199	@cindex GLR parsers and @code{YYERROR}
	1200	Another Bison feature requiring special consideration is @code{YYERROR}
	1201	(@pxref{Action Features}), which you can invoke in a semantic action to
	1202	initiate error recovery.
	1203	During deterministic GLR operation, the effect of @code{YYERROR} is
	1204	the same as its effect in a deterministic parser.
	1205	The effect in a deferred action is similar, but the precise point of the
	1206	error is undefined; instead, the parser reverts to deterministic operation,
	1207	selecting an unspecified stack on which to continue with a syntax error.
	1208	In a semantic predicate (see @ref{Semantic Predicates}) during nondeterministic
	1209	parsing, @code{YYERROR} silently prunes
	1210	the parse that invoked the test.
	1211
	1212	@subsubsection Restrictions on semantic values and locations
	1213	GLR parsers require that you use POD (Plain Old Data) types for
	1214	semantic values and location types when using the generated parsers as
	1215	C++ code.
	1216
	1217	@node Semantic Predicates
	1218	@subsection Controlling a Parse with Arbitrary Predicates
	1219	@findex %?
	1220	@cindex Semantic predicates in GLR parsers
	1221
	1222	In addition to the @code{%dprec} and @code{%merge} directives,
	1223	GLR parsers
	1224	allow you to reject parses on the basis of arbitrary computations executed
	1225	in user code, without having Bison treat this rejection as an error
	1226	if there are alternative parses. (This feature is experimental and may
	1227	evolve. We welcome user feedback.) For example,
	1228
	1229	@example
	1230	widget:
	1231	%?@{ new_syntax @} "widget" id new_args @{ $$ = f($3, $4); @}
	1232	\| %?@{ !new_syntax @} "widget" id old_args @{ $$ = f($3, $4); @}
	1233	;
	1234	@end example
	1235
	1236	@noindent
	1237	is one way to allow the same parser to handle two different syntaxes for
	1238	widgets. The clause preceded by @code{%?} is treated like an ordinary
	1239	action, except that its text is treated as an expression and is always
	1240	evaluated immediately (even when in nondeterministic mode). If the
	1241	expression yields 0 (false), the clause is treated as a syntax error,
	1242	which, in a nondeterministic parser, causes the stack in which it is reduced
	1243	to die. In a deterministic parser, it acts like @code{YYERROR}.
	1244
	1245	As the example shows, predicates otherwise look like semantic actions, and
	1246	therefore you must take them into account when determining the numbers
	1247	to use for denoting the semantic values of right-hand side symbols.
	1248	Predicate actions, however, have no defined value, and may not be given
	1249	labels.
	1250
	1251	There is a subtle difference between semantic predicates and ordinary
	1252	actions in nondeterministic mode, since the latter are deferred.
	1253	For example, we could try to rewrite the previous example as
	1254
	1255	@example
	1256	widget:
	1257	@{ if (!new_syntax) YYERROR; @}
	1258	"widget" id new_args @{ $$ = f($3, $4); @}
	1259	\| @{ if (new_syntax) YYERROR; @}
	1260	"widget" id old_args @{ $$ = f($3, $4); @}
	1261	;
	1262	@end example
	1263
	1264	@noindent
	1265	(reversing the sense of the predicate tests to cause an error when they are
	1266	false). However, this
	1267	does @emph{not} have the same effect if @code{new_args} and @code{old_args}
	1268	have overlapping syntax.
	1269	Since the mid-rule actions testing @code{new_syntax} are deferred,
	1270	a GLR parser first encounters the unresolved ambiguous reduction
	1271	for cases where @code{new_args} and @code{old_args} recognize the same string
	1272	@emph{before} performing the tests of @code{new_syntax}. It therefore
	1273	reports an error.
	1274
	1275	Finally, be careful in writing predicates: deferred actions have not been
	1276	evaluated, so a predicate that uses their results has undefined effects.
	1277
	1278	@node Compiler Requirements
	1279	@subsection Considerations when Compiling GLR Parsers
	1280	@cindex @code{inline}
	1281	@cindex GLR parsers and @code{inline}
	1282
	1283	The GLR parsers require a compiler for ISO C89 or
	1284	later. In addition, they use the @code{inline} keyword, which is not
	1285	C89, but is C99 and is a common extension in pre-C99 compilers. It is
	1286	up to the user of these parsers to handle
	1287	portability issues. For instance, if using Autoconf and the Autoconf
	1288	macro @code{AC_C_INLINE}, a mere
	1289
	1290	@example
	1291	%@{
	1292	#include <config.h>
	1293	%@}
	1294	@end example
	1295
	1296	@noindent
	1297	will suffice. Otherwise, we suggest
	1298
	1299	@example
	1300	%@{
	1301	#if (__STDC_VERSION__ < 199901 && ! defined __GNUC__ \
	1302	&& ! defined inline)
	1303	# define inline
	1304	#endif
	1305	%@}
	1306	@end example
	1307
	1308	@node Locations
	1309	@section Locations
	1310	@cindex location
	1311	@cindex textual location
	1312	@cindex location, textual
	1313
	1314	Many applications, like interpreters or compilers, have to produce verbose
	1315	and useful error messages. To achieve this, one must be able to keep track of
	1316	the @dfn{textual location}, or @dfn{location}, of each syntactic construct.
	1317	Bison provides a mechanism for handling these locations.
	1318
	1319	Each token has a semantic value. In a similar fashion, each token has an
	1320	associated location, but the type of locations is the same for all tokens
	1321	and groupings. Moreover, the output parser is equipped with a default data
	1322	structure for storing locations (@pxref{Tracking Locations}, for more
	1323	details).
	1324
	1325	Like semantic values, locations can be reached in actions using a dedicated
	1326	set of constructs. In the sum rule shown earlier (@pxref{Semantic Actions}), the location of the whole grouping
	1327	is @code{@@$}, while the locations of the subexpressions are @code{@@1} and
	1328	@code{@@3}.
	1329
	1330	When a rule is matched, a default action is used to compute the semantic value
	1331	of its left hand side (@pxref{Actions}). In the same way, another default
	1332	action is used for locations. However, the action for locations is general
	1333	enough for most cases, meaning there is usually no need to describe for each
	1334	rule how @code{@@$} should be formed. When building a new location for a given
	1335	grouping, the default behavior of the output parser is to take the beginning
	1336	of the first symbol, and the end of the last symbol.
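
The default behavior just described can be sketched in C as follows. The
four fields are those of Bison's default location type; the
@code{struct location} name and the @code{span_location} helper are
hypothetical, and the sketch assumes a rule with at least one right-hand
side symbol.

```c
/* Same four fields as Bison's default location type.  */
struct location
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
};

/* Location of a grouping built from the N >= 1 symbols whose
   locations are RHS[0] through RHS[N - 1].  */
struct location
span_location (const struct location *rhs, int n)
{
  struct location res;
  res.first_line = rhs[0].first_line;       /* start of first symbol */
  res.first_column = rhs[0].first_column;
  res.last_line = rhs[n - 1].last_line;     /* end of last symbol */
  res.last_column = rhs[n - 1].last_column;
  return res;
}
```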
	1337
	1338	@node Bison Parser
	1339	@section Bison Output: the Parser Implementation File
	1340	@cindex Bison parser
	1341	@cindex Bison utility
	1342	@cindex lexical analyzer, purpose
	1343	@cindex parser
	1344
	1345	When you run Bison, you give it a Bison grammar file as input. The
	1346	most important output is a C source file that implements a parser for
	1347	the language described by the grammar. This parser is called a
	1348	@dfn{Bison parser}, and this file is called a @dfn{Bison parser
	1349	implementation file}. Keep in mind that the Bison utility and the
	1350	Bison parser are two distinct programs: the Bison utility is a program
	1351	whose output is the Bison parser implementation file that becomes part
	1352	of your program.
	1353
	1354	The job of the Bison parser is to group tokens into groupings according to
	1355	the grammar rules---for example, to build identifiers and operators into
	1356	expressions. As it does this, it runs the actions for the grammar rules it
	1357	uses.
	1358
	1359	The tokens come from a function called the @dfn{lexical analyzer} that
	1360	you must supply in some fashion (such as by writing it in C). The Bison
	1361	parser calls the lexical analyzer each time it wants a new token. It
	1362	doesn't know what is ``inside'' the tokens (though their semantic values
	1363	may reflect this). Typically the lexical analyzer makes the tokens by
	1364	parsing characters of text, but Bison does not depend on this.
	1365	@xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}.
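
As a rough idea of what such a function looks like, here is a minimal
hand-written sketch. It assumes a grammar with a single named token
@code{NUM} whose semantic value is an @code{int}. The token code 258, the
@code{scan_input} variable and the local definition of @code{yylval} are
assumptions made so that the sketch stands alone; in a real program the
token code and @code{yylval} come from the Bison output.

```c
#include <ctype.h>

/* Hypothetical token code; normally generated from "%token NUM".  */
enum { NUM = 258 };

/* Normally defined by the generated parser; defined here only so
   that the sketch is self-contained.  */
int yylval;

/* Hypothetical input source: the text still to be scanned.  */
const char *scan_input = "";

int
yylex (void)
{
  /* Skip blanks between tokens.  */
  while (*scan_input == ' ' || *scan_input == '\t')
    scan_input++;

  /* Zero tells the parser that the input has ended.  */
  if (*scan_input == '\0')
    return 0;

  /* A run of digits is a NUM; its semantic value goes in yylval.  */
  if (isdigit ((unsigned char) *scan_input))
    {
      yylval = 0;
      while (isdigit ((unsigned char) *scan_input))
        yylval = yylval * 10 + (*scan_input++ - '0');
      return NUM;
    }

  /* Any other character is a token whose type is its own code.  */
  return (unsigned char) *scan_input++;
}
```

With @code{scan_input} pointing at @samp{42 +}, successive calls return
@code{NUM} (with @code{yylval} set to 42), then @code{'+'}, then 0.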
	1366
	1367	The Bison parser implementation file is C code which defines a
	1368	function named @code{yyparse} which implements that grammar. This
	1369	function does not make a complete C program: you must supply some
	1370	additional functions. One is the lexical analyzer. Another is an
	1371	error-reporting function which the parser calls to report an error.
	1372	In addition, a complete C program must start with a function called
	1373	@code{main}; you have to provide this, and arrange for it to call
	1374	@code{yyparse} or the parser will never run. @xref{Interface, ,Parser
	1375	C-Language Interface}.
	1376
	1377	Aside from the token type names and the symbols in the actions you
	1378	write, all symbols defined in the Bison parser implementation file
	1379	itself begin with @samp{yy} or @samp{YY}. This includes interface
	1380	functions such as the lexical analyzer function @code{yylex}, the
	1381	error reporting function @code{yyerror} and the parser function
	1382	@code{yyparse} itself. This also includes numerous identifiers used
	1383	for internal purposes. Therefore, you should avoid using C
	1384	identifiers starting with @samp{yy} or @samp{YY} in the Bison grammar
	1385	file except for the ones defined in this manual. Also, you should
	1386	avoid using the C identifiers @samp{malloc} and @samp{free} for
	1387	anything other than their usual meanings.
	1388
	1389	In some cases the Bison parser implementation file includes system
	1390	headers, and in those cases your code should respect the identifiers
	1391	reserved by those headers. On some non-GNU hosts, @code{<alloca.h>},
	1392	@code{<malloc.h>}, @code{<stddef.h>}, and @code{<stdlib.h>} are
	1393	included as needed to declare memory allocators and related types.
	1394	@code{<libintl.h>} is included if message translation is in use
	1395	(@pxref{Internationalization}). Other system headers may be included
	1396	if you define @code{YYDEBUG} to a nonzero value (@pxref{Tracing,
	1397	,Tracing Your Parser}).
	1398
	1399	@node Stages
	1400	@section Stages in Using Bison
	1401	@cindex stages in using Bison
	1402	@cindex using Bison
	1403
	1404	The actual language-design process using Bison, from grammar specification
	1405	to a working compiler or interpreter, has these parts:
	1406
	1407	@enumerate
	1408	@item
	1409	Formally specify the grammar in a form recognized by Bison
	1410	(@pxref{Grammar File, ,Bison Grammar Files}). For each grammatical rule
	1411	in the language, describe the action that is to be taken when an
	1412	instance of that rule is recognized. The action is described by a
	1413	sequence of C statements.
	1414
	1415	@item
	1416	Write a lexical analyzer to process input and pass tokens to the parser.
	1417	The lexical analyzer may be written by hand in C (@pxref{Lexical, ,The
	1418	Lexical Analyzer Function @code{yylex}}). It could also be produced
	1419	using Lex, but the use of Lex is not discussed in this manual.
	1420
	1421	@item
	1422	Write a controlling function that calls the Bison-produced parser.
	1423
	1424	@item
	1425	Write error-reporting routines.
	1426	@end enumerate
	1427
	1428	To turn this source code as written into a runnable program, you
	1429	must follow these steps:
	1430
	1431	@enumerate
	1432	@item
	1433	Run Bison on the grammar to produce the parser.
	1434
	1435	@item
	1436	Compile the code output by Bison, as well as any other source files.
	1437
	1438	@item
	1439	Link the object files to produce the finished product.
	1440	@end enumerate
	1441
	1442	@node Grammar Layout
	1443	@section The Overall Layout of a Bison Grammar
	1444	@cindex grammar file
	1445	@cindex file format
	1446	@cindex format of grammar file
	1447	@cindex layout of Bison grammar
	1448
	1449	The input file for the Bison utility is a @dfn{Bison grammar file}. The
	1450	general form of a Bison grammar file is as follows:
	1451
	1452	@example
	1453	%@{
	1454	@var{Prologue}
	1455	%@}
	1456
	1457	@var{Bison declarations}
	1458
	1459	%%
	1460	@var{Grammar rules}
	1461	%%
	1462	@var{Epilogue}
	1463	@end example
	1464
	1465	@noindent
	1466	The @samp{%%}, @samp{%@{} and @samp{%@}} are punctuation that appears
	1467	in every Bison grammar file to separate the sections.
	1468
	1469	The prologue may define types and variables used in the actions. You can
	1470	also use preprocessor commands to define macros used there, and use
	1471	@code{#include} to include header files that do any of these things.
	1472	You need to declare the lexical analyzer @code{yylex} and the error
	1473	printer @code{yyerror} here, along with any other global identifiers
	1474	used by the actions in the grammar rules.
	1475
	1476	The Bison declarations declare the names of the terminal and nonterminal
	1477	symbols, and may also describe operator precedence and the data types of
	1478	semantic values of various symbols.
	1479
	1480	The grammar rules define how to construct each nonterminal symbol from its
	1481	parts.
	1482
	1483	The epilogue can contain any code you want to use. Often the
	1484	definitions of functions declared in the prologue go here. In a
	1485	simple program, all the rest of the program can go here.
	1486
	1487	@node Examples
	1488	@chapter Examples
	1489	@cindex simple examples
	1490	@cindex examples, simple
	1491
	1492	Now we show and explain several sample programs written using Bison: a
	1493	reverse polish notation calculator, an algebraic (infix) notation
	1494	calculator --- later extended to track ``locations'' ---
	1495	and a multi-function calculator. All
	1496	produce usable, though limited, interactive desk-top calculators.
	1497
	1498	These examples are simple, but Bison grammars for real programming
	1499	languages are written the same way. You can copy these examples into a
	1500	source file to try them.
	1501
	1502	@menu
	1503	* RPN Calc:: Reverse polish notation calculator;
	1504	a first example with no operator precedence.
	1505	* Infix Calc:: Infix (algebraic) notation calculator.
	1506	Operator precedence is introduced.
	1507	* Simple Error Recovery:: Continuing after syntax errors.
	1508	* Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$.
	1509	* Multi-function Calc:: Calculator with memory and trig functions.
	1510	It uses multiple data-types for semantic values.
	1511	* Exercises:: Ideas for improving the multi-function calculator.
	1512	@end menu
	1513
	1514	@node RPN Calc
	1515	@section Reverse Polish Notation Calculator
	1516	@cindex reverse polish notation
	1517	@cindex polish notation calculator
	1518	@cindex @code{rpcalc}
	1519	@cindex calculator, simple
	1520
	1521	The first example is that of a simple double-precision @dfn{reverse polish
	1522	notation} calculator (a calculator using postfix operators). This example
	1523	provides a good starting point, since operator precedence is not an issue.
	1524	The second example will illustrate how operator precedence is handled.
	1525
	1526	The source code for this calculator is named @file{rpcalc.y}. The
	1527	@samp{.y} extension is a convention used for Bison grammar files.
	1528
	1529	@menu
	1530	* Rpcalc Declarations:: Prologue (declarations) for rpcalc.
	1531	* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation.
	1532	* Rpcalc Lexer:: The lexical analyzer.
	1533	* Rpcalc Main:: The controlling function.
	1534	* Rpcalc Error:: The error reporting function.
	1535	* Rpcalc Generate:: Running Bison on the grammar file.
	1536	* Rpcalc Compile:: Run the C compiler on the output code.
	1537	@end menu
	1538
	1539	@node Rpcalc Declarations
	1540	@subsection Declarations for @code{rpcalc}
	1541
	1542	Here are the C and Bison declarations for the reverse polish notation
	1543	calculator. As in C, comments are placed between @samp{/*@dots{}*/}.
	1544
	1545	@comment file: rpcalc.y
	1546	@example
	1547	/* Reverse polish notation calculator. */
	1548
	1549	@group
	1550	%@{
	1551	#include <stdio.h>
	1552	#include <math.h>
	1553	int yylex (void);
	1554	void yyerror (char const *);
	1555	%@}
	1556	@end group
	1557
	1558	%define api.value.type @{double@}
	1559	%token NUM
	1560
	1561	%% /* Grammar rules and actions follow. */
	1562	@end example
	1563
	1564	The declarations section (@pxref{Prologue, , The prologue}) contains two
	1565	preprocessor directives and two forward declarations.
	1566
	1567	The @code{#include} directives declare the library functions used by the
	1568	actions: @code{printf} (@file{stdio.h}) and the exponentiation function @code{pow} (@file{math.h}).
	1569
	1570	The forward declarations for @code{yylex} and @code{yyerror} are
	1571	needed because the C language requires that functions be declared
	1572	before they are used. These functions will be defined in the
	1573	epilogue, but the parser calls them so they must be declared in the
	1574	prologue.
	1575
	1576	The second section, Bison declarations, provides information to Bison about
	1577	the tokens and their types (@pxref{Bison Declarations, ,The Bison
	1578	Declarations Section}).
	1579
	1580	The @code{%define} directive defines the variable @code{api.value.type},
	1581	thus specifying the C data type for semantic values of both tokens and
	1582	groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The Bison
	1583	parser will use whatever type @code{api.value.type} is defined as; if you
	1584	don't define it, @code{int} is the default. Because we specify
	1585	@samp{@{double@}}, each token and each expression has an associated value,
	1586	which is a floating point number. C code can use @code{YYSTYPE} to refer to
	1587	the value @code{api.value.type}.
	1588
	1589	Each terminal symbol that is not a single-character literal must be
	1590	declared. (Single-character literals normally don't need to be declared.)
	1591	In this example, all the arithmetic operators are designated by
	1592	single-character literals, so the only terminal symbol that needs to be
	1593	declared is @code{NUM}, the token type for numeric constants.
	1594
	1595	@node Rpcalc Rules
	1596	@subsection Grammar Rules for @code{rpcalc}
	1597
	1598	Here are the grammar rules for the reverse polish notation calculator.
	1599
	1600	@comment file: rpcalc.y
	1601	@example
	1602	@group
	1603	input:
	1604	  %empty
	1605	| input line
	1606	;
	1607	@end group
	1608
	1609	@group
	1610	line:
	1611	  '\n'
	1612	| exp '\n'      @{ printf ("%.10g\n", $1); @}
	1613	;
	1614	@end group
	1615
	1616	@group
	1617	exp:
	1618	  NUM           @{ $$ = $1;           @}
	1619	| exp exp '+'   @{ $$ = $1 + $2;      @}
	1620	| exp exp '-'   @{ $$ = $1 - $2;      @}
	1621	| exp exp '*'   @{ $$ = $1 * $2;      @}
	1622	| exp exp '/'   @{ $$ = $1 / $2;      @}
	1623	| exp exp '^'   @{ $$ = pow ($1, $2); @}  /* Exponentiation */
	1624	| exp 'n'       @{ $$ = -$1;          @}  /* Unary minus */
	1625	;
	1626	@end group
	1627	%%
	1628	@end example
	1629
	1630	The groupings of the rpcalc ``language'' defined here are the expression
	1631	(given the name @code{exp}), the line of input (@code{line}), and the
	1632	complete input transcript (@code{input}). Each of these nonterminal
	1633	symbols has several alternate rules, joined by the vertical bar @samp{|}
	1634	which is read as ``or''. The following sections explain what these rules
	1635	mean.
	1636
	1637	The semantics of the language is determined by the actions taken when a
	1638	grouping is recognized. The actions are the C code that appears inside
	1639	braces. @xref{Actions}.
	1640
	1641	You must specify these actions in C, but Bison provides the means for
	1642	passing semantic values between the rules. In each action, the
	1643	pseudo-variable @code{$$} stands for the semantic value for the grouping
	1644	that the rule is going to construct. Assigning a value to @code{$$} is the
	1645	main job of most actions. The semantic values of the components of the
	1646	rule are referred to as @code{$1}, @code{$2}, and so on.
	1647
	1648	@menu
	1649	* Rpcalc Input:: Explanation of the @code{input} nonterminal
	1650	* Rpcalc Line:: Explanation of the @code{line} nonterminal
	1651	* Rpcalc Expr:: Explanation of the @code{exp} nonterminal
	1652	@end menu
	1653
	1654	@node Rpcalc Input
	1655	@subsubsection Explanation of @code{input}
	1656
	1657	Consider the definition of @code{input}:
	1658
	1659	@example
	1660	input:
	1661	  %empty
	1662	| input line
	1663	;
	1664	@end example
	1665
	1666	This definition reads as follows: ``A complete input is either an empty
	1667	string, or a complete input followed by an input line''. Notice that
	1668	``complete input'' is defined in terms of itself. This definition is said
	1669	to be @dfn{left recursive} since @code{input} appears always as the
	1670	leftmost symbol in the sequence. @xref{Recursion, ,Recursive Rules}.
	1671
	1672	The first alternative is empty because there are no symbols between the
	1673	colon and the first @samp{|}; this means that @code{input} can match an
	1674	empty string of input (no tokens). We write the rules this way because it
	1675	is legitimate to type @kbd{Ctrl-d} right after you start the calculator.
	1676	It's conventional to put an empty alternative first and to use the
	1677	(optional) @code{%empty} directive, or to write the comment @samp{/* empty
	1678	*/} in it (@pxref{Empty Rules}).
	1679
	1680	The second alternate rule (@code{input line}) handles all nontrivial input.
	1681	It means, ``After reading any number of lines, read one more line if
	1682	possible.'' The left recursion makes this rule into a loop. Since the
	1683	first alternative matches empty input, the loop can be executed zero or
	1684	more times.
	1685
	1686	The parser function @code{yyparse} continues to process input until a
	1687	grammatical error is seen or the lexical analyzer says there are no more
	1688	input tokens; we will arrange for the latter to happen at end-of-input.
	1689
	1690	@node Rpcalc Line
	1691	@subsubsection Explanation of @code{line}
	1692
	1693	Now consider the definition of @code{line}:
	1694
	1695	@example
	1696	line:
	1697	  '\n'
	1698	| exp '\n'      @{ printf ("%.10g\n", $1); @}
	1699	;
	1700	@end example
	1701
	1702	The first alternative is a token which is a newline character; this means
	1703	that rpcalc accepts a blank line (and ignores it, since there is no
	1704	action). The second alternative is an expression followed by a newline.
	1705	This is the alternative that makes rpcalc useful. The semantic value of
	1706	the @code{exp} grouping is the value of @code{$1} because the @code{exp} in
	1707	question is the first symbol in the alternative. The action prints this
	1708	value, which is the result of the computation the user asked for.
	1709
	1710	This action is unusual because it does not assign a value to @code{$$}. As
	1711	a consequence, the semantic value associated with the @code{line} is
	1712	uninitialized (its value will be unpredictable). This would be a bug if
	1713	that value were ever used, but we don't use it: once rpcalc has printed the
	1714	value of the user's input line, that value is no longer needed.
	1715
	1716	@node Rpcalc Expr
	1717	@subsubsection Explanation of @code{exp}
	1718
	1719	The @code{exp} grouping has several rules, one for each kind of expression.
	1720	The first rule handles the simplest expressions: those that are just numbers.
	1721	The second handles an addition-expression, which looks like two expressions
	1722	followed by a plus-sign. The third handles subtraction, and so on.
	1723
	1724	@example
	1725	exp:
	1726	  NUM
	1727	| exp exp '+'    @{ $$ = $1 + $2; @}
	1728	| exp exp '-'    @{ $$ = $1 - $2; @}
	1729	@dots{}
	1730	;
	1731	@end example
	1732
	1733	We have used @samp{|} to join all the rules for @code{exp}, but we could
	1734	equally well have written them separately:
	1735
	1736	@example
	1737	exp: NUM ;
	1738	exp: exp exp '+' @{ $$ = $1 + $2; @};
	1739	exp: exp exp '-' @{ $$ = $1 - $2; @};
	1740	@dots{}
	1741	@end example
	1742
	1743	Most of the rules have actions that compute the value of the expression in
	1744	terms of the value of its parts. For example, in the rule for addition,
	1745	@code{$1} refers to the first component @code{exp} and @code{$2} refers to
	1746	the second one. The third component, @code{'+'}, has no meaningful
	1747	associated semantic value, but if it had one you could refer to it as
	1748	@code{$3}. When @code{yyparse} recognizes a sum expression using this
	1749	rule, the sum of the two subexpressions' values is produced as the value of
	1750	the entire expression. @xref{Actions}.
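
The effect of these actions can be imitated by hand with an explicit
stack of values. The sketch below is not what Bison generates; it is a
hypothetical illustration (the name @code{rpn_eval}, the constant
@code{RPN_STACK_MAX} and the fixed stack depth are assumptions made here)
of how each operator consumes the values of its operands, in the same
order as @code{$1} and @code{$2}, and produces one result. Exponentiation
is left out so that the sketch does not depend on the math library.

```c
#include <ctype.h>
#include <stdlib.h>

enum { RPN_STACK_MAX = 64 };   /* arbitrary depth chosen for this sketch */

/* Evaluate the reverse polish expression in TEXT (up to the end of the
   string or the first newline).  Store the value in *RESULT and return 1
   on success; return 0 on malformed input.  */
int
rpn_eval (const char *text, double *result)
{
  double stack[RPN_STACK_MAX];
  int depth = 0;
  const char *p = text;

  while (*p != '\0' && *p != '\n')
    {
      if (*p == ' ' || *p == '\t')
        ++p;
      else if (*p == '.' || isdigit ((unsigned char) *p))
        {
          /* A number: push its value, as the rule "exp: NUM" does.  */
          char *end;
          double v = strtod (p, &end);
          if (end == p || depth == RPN_STACK_MAX)
            return 0;
          stack[depth++] = v;
          p = end;
        }
      else if (*p == 'n')
        {
          /* Unary minus: one operand, like "exp: exp 'n'".  */
          if (depth < 1)
            return 0;
          stack[depth - 1] = -stack[depth - 1];
          ++p;
        }
      else
        {
          /* Binary operator: two operands, in the order $1, $2.  */
          double lhs, rhs;
          if (depth < 2)
            return 0;
          rhs = stack[--depth];   /* plays the role of $2 */
          lhs = stack[--depth];   /* plays the role of $1 */
          switch (*p)
            {
            case '+': lhs += rhs; break;
            case '-': lhs -= rhs; break;
            case '*': lhs *= rhs; break;
            case '/': lhs /= rhs; break;
            default: return 0;    /* unknown character */
            }
          stack[depth++] = lhs;
          ++p;
        }
    }

  /* A well-formed line leaves exactly one value: the result.  */
  if (depth != 1)
    return 0;
  *result = stack[0];
  return 1;
}
```

Calling @code{rpn_eval ("4 9 +", &r)} stores 13 in @code{r}. The parser
built from the grammar does the same bookkeeping, except that Bison
maintains the value stack and decides when each rule applies.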
	1751
	1752	You don't have to give an action for every rule. When a rule has no
	1753	action, Bison by default copies the value of @code{$1} into @code{$$}.
	1754	This is what happens in the first rule (the one that uses @code{NUM}).
	1755
	1756	The formatting shown here is the recommended convention, but Bison does
	1757	not require it. You can add or change white space as much as you wish.
	1758	For example, this:
	1759
	1760	@example
	1761	exp: NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{} ;
	1762	@end example
	1763
	1764	@noindent
	1765	means the same thing as this:
	1766
	1767	@example
	1768	exp:
	1769	  NUM
	1770	| exp exp '+'    @{ $$ = $1 + $2; @}
	1771	| @dots{}
	1772	;
	1773	@end example
	1774
	1775	@noindent
	1776	The latter, however, is much more readable.
	1777
	1778	@node Rpcalc Lexer
	1779	@subsection The @code{rpcalc} Lexical Analyzer
	1780	@cindex writing a lexical analyzer
	1781	@cindex lexical analyzer, writing
	1782
	1783	The lexical analyzer's job is low-level parsing: converting characters
	1784	or sequences of characters into tokens. The Bison parser gets its
	1785	tokens by calling the lexical analyzer. @xref{Lexical, ,The Lexical
	1786	Analyzer Function @code{yylex}}.
	1787
	1788	Only a simple lexical analyzer is needed for the RPN
	1789	calculator. This
	1790	lexical analyzer skips blanks and tabs, then reads in numbers as
	1791	@code{double} and returns them as @code{NUM} tokens. Any other character
	1792	that isn't part of a number is a separate token. Note that the token-code
	1793	for such a single-character token is the character itself.
	1794
	1795	The return value of the lexical analyzer function is a numeric code which
	1796	represents a token type. The same text used in Bison rules to stand for
	1797	this token type is also a C expression for the numeric code for the type.
	1798	This works in two ways. If the token type is a character literal, then its
	1799	numeric code is that of the character; you can use the same
	1800	character literal in the lexical analyzer to express the number. If the
	1801	token type is an identifier, that identifier is defined by Bison as a C
	1802	macro whose definition is the appropriate number. In this example,
	1803	therefore, @code{NUM} becomes a macro for @code{yylex} to use.
	1804
	1805	The semantic value of the token (if it has one) is stored into the
	1806	global variable @code{yylval}, which is where the Bison parser will look
	1807	for it. (The C data type of @code{yylval} is @code{YYSTYPE}, whose value
	1808	was defined at the beginning of the grammar via @samp{%define api.value.type
	1809	@{double@}}; @pxref{Rpcalc Declarations,,Declarations for @code{rpcalc}}.)
	1810
	1811	A token type code of zero is returned if the end-of-input is encountered.
	1812	(Bison recognizes any nonpositive value as indicating end-of-input.)
	1813
	1814	Here is the code for the lexical analyzer:
	1815
	1816	@comment file: rpcalc.y
	1817	@example
	1818	@group
	1819	/* The lexical analyzer returns a double floating point
	1820	number on the stack and the token NUM, or the numeric code
	1821	of the character read if not a number. It skips all blanks
	1822	and tabs, and returns 0 for end-of-input. */
	1823
	1824	#include <ctype.h>
	1825	@end group
	1826
	1827	@group
	1828	int
	1829	yylex (void)
	1830	@{
	1831	  int c;
	1832
	1833	  /* Skip white space. */
	1834	  while ((c = getchar ()) == ' ' || c == '\t')
	1835	    continue;
	1836	@end group
	1837	@group
	1838	  /* Process numbers. */
	1839	  if (c == '.' || isdigit (c))
	1840	    @{
	1841	      ungetc (c, stdin);
	1842	      scanf ("%lf", &yylval);
	1843	      return NUM;
	1844	    @}
	1845	@end group
	1846	@group
	1847	  /* Return end-of-input. */
	1848	  if (c == EOF)
	1849	    return 0;
	1850	  /* Return a single char. */
	1851	  return c;
	1852	@}
	1853	@end group
	1854	@end example
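
The same tokenizing logic can be tried out without a parser. The
following is a minimal sketch, not part of @file{rpcalc.y}: it reads from
a string instead of @code{stdin} and uses @code{strtod} in place of
@code{scanf}. The names @code{lex_from_string}, @code{sketch_lval} and
@code{SKETCH_NUM}, and the value 258, are assumptions made for
illustration; in a real parser Bison chooses the code for @code{NUM} and
provides @code{yylval}.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical stand-ins for what Bison would provide.  */
enum { SKETCH_NUM = 258 };   /* assumed code for the NUM token */
double sketch_lval;          /* plays the role of yylval */

/* Return the code of the next token in *INPUT and advance *INPUT past
   it.  Return 0 at the end of the string.  */
int
lex_from_string (const char **input)
{
  const char *p = *input;

  /* Skip white space.  */
  while (*p == ' ' || *p == '\t')
    ++p;

  /* End of input.  */
  if (*p == '\0')
    {
      *input = p;
      return 0;
    }

  /* Numbers: let strtod find the end of the literal.  */
  if (*p == '.' || isdigit ((unsigned char) *p))
    {
      char *end;
      double v = strtod (p, &end);
      if (end != p)
        {
          sketch_lval = v;
          *input = end;
          return SKETCH_NUM;
        }
    }

  /* Any other character is its own token code.  */
  *input = p + 1;
  return (unsigned char) *p;
}
```

For the input @samp{4 9 +} followed by a newline, successive calls return
@code{SKETCH_NUM} (with 4 in @code{sketch_lval}), @code{SKETCH_NUM}
(with 9), @code{'+'}, @code{'\n'} and finally 0.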
	1855
	1856	@node Rpcalc Main
	1857	@subsection The Controlling Function
	1858	@cindex controlling function
	1859	@cindex main function in simple example
	1860
	1861	In keeping with the spirit of this example, the controlling function is
	1862	kept to the bare minimum. The only requirement is that it call
	1863	@code{yyparse} to start the process of parsing.
	1864
	1865	@comment file: rpcalc.y
	1866	@example
	1867	@group
	1868	int
	1869	main (void)
	1870	@{
	1871	  return yyparse ();
	1872	@}
	1873	@end group
	1874	@end example
	1875
	1876	@node Rpcalc Error
	1877	@subsection The Error Reporting Routine
	1878	@cindex error reporting routine
	1879
	1880	When @code{yyparse} detects a syntax error, it calls the error reporting
	1881	function @code{yyerror} to print an error message (usually but not
	1882	always @code{"syntax error"}). It is up to the programmer to supply
	1883	@code{yyerror} (@pxref{Interface, ,Parser C-Language Interface}), so
	1884	here is the definition we will use:
	1885
	1886	@comment file: rpcalc.y
	1887	@example
	1888	#include <stdio.h>
	1889
	1890	@group
	1891	/* Called by yyparse on error. */
	1892	void
	1893	yyerror (char const *s)
	1894	@{
	1895	  fprintf (stderr, "%s\n", s);
	1896	@}
	1897	@end group
	1898	@end example
	1899
	1900	After @code{yyerror} returns, the Bison parser may recover from the error
	1901	and continue parsing if the grammar contains a suitable error rule
	1902	(@pxref{Error Recovery}). Otherwise, @code{yyparse} returns nonzero. We
	1903	have not written any error rules in this example, so any invalid input will
	1904	cause the calculator program to exit. This is not clean behavior for a
	1905	real calculator, but it is adequate for the first example.
	1906
	1907	@node Rpcalc Generate
	1908	@subsection Running Bison to Make the Parser
	1909	@cindex running Bison (introduction)
	1910
	1911	Before running Bison to produce a parser, we need to decide how to
	1912	arrange all the source code in one or more source files. For such a
	1913	simple example, the easiest thing is to put everything in one file,
	1914	the grammar file. The definitions of @code{yylex}, @code{yyerror} and
	1915	@code{main} go at the end, in the epilogue of the grammar file
	1916	(@pxref{Grammar Layout, ,The Overall Layout of a Bison Grammar}).
	1917
	1918	For a large project, you would probably have several source files, and use
	1919	@code{make} to arrange to recompile them.
	1920
	1921	With all the source in the grammar file, you use the following command
	1922	to convert it into a parser implementation file:
	1923
	1924	@example
	1925	bison @var{file}.y
	1926	@end example
	1927
	1928	@noindent
	1929	In this example, the grammar file is called @file{rpcalc.y} (for
	1930	``Reverse Polish @sc{calc}ulator''). Bison produces a parser
	1931	implementation file named @file{@var{file}.tab.c}, removing the
	1932	@samp{.y} from the grammar file name. The parser implementation file
	1933	contains the source code for @code{yyparse}. The additional functions
	1934	in the grammar file (@code{yylex}, @code{yyerror} and @code{main}) are
	1935	copied verbatim to the parser implementation file.
	1936
	1937	@node Rpcalc Compile
	1938	@subsection Compiling the Parser Implementation File
	1939	@cindex compiling the parser
	1940
	1941	Here is how to compile and run the parser implementation file:
	1942
	1943	@example
	1944	@group
	1945	# @r{List files in current directory.}
	1946	$ @kbd{ls}
	1947	rpcalc.tab.c rpcalc.y
	1948	@end group
	1949
	1950	@group
	1951	# @r{Compile the Bison parser.}
	1952	# @r{@samp{-lm} tells the compiler to search the math library for @code{pow}.}
	1953	$ @kbd{cc -o rpcalc rpcalc.tab.c -lm}
	1954	@end group
	1955
	1956	@group
	1957	# @r{List files again.}
	1958	$ @kbd{ls}
	1959	rpcalc rpcalc.tab.c rpcalc.y
	1960	@end group
	1961	@end example
	1962
	1963	The file @file{rpcalc} now contains the executable code. Here is an
	1964	example session using @code{rpcalc}.
	1965
	1966	@example
	1967	$ @kbd{rpcalc}
	1968	@kbd{4 9 +}
	1969	@result{} 13
	1970	@kbd{3 7 + 3 4 5 *+-}
	1971	@result{} -13
	1972	@kbd{3 7 + 3 4 5 * + - n} @r{Note the unary minus, @samp{n}}
	1973	@result{} 13
	1974	@kbd{5 6 / 4 n +}
	1975	@result{} -3.166666667
	1976	@kbd{3 4 ^} @r{Exponentiation}
	1977	@result{} 81
	1978	@kbd{^D} @r{End-of-file indicator}
	1979	$
	1980	@end example
	1981
	1982	@node Infix Calc
	1983	@section Infix Notation Calculator: @code{calc}
	1984	@cindex infix notation calculator
	1985	@cindex @code{calc}
	1986	@cindex calculator, infix notation
	1987
	1988	We now modify rpcalc to handle infix operators instead of postfix. Infix
	1989	notation involves the concept of operator precedence and the need for
	1990	parentheses nested to arbitrary depth. Here is the Bison code for
	1991	@file{calc.y}, an infix desk-top calculator.
	1992
	1993	@example
	1994	/* Infix notation calculator. */
	1995
	1996	@group
	1997	%@{
	1998	#include <math.h>
	1999	#include <stdio.h>
	2000	int yylex (void);
	2001	void yyerror (char const *);
	2002	%@}
	2003	@end group
	2004
	2005	@group
	2006	/* Bison declarations. */
	2007	%define api.value.type @{double@}
	2008	%token NUM
	2009	%left '-' '+'
	2010	%left '*' '/'
	2011	%precedence NEG /* negation--unary minus */
	2012	%right '^' /* exponentiation */
	2013	@end group
	2014
	2015	%% /* The grammar follows. */
	2016	@group
	2017	input:
	2018	  %empty
	2019	| input line
	2020	;
	2021	@end group
	2022
	2023	@group
	2024	line:
	2025	  '\n'
	2026	| exp '\n'  @{ printf ("\t%.10g\n", $1); @}
	2027	;
	2028	@end group
	2029
	2030	@group
	2031	exp:
	2032	  NUM                  @{ $$ = $1;           @}
	2033	| exp '+' exp          @{ $$ = $1 + $3;      @}
	2034	| exp '-' exp          @{ $$ = $1 - $3;      @}
	2035	| exp '*' exp          @{ $$ = $1 * $3;      @}
	2036	| exp '/' exp          @{ $$ = $1 / $3;      @}
	2037	| '-' exp  %prec NEG   @{ $$ = -$2;          @}
	2038	| exp '^' exp          @{ $$ = pow ($1, $3); @}
	2039	| '(' exp ')'          @{ $$ = $2;           @}
	2040	;
	2041	@end group
	2042	%%
	2043	@end example
	2044
	2045	@noindent
	2046	The functions @code{yylex}, @code{yyerror} and @code{main} can be the
	2047	same as before.
	2048
	2049	There are two important new features shown in this code.
	2050
	2051	In the second section (Bison declarations), @code{%left} declares token
	2052	types and says they are left-associative operators. The declarations
	2053	@code{%left} and @code{%right} (right associativity) take the place of
	2054	@code{%token} which is used to declare a token type name without
	2055	associativity/precedence. (These tokens are single-character literals, which
	2056	ordinarily don't need to be declared. We declare them here to specify
	2057	the associativity/precedence.)
	2058
	2059	Operator precedence is determined by the line ordering of the
	2060	declarations; the higher the line number of the declaration (lower on
	2061	the page or screen), the higher the precedence. Hence, exponentiation
	2062	has the highest precedence, unary minus (@code{NEG}) is next, followed
	2063	by @samp{*} and @samp{/}, and so on. Unary minus is not associative;
	2064	only its precedence matters (hence @code{%precedence}). @xref{Precedence, ,Operator
	2065	Precedence}.
	2066
	2067	The other important new feature is the @code{%prec} in the grammar
	2068	section for the unary minus operator. The @code{%prec} simply instructs
	2069	Bison that the rule @samp{| '-' exp} has the same precedence as
	2070	@code{NEG}---in this case the next-to-highest. @xref{Contextual
	2071	Precedence, ,Context-Dependent Precedence}.
	2072
	2073	Here is a sample run of @file{calc.y}:
	2074
	2075	@need 500
	2076	@example
	2077	$ @kbd{calc}
	2078	@kbd{4 + 4.5 - (34/(8*3+-3))}
	2079	6.880952381
	2080	@kbd{-56 + 2}
	2081	-54
	2082	@kbd{3 ^ 2}
	2083	9
	2084	@end example
	2085
	2086	@node Simple Error Recovery
	2087	@section Simple Error Recovery
	2088	@cindex error recovery, simple
	2089
	2090	Up to this point, this manual has not addressed the issue of @dfn{error
	2091	recovery}---how to continue parsing after the parser detects a syntax
	2092	error. All we have handled is error reporting with @code{yyerror}.
	2093	Recall that by default @code{yyparse} returns after calling
	2094	@code{yyerror}. This means that an erroneous input line causes the
	2095	calculator program to exit. Now we show how to rectify this deficiency.
	2096
	2097	The Bison language itself includes the reserved word @code{error}, which
	2098	may be included in the grammar rules. In the example below it has
	2099	been added to one of the alternatives for @code{line}:
	2100
	2101	@example
	2102	@group
	2103	line:
	2104	  '\n'
	2105	| exp '\n'   @{ printf ("\t%.10g\n", $1); @}
	2106	| error '\n' @{ yyerrok;                  @}
	2107	;
	2108	@end group
	2109	@end example
	2110
	2111	This addition to the grammar allows for simple error recovery in the
	2112	event of a syntax error. If an expression that cannot be evaluated is
	2113	read, the error will be recognized by the third rule for @code{line},
	2114	and parsing will continue. (The @code{yyerror} function is still called
	2115	upon to print its message as well.) The action executes the statement
	2116	@code{yyerrok}, a macro defined automatically by Bison; its meaning is
	2117	that error recovery is complete (@pxref{Error Recovery}). Note the
	2118	difference between @code{yyerrok} and @code{yyerror}; neither one is a
	2119	misprint.
	2120
	2121	This form of error recovery deals with syntax errors. There are other
	2122	kinds of errors; for example, division by zero, which on some systems raises an exception
	2123	signal that is normally fatal. A real calculator program must handle this
	2124	signal and use @code{longjmp} to return to @code{main} and resume parsing
	2125	input lines; it would also have to discard the rest of the current line of
	2126	input. We won't discuss this issue further because it is not specific to
	2127	Bison programs.
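
One possible shape for such recovery is sketched below. It is only an
illustration under stated assumptions, not code from the calculator: the
names @code{recover_point}, @code{on_fpe} and @code{guarded_divide} are
hypothetical, and the POSIX functions @code{sigsetjmp} and
@code{siglongjmp} are used in place of plain @code{setjmp} and
@code{longjmp} so that the signal mask is restored after the jump. The
sketch works on integers and simulates the signal with @code{raise},
because an actual integer division by zero is undefined behavior in C and
floating-point division by zero often yields an infinity rather than a
signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>

/* Hypothetical recovery point; a real program would set it in main.  */
static sigjmp_buf recover_point;

/* Signal handler: abandon the computation and jump back.  */
static void
on_fpe (int sig)
{
  (void) sig;
  siglongjmp (recover_point, 1);
}

/* Store NUM / DEN in *RESULT and return 1, or return 0 if the division
   raised SIGFPE.  */
int
guarded_divide (int num, int den, int *result)
{
  signal (SIGFPE, on_fpe);

  /* sigsetjmp returns 0 now, and nonzero when reached via siglongjmp.  */
  if (sigsetjmp (recover_point, 1) != 0)
    return 0;

  /* Simulate the trap instead of executing an undefined division.  */
  if (den == 0)
    raise (SIGFPE);

  *result = num / den;
  return 1;
}
```

In a calculator, the branch taken after the jump would also discard the
rest of the current input line before parsing resumes.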
	2128
	2129	@node Location Tracking Calc
	2130	@section Location Tracking Calculator: @code{ltcalc}
	2131	@cindex location tracking calculator
	2132	@cindex @code{ltcalc}
	2133	@cindex calculator, location tracking
	2134
	2135	This example extends the infix notation calculator with location
	2136	tracking. This feature will be used to improve the error messages. For
	2137	the sake of clarity, this example is a simple integer calculator, since
	2138	most of the work needed to use locations will be done in the lexical
	2139	analyzer.
	2140
	2141	@menu
	2142	* Ltcalc Declarations:: Bison and C declarations for ltcalc.
	2143	* Ltcalc Rules:: Grammar rules for ltcalc, with explanations.
	2144	* Ltcalc Lexer:: The lexical analyzer.
	2145	@end menu
	2146
	2147	@node Ltcalc Declarations
	2148	@subsection Declarations for @code{ltcalc}
	2149
	2150	The C and Bison declarations for the location tracking calculator are
	2151	nearly the same as those for the infix notation calculator; the semantic value type is now @code{int}.
	2152
	2153	@example
	2154	/* Location tracking calculator. */
	2155
	2156	%@{
		#include <ctype.h>
	2157	#include <math.h>
		#include <stdio.h>
	2158	int yylex (void);
	2159	void yyerror (char const *);
	2160	%@}
	2161
	2162	/* Bison declarations. */
	2163	%define api.value.type @{int@}
	2164	%token NUM
	2165
	2166	%left '-' '+'
	2167	%left '*' '/'
	2168	%precedence NEG
	2169	%right '^'
	2170
	2171	%% /* The grammar follows. */
	2172	@end example
	2173
	2174	@noindent
	2175	Note there are no declarations specific to locations. Defining a data
	2176	type for storing locations is not needed: we will use the type provided
	2177	by default (@pxref{Location Type, ,Data Types of Locations}), which is a
	2178	four member structure with the following integer fields:
	2179	@code{first_line}, @code{first_column}, @code{last_line} and
	2180	@code{last_column}. By convention, and in accordance with the GNU
	2181	Coding Standards and common practice, the line and column count both
	2182	start at 1.
	2183
	2184	@node Ltcalc Rules
	2185	@subsection Grammar Rules for @code{ltcalc}
	2186
	2187	Whether or not you handle locations has no effect on the syntax of your
	2188	language. Therefore, grammar rules for this example will be very close
	2189	to those of the previous example: we will only modify them to benefit
	2190	from the new information.
	2191
	2192	Here, we will use locations to report divisions by zero, and to locate the
	2193	offending expressions or subexpressions.
	2194
	2195	@example
	2196	@group
	2197	input:
	2198	  %empty
	2199	| input line
	2200	;
	2201	@end group
	2202
	2203	@group
	2204	line:
	2205	  '\n'
	2206	| exp '\n' @{ printf ("%d\n", $1); @}
	2207	;
	2208	@end group
	2209
	2210	@group
	2211	exp:
	2212	  NUM           @{ $$ = $1; @}
	2213	| exp '+' exp   @{ $$ = $1 + $3; @}
	2214	| exp '-' exp   @{ $$ = $1 - $3; @}
	2215	| exp '*' exp   @{ $$ = $1 * $3; @}
	2216	@end group
	2217	@group
	2218	| exp '/' exp
	2219	    @{
	2220	      if ($3)
	2221	        $$ = $1 / $3;
	2222	      else
	2223	        @{
	2224	          $$ = 1;
	2225	          fprintf (stderr, "%d.%d-%d.%d: division by zero\n",
	2226	                   @@3.first_line, @@3.first_column,
	2227	                   @@3.last_line, @@3.last_column);
	2228	        @}
	2229	    @}
	2230	@end group
	2231	@group
	2232	| '-' exp %prec NEG     @{ $$ = -$2; @}
	2233	| exp '^' exp           @{ $$ = pow ($1, $3); @}
	2234	| '(' exp ')'           @{ $$ = $2; @}
	2235	@end group
	2236	@end example
	2237
	2238	This code shows how to reach locations inside of semantic actions, by
	2239	using the pseudo-variables @code{@@@var{n}} for rule components, and the
	2240	pseudo-variable @code{@@$} for groupings.
	2241
	2242	We don't need to assign a value to @code{@@$}: the output parser does it
	2243	automatically. By default, before executing the C code of each action,
	2244	@code{@@$} is set to range from the beginning of @code{@@1} to the end
	2245	of @code{@@@var{n}}, for a rule with @var{n} components. This behavior
	2246	can be redefined (@pxref{Location Default Action, , Default Action for
	2247	Locations}), and for very specific rules, @code{@@$} can be computed by
	2248	hand.
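
The default computation described above can be pictured with a small C
sketch. The structure mirrors the four integer fields of the default
location type; the names @code{sketch_loc} and @code{span_locations} are
hypothetical, and the code only imitates the default behavior for a rule
with at least one component. It is not the code Bison actually uses.

```c
/* Hypothetical mirror of the default location type.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} sketch_loc;

/* Compute the location of a grouping from the locations RHS[0..N-1] of
   its N components (N >= 1): it starts where the first component starts
   and ends where the last component ends.  */
sketch_loc
span_locations (const sketch_loc *rhs, int n)
{
  sketch_loc res;
  res.first_line = rhs[0].first_line;
  res.first_column = rhs[0].first_column;
  res.last_line = rhs[n - 1].last_line;
  res.last_column = rhs[n - 1].last_column;
  return res;
}
```

For the rule @samp{exp '/' exp}, @var{n} is 3, so the resulting location
runs from the start of the left operand to the end of the right operand.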
	2249
	2250	@node Ltcalc Lexer
	2251	@subsection The @code{ltcalc} Lexical Analyzer
	2252
	2253	Until now, we relied on Bison's defaults to enable location
	2254	tracking. The next step is to rewrite the lexical analyzer, and make it
	2255	able to feed the parser with the token locations, as it already does for
	2256	semantic values.
	2257
	2258	To this end, we must take into account every single character of the
	2259	input text, to keep the computed locations from being fuzzy or wrong:
	2260
	2261	@example
	2262	@group
	2263	int
	2264	yylex (void)
	2265	@{
	2266	  int c;
	2267	@end group
	2268
	2269	@group
	2270	  /* Skip white space. */
	2271	  while ((c = getchar ()) == ' ' || c == '\t')
	2272	    ++yylloc.last_column;
	2273	@end group
	2274
	2275	@group
	2276	  /* Step. */
	2277	  yylloc.first_line = yylloc.last_line;
	2278	  yylloc.first_column = yylloc.last_column;
	2279	@end group
	2280
	2281	@group
	2282	  /* Process numbers. */
	2283	  if (isdigit (c))
	2284	    @{
	2285	      yylval = c - '0';
	2286	      ++yylloc.last_column;
	2287	      while (isdigit (c = getchar ()))
	2288	        @{
	2289	          ++yylloc.last_column;
	2290	          yylval = yylval * 10 + c - '0';
	2291	        @}
	2292	      ungetc (c, stdin);
	2293	      return NUM;
	2294	    @}
	2295	@end group
	2296
	2297	  /* Return end-of-input. */
	2298	  if (c == EOF)
	2299	    return 0;
	2300
	2301	@group
	2302	  /* Return a single char, and update location. */
	2303	  if (c == '\n')
	2304	    @{
	2305	      ++yylloc.last_line;
	2306	      yylloc.last_column = 0;
	2307	    @}
	2308	  else
	2309	    ++yylloc.last_column;
	2310	  return c;
	2311	@}
	2312	@end group
	2313	@end example
	2314
	2315	Basically, the lexical analyzer performs the same processing as before:
	2316	it skips blanks and tabs, and reads numbers or single-character tokens.
	2317	In addition, it updates @code{yylloc}, the global variable (of type
	2318	@code{YYLTYPE}) containing the token's location.
	2319
	2320	Now, each time this function returns a token, the parser has its number
	2321	as well as its semantic value, and its location in the text. The last
	2322	needed change is to initialize @code{yylloc}, for example in the
	2323	controlling function:
	2324
	2325	@example
	2326	@group
	2327	int
	2328	main (void)
	2329	@{
	2330	yylloc.first_line = yylloc.last_line = 1;
	2331	yylloc.first_column = yylloc.last_column = 0;
	2332	return yyparse ();
	2333	@}
	2334	@end group
	2335	@end example
	2336
	2337	Remember that computing locations is not a matter of syntax. Every
	2338	character must be associated with a location update, whether it is in
	2339	valid input, in comments, in literal strings, and so on.
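
One way to make sure that no character is missed is to route every
character read through a single helper that updates the location. The
following is only a sketch: the type @code{loc_t} and the functions
@code{loc_step}, @code{loc_advance} and @code{loc_advance_string} are
hypothetical names, not part of Bison, and the column convention (reset
to 0 after a newline) simply mirrors the lexer above.

```c
#include <stddef.h>

/* Hypothetical stand-in for YYLTYPE, with the same four fields.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} loc_t;

/* Make the start of the next token coincide with the end of the
   previous one, as the "Step" part of the lexer does.  */
void
loc_step (loc_t *loc)
{
  loc->first_line = loc->last_line;
  loc->first_column = loc->last_column;
}

/* Account for one character C, whatever its syntactic role.  */
void
loc_advance (loc_t *loc, int c)
{
  if (c == '\n')
    {
      ++loc->last_line;
      loc->last_column = 0;
    }
  else
    ++loc->last_column;
}

/* Account for every character of the string S, for instance the
   text of a comment or of a literal string.  */
void
loc_advance_string (loc_t *loc, char const *s)
{
  size_t i;
  for (i = 0; s[i] != '\0'; ++i)
    loc_advance (loc, s[i]);
}
```

A lexer written this way calls @code{loc_advance} once per character
consumed, including characters it later discards.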
	2340
	2341	@node Multi-function Calc
	2342	@section Multi-Function Calculator: @code{mfcalc}
	2343	@cindex multi-function calculator
	2344	@cindex @code{mfcalc}
	2345	@cindex calculator, multi-function
	2346
	2347	Now that the basics of Bison have been discussed, it is time to move on to
	2348	a more advanced problem. The above calculators provided only five
	2349	functions, @samp{+}, @samp{-}, @samp{*}, @samp{/} and @samp{^}. It would
	2350	be nice to have a calculator that provides other mathematical functions such
	2351	as @code{sin}, @code{cos}, etc.
	2352
	2353	It is easy to add new operators to the infix calculator as long as they are
	2354	only single-character literals. The lexical analyzer @code{yylex} passes
	2355	back all nonnumeric characters as tokens, so new grammar rules suffice for
	2356	adding a new operator. But we want something more flexible: built-in
	2357	functions whose syntax has this form:
	2358
	2359	@example
	2360	@var{function_name} (@var{argument})
	2361	@end example
	2362
	2363	@noindent
	2364	At the same time, we will add memory to the calculator, by allowing you
	2365	to create named variables, store values in them, and use them later.
	2366	Here is a sample session with the multi-function calculator:
	2367
	2368	@example
	2369	@group
	2370	$ @kbd{mfcalc}
	2371	@kbd{pi = 3.141592653589}
	2372	@result{} 3.1415926536
	2373	@end group
	2374	@group
	2375	@kbd{sin(pi)}
	2376	@result{} 0.0000000000
	2377	@end group
	2378	@kbd{alpha = beta1 = 2.3}
	2379	@result{} 2.3000000000
	2380	@kbd{alpha}
	2381	@result{} 2.3000000000
	2382	@kbd{ln(alpha)}
	2383	@result{} 0.8329091229
	2384	@kbd{exp(ln(beta1))}
	2385	@result{} 2.3000000000
	2386	$
	2387	@end example
	2388
	2389	Note that multiple assignment and nested function calls are permitted.
	2390
	2391	@menu
	2392	* Mfcalc Declarations:: Bison declarations for multi-function calculator.
	2393	* Mfcalc Rules:: Grammar rules for the calculator.
	2394	* Mfcalc Symbol Table:: Symbol table management subroutines.
	2395	* Mfcalc Lexer:: The lexical analyzer.
	2396	* Mfcalc Main:: The controlling function.
	2397	@end menu
	2398
	2399	@node Mfcalc Declarations
	2400	@subsection Declarations for @code{mfcalc}
	2401
	2402	Here are the C and Bison declarations for the multi-function calculator.
	2403
	2404	@comment file: mfcalc.y: 1
	2405	@example
	2406	@group
	2407	%@{
	2408	#include <stdio.h> /* For printf, etc. */
	2409	#include <math.h> /* For pow, used in the grammar. */
	2410	#include "calc.h" /* Contains definition of 'symrec'. */
	2411	int yylex (void);
	2412	void yyerror (char const *);
	2413	%@}
	2414	@end group
	2415
	2416	%define api.value.type union /* Generate YYSTYPE from these types: */
	2417	%token <double> NUM /* Simple double precision number. */
	2418	%token <symrec*> VAR FNCT /* Symbol table pointer: variable and function. */
	2419	%type <double> exp
	2420
	2421	@group
	2422	%precedence '='
	2423	%left '-' '+'
	2424	%left '*' '/'
	2425	%precedence NEG /* negation--unary minus */
	2426	%right '^' /* exponentiation */
	2427	@end group
	2428	@end example
	2429
	2430	The above grammar introduces only two new features of the Bison language.
	2431	These features allow semantic values to have various data types
	2432	(@pxref{Multiple Types, ,More Than One Value Type}).
	2433
	2434	The special @code{union} value assigned to the @code{%define} variable
	2435	@code{api.value.type} specifies that the symbols are defined with their data
	2436	types. Bison will generate an appropriate definition of @code{YYSTYPE} to
	2437	store these values.
	2438
	2439	Since values can now have various types, it is necessary to associate a type
	2440	with each grammar symbol whose semantic value is used. These symbols are
	2441	@code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their declarations are
	2442	augmented with their data type (placed between angle brackets). For
	2443	instance, values of @code{NUM} are stored in @code{double}.
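
To picture what @code{%define api.value.type union} produces, the
hand-written C sketch below shows a union with one member per typed
symbol declared above. It is an approximation for illustration only:
the names @code{value_sketch} and @code{make_num} are hypothetical, and
the exact text that Bison generates may differ between versions.

```c
/* Minimal stand-in for the 'symrec' type declared in calc.h.  */
typedef struct symrec symrec;

/* Hypothetical equivalent of the generated YYSTYPE: one member per
   symbol that was given a type in the declarations.  */
union value_sketch
{
  double NUM;    /* %token <double> NUM */
  symrec *VAR;   /* %token <symrec*> VAR */
  symrec *FNCT;  /* %token <symrec*> FNCT */
  double exp;    /* %type <double> exp */
};

/* Store a number the way a lexer stores the value of a NUM token.  */
union value_sketch
make_num (double d)
{
  union value_sketch v;
  v.NUM = d;
  return v;
}
```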
	2444
	2445	The Bison construct @code{%type} is used for declaring nonterminal symbols,
	2446	just as @code{%token} is used for declaring token types. We did
	2447	not use @code{%type} before because nonterminal symbols are normally
	2448	declared implicitly by the rules that define them. But @code{exp} must be
	2449	declared explicitly so we can specify its value type. @xref{Type Decl,
	2450	,Nonterminal Symbols}.
	2451
	2452	@node Mfcalc Rules
	2453	@subsection Grammar Rules for @code{mfcalc}
	2454
	2455	Here are the grammar rules for the multi-function calculator.
	2456	Most of them are copied directly from @code{calc}; three rules,
	2457	those which mention @code{VAR} or @code{FNCT}, are new.
	2458
	2459	@comment file: mfcalc.y: 3
	2460	@example
	2461	%% /* The grammar follows. */
	2462	@group
	2463	input:
	2464	  %empty
	2465	| input line
	2466	;
	2467	@end group
	2468
	2469	@group
	2470	line:
	2471	  '\n'
	2472	| exp '\n'   @{ printf ("%.10g\n", $1); @}
	2473	| error '\n' @{ yyerrok; @}
	2474	;
	2475	@end group
	2476
	2477	@group
	2478	exp:
	2479	  NUM                @{ $$ = $1; @}
	2480	| VAR                @{ $$ = $1->value.var; @}
	2481	| VAR '=' exp        @{ $$ = $3; $1->value.var = $3; @}
	2482	| FNCT '(' exp ')'   @{ $$ = (*($1->value.fnctptr))($3); @}
	2483	| exp '+' exp        @{ $$ = $1 + $3; @}
	2484	| exp '-' exp        @{ $$ = $1 - $3; @}
	2485	| exp '*' exp        @{ $$ = $1 * $3; @}
	2486	| exp '/' exp        @{ $$ = $1 / $3; @}
	2487	| '-' exp  %prec NEG @{ $$ = -$2; @}
	2488	| exp '^' exp        @{ $$ = pow ($1, $3); @}
	2489	| '(' exp ')'        @{ $$ = $2; @}
	2490	;
	2491	@end group
	2492	/* End of grammar. */
	2493	%%
	2494	@end example
	2495
	2496	@node Mfcalc Symbol Table
	2497	@subsection The @code{mfcalc} Symbol Table
	2498	@cindex symbol table example
	2499
	2500	The multi-function calculator requires a symbol table to keep track of the
	2501	names and meanings of variables and functions. This doesn't affect the
	2502	grammar rules (except for the actions) or the Bison declarations, but it
	2503	requires some additional C functions for support.
	2504
	2505	The symbol table itself consists of a linked list of records. Its
	2506	definition, which is kept in the header @file{calc.h}, is as follows. It
	2507	provides for either functions or variables to be placed in the table.
	2508
	2509	@comment file: calc.h
	2510	@example
	2511	@group
	2512	/* Function type. */
	2513	typedef double (*func_t) (double);
	2514	@end group
	2515
	2516	@group
	2517	/* Data type for links in the chain of symbols. */
	2518	struct symrec
	2519	@{
	2520	  char *name;  /* name of symbol */
	2521	  int type;    /* type of symbol: either VAR or FNCT */
	2522	  union
	2523	  @{
	2524	    double var;      /* value of a VAR */
	2525	    func_t fnctptr;  /* value of a FNCT */
	2526	  @} value;
	2527	  struct symrec *next;  /* link field */
	2528	@};
	2529	@end group
	2530
	2531	@group
	2532	typedef struct symrec symrec;
	2533
	2534	/* The symbol table: a chain of 'struct symrec'. */
	2535	extern symrec *sym_table;
	2536
	2537	symrec *putsym (char const *, int);
	2538	symrec *getsym (char const *);
	2539	@end group
	2540	@end example
	2541
	2542	The new version of @code{main} will call @code{init_table} to initialize
	2543	the symbol table:
	2544
	2545	@comment file: mfcalc.y: 3
	2546	@example
	2547	@group
	2548	struct init
	2549	@{
	2550	char const *fname;
	2551	double (*fnct) (double);
	2552	@};
	2553	@end group
	2554
	2555	@group
	2556	struct init const arith_fncts[] =
	2557	@{
	2558	@{ "atan", atan @},
	2559	@{ "cos", cos @},
	2560	@{ "exp", exp @},
	2561	@{ "ln", log @},
	2562	@{ "sin", sin @},
	2563	@{ "sqrt", sqrt @},
	2564	@{ 0, 0 @},
	2565	@};
	2566	@end group
	2567
	2568	@group
	2569	/* The symbol table: a chain of 'struct symrec'. */
	2570	symrec *sym_table;
	2571	@end group
	2572
	2573	@group
	2574	/* Put arithmetic functions in table. */
	2575	static
	2576	void
	2577	init_table (void)
	2578	@{
	2579	int i;
	2580	for (i = 0; arith_fncts[i].fname != 0; i++)
	2581	@{
	2582	symrec *ptr = putsym (arith_fncts[i].fname, FNCT);
	2583	ptr->value.fnctptr = arith_fncts[i].fnct;
	2584	@}
	2585	@}
	2586	@end group
	2587	@end example
	2588
	2589	By simply editing the initialization list and adding the necessary include
	2590	files, you can add additional functions to the calculator.
	2591
	2592	Two important functions allow look-up and installation of symbols in the
	2593	symbol table. The function @code{putsym} is passed a name and the type
	2594	(@code{VAR} or @code{FNCT}) of the object to be installed. The object is
	2595	linked to the front of the list, and a pointer to the object is returned.
	2596	The function @code{getsym} is passed the name of the symbol to look up. If
	2597	found, a pointer to that symbol is returned; otherwise zero is returned.
	2598
	2599	@comment file: mfcalc.y: 3
	2600	@example
	2601	#include <stdlib.h> /* malloc. */
	2602	#include <string.h> /* strlen. */
	2603
	2604	@group
	2605	symrec *
	2606	putsym (char const *sym_name, int sym_type)
	2607	@{
	2608	  symrec *ptr = (symrec *) malloc (sizeof (symrec));
	2609	  ptr->name = (char *) malloc (strlen (sym_name) + 1);
	2610	  strcpy (ptr->name, sym_name);
	2611	  ptr->type = sym_type;
	2612	  ptr->value.var = 0; /* Set value to 0 even if fctn. */
	2613	  ptr->next = (struct symrec *)sym_table;
	2614	  sym_table = ptr;
	2615	  return ptr;
	2616	@}
	2617	@end group
	2618
	2619	@group
	2620	symrec *
	2621	getsym (char const *sym_name)
	2622	@{
	2623	symrec *ptr;
	2624	for (ptr = sym_table; ptr != (symrec *) 0;
	2625	ptr = (symrec *)ptr->next)
	2626	if (strcmp (ptr->name, sym_name) == 0)
	2627	return ptr;
	2628	return 0;
	2629	@}
	2630	@end group
	2631	@end example
	2632
	2633	@node Mfcalc Lexer
	2634	@subsection The @code{mfcalc} Lexer
	2635
	2636	The function @code{yylex} must now recognize variables, numeric values, and
	2637	the single-character arithmetic operators. Strings of alphanumeric
	2638	characters with a leading letter are recognized as either variables or
	2639	functions depending on what the symbol table says about them.
	2640
	2641	The string is passed to @code{getsym} for lookup in the symbol table. If
	2642	the name appears in the table, a pointer to its location and its type
	2643	(@code{VAR} or @code{FNCT}) are returned to @code{yyparse}. If it is not
	2644	already in the table, then it is installed as a @code{VAR} using
	2645	@code{putsym}. Again, a pointer and its type (which must be @code{VAR}) are
	2646	returned to @code{yyparse}.
	2647
	2648	No change is needed in the handling of numeric values and arithmetic
	2649	operators in @code{yylex}.
	2650
	2651	@comment file: mfcalc.y: 3
	2652	@example
	2653	#include <ctype.h>
	2654
	2655	@group
	2656	int
	2657	yylex (void)
	2658	@{
	2659	int c;
	2660
	2661	/* Ignore white space, get first nonwhite character. */
	2662	while ((c = getchar ()) == ' ' || c == '\t')
	2663	continue;
	2664
	2665	if (c == EOF)
	2666	return 0;
	2667	@end group
	2668
	2669	@group
	2670	/* Char starts a number => parse the number. */
	2671	if (c == '.' || isdigit (c))
	2672	@{
	2673	ungetc (c, stdin);
	2674	scanf ("%lf", &yylval.NUM);
	2675	return NUM;
	2676	@}
	2677	@end group
	2678	@end example
	2679
	2680	@noindent
	2681	Bison generated a definition of @code{YYSTYPE} with a member named
	2682	@code{NUM} to store the value of @code{NUM} symbols.
	2683
	2684	@comment file: mfcalc.y: 3
	2685	@example
	2686	@group
	2687	/* Char starts an identifier => read the name. */
	2688	if (isalpha (c))
	2689	@{
	2690	/* Initially make the buffer long enough
	2691	for a 40-character symbol name. */
	2692	static size_t length = 40;
	2693	static char *symbuf = 0;
	2694	symrec *s;
	2695	int i;
	2696	@end group
	2697	if (!symbuf)
	2698	symbuf = (char *) malloc (length + 1);
	2699
	2700	i = 0;
	2701	do
	2702	@group
	2703	@{
	2704	/* If buffer is full, make it bigger. */
	2705	if (i == length)
	2706	@{
	2707	length *= 2;
	2708	symbuf = (char *) realloc (symbuf, length + 1);
	2709	@}
	2710	/* Add this character to the buffer. */
	2711	symbuf[i++] = c;
	2712	/* Get another character. */
	2713	c = getchar ();
	2714	@}
	2715	@end group
	2716	@group
	2717	while (isalnum (c));
	2718
	2719	ungetc (c, stdin);
	2720	symbuf[i] = '\0';
	2721	@end group
	2722
	2723	@group
	2724	s = getsym (symbuf);
	2725	if (s == 0)
	2726	s = putsym (symbuf, VAR);
	2727	*((symrec**) &yylval) = s;
	2728	return s->type;
	2729	@}
	2730
	2731	/* Any other character is a token by itself. */
	2732	return c;
	2733	@}
	2734	@end group
	2735	@end example
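
The lookup-or-install step described at the start of this section can be
isolated from the character reading. The sketch below is a
self-contained reduction of it, not the manual's code: @code{sym_t},
@code{sym_head}, @code{sym_lookup}, @code{sym_install} and
@code{classify_name} are hypothetical names standing in for
@code{symrec}, @code{sym_table}, @code{getsym}, @code{putsym} and the
tail of @code{yylex}, and @code{TOK_VAR} and @code{TOK_FNCT} are assumed
codes standing in for the Bison-generated @code{VAR} and @code{FNCT}.
Unlike the code above, it also checks the result of @code{malloc}.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; Bison would generate the real ones.  */
enum { TOK_VAR = 258, TOK_FNCT = 259 };

/* Reduced symbol record: a name, a token type, and a link.  */
typedef struct sym_t
{
  char *name;
  int type;
  struct sym_t *next;
} sym_t;

/* Head of the chain of symbols.  */
sym_t *sym_head = 0;

/* Return the record named NAME, or 0 if there is none.  */
sym_t *
sym_lookup (char const *name)
{
  sym_t *p;
  for (p = sym_head; p != 0; p = p->next)
    if (strcmp (p->name, name) == 0)
      return p;
  return 0;
}

/* Link a new record at the front of the chain; return 0 on
   allocation failure.  */
sym_t *
sym_install (char const *name, int type)
{
  sym_t *p = malloc (sizeof *p);
  if (p == 0)
    return 0;
  p->name = malloc (strlen (name) + 1);
  if (p->name == 0)
    {
      free (p);
      return 0;
    }
  strcpy (p->name, name);
  p->type = type;
  p->next = sym_head;
  sym_head = p;
  return p;
}

/* Token type for the identifier NAME: known names keep their type,
   unknown names are installed as variables.  Return 0 if memory
   runs out.  */
int
classify_name (char const *name)
{
  sym_t *s = sym_lookup (name);
  if (s == 0)
    s = sym_install (name, TOK_VAR);
  return s ? s->type : 0;
}
```

Because unknown names are installed on first sight, a second lookup of
the same name finds the record created by the first.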
	2736
	2737	@node Mfcalc Main
	2738	@subsection The @code{mfcalc} Main
	2739
	2740	The error reporting function is unchanged, and the new version of
	2741	@code{main} includes a call to @code{init_table} and sets @code{yydebug}
	2742	on user demand (@pxref{Tracing, , Tracing Your Parser}, for details):
	2743
	2744	@comment file: mfcalc.y: 3
	2745	@example
	2746	@group
	2747	/* Called by yyparse on error. */
	2748	void
	2749	yyerror (char const *s)
	2750	@{
	2751	fprintf (stderr, "%s\n", s);
	2752	@}
	2753	@end group
	2754
	2755	@group
	2756	int
	2757	main (int argc, char const* argv[])
	2758	@{
	2759	int i;
	2760	/* Enable parse traces on option -p. */
	2761	for (i = 1; i < argc; ++i)
	2762	if (!strcmp(argv[i], "-p"))
	2763	yydebug = 1;
	2764	init_table ();
	2765	return yyparse ();
	2766	@}
	2767	@end group
	2768	@end example
	2769
	2770	This program is both powerful and flexible. You may easily add new
	2771	functions, and it is a simple job to modify this code to install
	2772	predefined variables such as @code{pi} or @code{e} as well.
	2773
	2774	@node Exercises
	2775	@section Exercises
	2776	@cindex exercises
	2777
	2778	@enumerate
	2779	@item
	2780	Add some new functions from @file{math.h} to the initialization list.
	2781
	2782	@item
	2783	Add another array that contains constants and their values. Then
	2784	modify @code{init_table} to add these constants to the symbol table.
	2785	It will be easiest to give the constants type @code{VAR}.
	2786
	2787	@item
	2788	Make the program report an error if the user refers to an
	2789	uninitialized variable in any way except to store a value in it.
	2790	@end enumerate
	2791
	2792	@node Grammar File
	2793	@chapter Bison Grammar Files
	2794
	2795	Bison takes as input a context-free grammar specification and produces a
	2796	C-language function that recognizes correct instances of the grammar.
	2797
	2798	The Bison grammar file conventionally has a name ending in @samp{.y}.
	2799	@xref{Invocation, ,Invoking Bison}.
	2800
	2801	@menu
	2802	* Grammar Outline:: Overall layout of the grammar file.
	2803	* Symbols:: Terminal and nonterminal symbols.
	2804	* Rules:: How to write grammar rules.
	2805	* Semantics:: Semantic values and actions.
	2806	* Tracking Locations:: Locations and actions.
	2807	* Named References:: Using named references in actions.
	2808	* Declarations:: All kinds of Bison declarations are described here.
	2809	* Multiple Parsers:: Putting more than one Bison parser in one program.
	2810	@end menu
	2811
	2812	@node Grammar Outline
	2813	@section Outline of a Bison Grammar
	2814	@cindex comment
	2815	@findex // @dots{}
	2816	@findex /* @dots{} */
	2817
	2818	A Bison grammar file has four main sections, shown here with the
	2819	appropriate delimiters:
	2820
	2821	@example
	2822	%@{
	2823	@var{Prologue}
	2824	%@}
	2825
	2826	@var{Bison declarations}
	2827
	2828	%%
	2829	@var{Grammar rules}
	2830	%%
	2831
	2832	@var{Epilogue}
	2833	@end example
	2834
	2835	Comments enclosed in @samp{/* @dots{} */} may appear in any of the sections.
	2836	As a GNU extension, @samp{//} introduces a comment that continues until end
	2837	of line.
	2838
	2839	@menu
	2840	* Prologue:: Syntax and usage of the prologue.
	2841	* Prologue Alternatives:: Syntax and usage of alternatives to the prologue.
	2842	* Bison Declarations:: Syntax and usage of the Bison declarations section.
	2843	* Grammar Rules:: Syntax and usage of the grammar rules section.
	2844	* Epilogue:: Syntax and usage of the epilogue.
	2845	@end menu
	2846
	2847	@node Prologue
	2848	@subsection The prologue
	2849	@cindex declarations section
	2850	@cindex Prologue
	2851	@cindex declarations
	2852
	2853	The @var{Prologue} section contains macro definitions and declarations
	2854	of functions and variables that are used in the actions in the grammar
	2855	rules. These are copied to the beginning of the parser implementation
	2856	file so that they precede the definition of @code{yyparse}. You can
	2857	use @samp{#include} to get the declarations from a header file. If
	2858	you don't need any C declarations, you may omit the @samp{%@{} and
	2859	@samp{%@}} delimiters that bracket this section.
	2860
	2861	The @var{Prologue} section is terminated by the first occurrence
	2862	of @samp{%@}} that is outside a comment, a string literal, or a
	2863	character constant.
	2864
	2865	You may have more than one @var{Prologue} section, intermixed with the
	2866	@var{Bison declarations}. This allows you to have C and Bison
	2867	declarations that refer to each other. For example, the @code{%union}
	2868	declaration may use types defined in a header file, and you may wish to
	2869	prototype functions that take arguments of type @code{YYSTYPE}. This
	2870	can be done with two @var{Prologue} blocks, one before and one after the
	2871	@code{%union} declaration.
	2872
	2873	@example
	2874	@group
	2875	%@{
	2876	#define _GNU_SOURCE
	2877	#include <stdio.h>
	2878	#include "ptypes.h"
	2879	%@}
	2880	@end group
	2881
	2882	@group
	2883	%union @{
	2884	long int n;
	2885	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	2886	@}
	2887	@end group
	2888
	2889	@group
	2890	%@{
	2891	static void print_token_value (FILE *, int, YYSTYPE);
	2892	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	2893	%@}
	2894	@end group
	2895
	2896	@dots{}
	2897	@end example
	2898
	2899	When in doubt, it is usually safer to put prologue code before all
	2900	Bison declarations, rather than after. For example, any definitions
	2901	of feature test macros like @code{_GNU_SOURCE} or
	2902	@code{_POSIX_C_SOURCE} should appear before all Bison declarations, as
	2903	feature test macros can affect the behavior of Bison-generated
	2904	@code{#include} directives.
	2905
	2906	@node Prologue Alternatives
	2907	@subsection Prologue Alternatives
	2908	@cindex Prologue Alternatives
	2909
	2910	@findex %code
	2911	@findex %code requires
	2912	@findex %code provides
	2913	@findex %code top
	2914
	2915	The functionality of @var{Prologue} sections can often be subtle and
	2916	inflexible. As an alternative, Bison provides a @code{%code}
	2917	directive with an explicit qualifier field, which identifies the
	2918	purpose of the code and thus the location(s) where Bison should
	2919	generate it. For C/C++, the qualifier can be omitted for the default
	2920	location, or it can be one of @code{requires}, @code{provides},
	2921	@code{top}. @xref{%code Summary}.
	2922
	2923	Look again at the example of the previous section:
	2924
	2925	@example
	2926	@group
	2927	%@{
	2928	#define _GNU_SOURCE
	2929	#include <stdio.h>
	2930	#include "ptypes.h"
	2931	%@}
	2932	@end group
	2933
	2934	@group
	2935	%union @{
	2936	long int n;
	2937	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	2938	@}
	2939	@end group
	2940
	2941	@group
	2942	%@{
	2943	static void print_token_value (FILE *, int, YYSTYPE);
	2944	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	2945	%@}
	2946	@end group
	2947
	2948	@dots{}
	2949	@end example
	2950
	2951	@noindent
	2952	Notice that there are two @var{Prologue} sections here, but there's a
	2953	subtle distinction between their functionality. For example, if you
	2954	decide to override Bison's default definition for @code{YYLTYPE}, in
	2955	which @var{Prologue} section should you write your new definition?
	2956	You should write it in the first since Bison will insert that code
	2957	into the parser implementation file @emph{before} the default
	2958	@code{YYLTYPE} definition. In which @var{Prologue} section should you
	2959	prototype an internal function, @code{trace_token}, that accepts
	2960	@code{YYLTYPE} and @code{yytokentype} as arguments? You should
	2961	prototype it in the second since Bison will insert that code
	2962	@emph{after} the @code{YYLTYPE} and @code{yytokentype} definitions.
	2963
	2964	This distinction in functionality between the two @var{Prologue} sections is
	2965	established by the appearance of the @code{%union} between them.
	2966	This behavior raises a few questions.
	2967	First, why should the position of a @code{%union} affect definitions related to
	2968	@code{YYLTYPE} and @code{yytokentype}?
	2969	Second, what if there is no @code{%union}?
	2970	In that case, the second kind of @var{Prologue} section is not available.
	2971	This behavior is not intuitive.
	2972
	2973	To avoid this subtle @code{%union} dependency, rewrite the example using a
	2974	@code{%code top} and an unqualified @code{%code}.
	2975	Let's go ahead and add the new @code{YYLTYPE} definition and the
	2976	@code{trace_token} prototype at the same time:
	2977
	2978	@example
	2979	%code top @{
	2980	#define _GNU_SOURCE
	2981	#include <stdio.h>
	2982
	2983	/* WARNING: The following code really belongs
	2984	* in a '%code requires'; see below. */
	2985
	2986	#include "ptypes.h"
	2987	#define YYLTYPE YYLTYPE
	2988	typedef struct YYLTYPE
	2989	@{
	2990	int first_line;
	2991	int first_column;
	2992	int last_line;
	2993	int last_column;
	2994	char *filename;
	2995	@} YYLTYPE;
	2996	@}
	2997
	2998	@group
	2999	%union @{
	3000	long int n;
	3001	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	3002	@}
	3003	@end group
	3004
	3005	@group
	3006	%code @{
	3007	static void print_token_value (FILE *, int, YYSTYPE);
	3008	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	3009	static void trace_token (enum yytokentype token, YYLTYPE loc);
	3010	@}
	3011	@end group
	3012
	3013	@dots{}
	3014	@end example
	3015
	3016	@noindent
	3017	In this way, @code{%code top} and the unqualified @code{%code} achieve the same
	3018	functionality as the two kinds of @var{Prologue} sections, but it's always
	3019	explicit which kind you intend.
	3020	Moreover, both kinds are always available even in the absence of @code{%union}.
	3021
	3022	The @code{%code top} block above logically contains two parts. The
	3023	first two lines before the warning need to appear near the top of the
	3024	parser implementation file. The first line after the warning is
	3025	required by @code{YYSTYPE} and thus also needs to appear in the parser
	3026	implementation file. However, if you've instructed Bison to generate
	3027	a parser header file (@pxref{Decl Summary, ,%defines}), you probably
	3028	want that line to appear before the @code{YYSTYPE} definition in that
	3029	header file as well. The @code{YYLTYPE} definition should also appear
	3030	in the parser header file to override the default @code{YYLTYPE}
	3031	definition there.
	3032
	3033	In other words, in the @code{%code top} block above, all but the first two
	3034	lines are dependency code required by the @code{YYSTYPE} and @code{YYLTYPE}
	3035	definitions.
	3036	Thus, they belong in one or more @code{%code requires}:
	3037
	3038	@example
	3039	@group
	3040	%code top @{
	3041	#define _GNU_SOURCE
	3042	#include <stdio.h>
	3043	@}
	3044	@end group
	3045
	3046	@group
	3047	%code requires @{
	3048	#include "ptypes.h"
	3049	@}
	3050	@end group
	3051	@group
	3052	%union @{
	3053	long int n;
	3054	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	3055	@}
	3056	@end group
	3057
	3058	@group
	3059	%code requires @{
	3060	#define YYLTYPE YYLTYPE
	3061	typedef struct YYLTYPE
	3062	@{
	3063	int first_line;
	3064	int first_column;
	3065	int last_line;
	3066	int last_column;
	3067	char *filename;
	3068	@} YYLTYPE;
	3069	@}
	3070	@end group
	3071
	3072	@group
	3073	%code @{
	3074	static void print_token_value (FILE *, int, YYSTYPE);
	3075	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	3076	static void trace_token (enum yytokentype token, YYLTYPE loc);
	3077	@}
	3078	@end group
	3079
	3080	@dots{}
	3081	@end example
	3082
	3083	@noindent
	3084	Now Bison will insert @code{#include "ptypes.h"} and the new
	3085	@code{YYLTYPE} definition before the Bison-generated @code{YYSTYPE}
	3086	and @code{YYLTYPE} definitions in both the parser implementation file
	3087	and the parser header file. (By the same reasoning, @code{%code
	3088	requires} would also be the appropriate place to write your own
	3089	definition for @code{YYSTYPE}.)
	3090
	3091	When you are writing dependency code for @code{YYSTYPE} and
	3092	@code{YYLTYPE}, you should prefer @code{%code requires} over
	3093	@code{%code top} regardless of whether you instruct Bison to generate
	3094	a parser header file. When you are writing code that you need Bison
	3095	to insert only into the parser implementation file and that has no
	3096	special need to appear at the top of that file, you should prefer the
	3097	unqualified @code{%code} over @code{%code top}. These practices will
	3098	make the purpose of each block of your code explicit to Bison and to
	3099	other developers reading your grammar file. Following these
	3100	practices, we expect the unqualified @code{%code} and @code{%code
	3101	requires} to be the most important of the four @var{Prologue}
	3102	alternatives.
	3103
	3104	At some point while developing your parser, you might decide to
	3105	provide @code{trace_token} to modules that are external to your
	3106	parser. Thus, you might wish for Bison to insert the prototype into
	3107	both the parser header file and the parser implementation file. Since
	3108	this function is not a dependency required by @code{YYSTYPE} or
	3109	@code{YYLTYPE}, it doesn't make sense to move its prototype to a
	3110	@code{%code requires}. More importantly, since it depends upon
	3111	@code{YYLTYPE} and @code{yytokentype}, @code{%code requires} is not
	3112	sufficient. Instead, move its prototype from the unqualified
	3113	@code{%code} to a @code{%code provides}:
	3114
	3115	@example
	3116	@group
	3117	%code top @{
	3118	#define _GNU_SOURCE
	3119	#include <stdio.h>
	3120	@}
	3121	@end group
	3122
	3123	@group
	3124	%code requires @{
	3125	#include "ptypes.h"
	3126	@}
	3127	@end group
	3128	@group
	3129	%union @{
	3130	long int n;
	3131	tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
	3132	@}
	3133	@end group
	3134
	3135	@group
	3136	%code requires @{
	3137	#define YYLTYPE YYLTYPE
	3138	typedef struct YYLTYPE
	3139	@{
	3140	int first_line;
	3141	int first_column;
	3142	int last_line;
	3143	int last_column;
	3144	char *filename;
	3145	@} YYLTYPE;
	3146	@}
	3147	@end group
	3148
	3149	@group
	3150	%code provides @{
	3151	void trace_token (enum yytokentype token, YYLTYPE loc);
	3152	@}
	3153	@end group
	3154
	3155	@group
	3156	%code @{
	3157	static void print_token_value (FILE *, int, YYSTYPE);
	3158	#define YYPRINT(F, N, L) print_token_value (F, N, L)
	3159	@}
	3160	@end group
	3161
	3162	@dots{}
	3163	@end example
	3164
	3165	@noindent
	3166	Bison will insert the @code{trace_token} prototype into both the
	3167	parser header file and the parser implementation file after the
	3168	definitions for @code{yytokentype}, @code{YYLTYPE}, and
	3169	@code{YYSTYPE}.
	3170
	3171	The above examples are careful to write directives in an order that
	3172	reflects the layout of the generated parser implementation and header
	3173	files: @code{%code top}, @code{%code requires}, @code{%code provides},
	3174	and then @code{%code}. While your grammar files may generally be
	3175	easier to read if you also follow this order, Bison does not require
	3176	it. Instead, Bison lets you choose an organization that makes sense
	3177	to you.
	3178
	3179	You may declare any of these directives multiple times in the grammar file.
	3180	In that case, Bison concatenates the contained code in declaration order.
	3181	This is the only way in which the position of one of these directives within
	3182	the grammar file affects its functionality.
	3183
	3184	The result of the previous two properties is greater flexibility in how you may
	3185	organize your grammar file.
	3186	For example, you may organize semantic-type-related directives by semantic
	3187	type:
	3188
	3189	@example
	3190	@group
	3191	%code requires @{ #include "type1.h" @}
	3192	%union @{ type1 field1; @}
	3193	%destructor @{ type1_free ($$); @} <field1>
	3194	%printer @{ type1_print (yyoutput, $$); @} <field1>
	3195	@end group
	3196
	3197	@group
	3198	%code requires @{ #include "type2.h" @}
	3199	%union @{ type2 field2; @}
	3200	%destructor @{ type2_free ($$); @} <field2>
	3201	%printer @{ type2_print (yyoutput, $$); @} <field2>
	3202	@end group
	3203	@end example
	3204
	3205	@noindent
	3206	You could even place each of the above directive groups in the rules section of
	3207	the grammar file next to the set of rules that uses the associated semantic
	3208	type.
	3209	(In the rules section, you must terminate each of those directives with a
	3210	semicolon.)
	3211	And you don't have to worry that some directive (like a @code{%union}) in the
	3212	definitions section is going to adversely affect their functionality in some
	3213	counter-intuitive manner just because it comes first.
	3214	Such an organization is not possible using @var{Prologue} sections.
	3215
	3216	This section has been concerned with explaining the advantages of the four
	3217	@var{Prologue} alternatives over the original Yacc @var{Prologue}.
	3218	However, in most cases when using these directives, you shouldn't need to
	3219	think about all the low-level ordering issues discussed here.
	3220	Instead, you should simply use these directives to label each block of your
	3221	code according to its purpose and let Bison handle the ordering.
	3222	@code{%code} is the most generic label.
	3223	Move code to @code{%code requires}, @code{%code provides}, or @code{%code top}
	3224	as needed.
	3225
	3226	@node Bison Declarations
	3227	@subsection The Bison Declarations Section
	3228	@cindex Bison declarations (introduction)
	3229	@cindex declarations, Bison (introduction)
	3230
	3231	The @var{Bison declarations} section contains declarations that define
	3232	terminal and nonterminal symbols, specify precedence, and so on.
	3233	In some simple grammars you may not need any declarations.
	3234	@xref{Declarations, ,Bison Declarations}.
	3235
	3236	@node Grammar Rules
	3237	@subsection The Grammar Rules Section
	3238	@cindex grammar rules section
	3239	@cindex rules section for grammar
	3240
	3241	The @dfn{grammar rules} section contains one or more Bison grammar
	3242	rules, and nothing else. @xref{Rules, ,Syntax of Grammar Rules}.
	3243
	3244	There must always be at least one grammar rule, and the first
	3245	@samp{%%} (which precedes the grammar rules) may never be omitted even
	3246	if it is the first thing in the file.
	3247
	3248	@node Epilogue
	3249	@subsection The epilogue
	3250	@cindex additional C code section
	3251	@cindex epilogue
	3252	@cindex C code, section for additional
	3253
	3254	The @var{Epilogue} is copied verbatim to the end of the parser
	3255	implementation file, just as the @var{Prologue} is copied to the
	3256	beginning. This is the most convenient place to put anything that you
	3257	want to have in the parser implementation file but which need not come
	3258	before the definition of @code{yyparse}. For example, the definitions
	3259	of @code{yylex} and @code{yyerror} often go here. Because C requires
	3260	functions to be declared before being used, you often need to declare
	3261	functions like @code{yylex} and @code{yyerror} in the Prologue, even
	3262	if you define them in the Epilogue. @xref{Interface, ,Parser
	3263	C-Language Interface}.
	3264
	3265	If the last section is empty, you may omit the @samp{%%} that separates it
	3266	from the grammar rules.
	3267
	3268	The Bison parser itself contains many macros and identifiers whose names
	3269	start with @samp{yy} or @samp{YY}, so it is a good idea to avoid using
	3270	any such names (except those documented in this manual) in the epilogue
	3271	of the grammar file.
	3272
	3273	@node Symbols
	3274	@section Symbols, Terminal and Nonterminal
	3275	@cindex nonterminal symbol
	3276	@cindex terminal symbol
	3277	@cindex token type
	3278	@cindex symbol
	3279
	3280	@dfn{Symbols} in Bison grammars represent the grammatical classifications
	3281	of the language.
	3282
	3283	A @dfn{terminal symbol} (also known as a @dfn{token type}) represents a
	3284	class of syntactically equivalent tokens. You use the symbol in grammar
	3285	rules to mean that a token in that class is allowed. The symbol is
	3286	represented in the Bison parser by a numeric code, and the @code{yylex}
	3287	function returns a token type code to indicate what kind of token has
	3288	been read. You don't need to know what the code value is; you can use
	3289	the symbol to stand for it.
	3290
	3291	A @dfn{nonterminal symbol} stands for a class of syntactically
	3292	equivalent groupings. The symbol name is used in writing grammar rules.
	3293	By convention, it should be all lower case.
	3294
	3295	Symbol names can contain letters, underscores, periods, and non-initial
	3296	digits and dashes. Dashes in symbol names are a GNU extension, incompatible
	3297	with POSIX Yacc. Periods and dashes make symbol names less convenient to
	3298	use with named references, which require brackets around such names
	3299	(@pxref{Named References}). Terminal symbols that contain periods or dashes
	3300	make little sense: since they are not valid symbols (in most programming
	3301	languages) they are not exported as token names.
	3302
	3303	There are three ways of writing terminal symbols in the grammar:
	3304
	3305	@itemize @bullet
	3306	@item
	3307	A @dfn{named token type} is written with an identifier, like an
	3308	identifier in C@. By convention, it should be all upper case. Each
	3309	such name must be defined with a Bison declaration such as
	3310	@code{%token}. @xref{Token Decl, ,Token Type Names}.
	3311
	3312	@item
	3313	@cindex character token
	3314	@cindex literal token
	3315	@cindex single-character literal
	3316	A @dfn{character token type} (or @dfn{literal character token}) is
	3317	written in the grammar using the same syntax used in C for character
	3318	constants; for example, @code{'+'} is a character token type. A
	3319	character token type doesn't need to be declared unless you need to
	3320	specify its semantic value data type (@pxref{Value Type, ,Data Types of
	3321	Semantic Values}), associativity, or precedence (@pxref{Precedence,
	3322	,Operator Precedence}).
	3323
	3324	By convention, a character token type is used only to represent a
	3325	token that consists of that particular character. Thus, the token
	3326	type @code{'+'} is used to represent the character @samp{+} as a
	3327	token. Nothing enforces this convention, but if you depart from it,
	3328	your program will confuse other readers.
	3329
	3330	All the usual escape sequences used in character literals in C can be
	3331	used in Bison as well, but you must not use the null character as a
	3332	character literal because its numeric code, zero, signifies
	3333	end-of-input (@pxref{Calling Convention, ,Calling Convention
	3334	for @code{yylex}}). Also, unlike Standard C, trigraphs have no
	3335	special meaning in Bison character literals, nor is backslash-newline
	3336	allowed.
	3337
	3338	@item
	3339	@cindex string token
	3340	@cindex literal string token
	3341	@cindex multicharacter literal
	3342	A @dfn{literal string token} is written like a C string constant; for
	3343	example, @code{"<="} is a literal string token. A literal string token
	3344	doesn't need to be declared unless you need to specify its semantic
	3345	value data type (@pxref{Value Type}), associativity, or precedence
	3346	(@pxref{Precedence}).
	3347
	3348	You can associate the literal string token with a symbolic name as an
	3349	alias, using the @code{%token} declaration (@pxref{Token Decl, ,Token
	3350	Declarations}). If you don't do that, the lexical analyzer has to
	3351	retrieve the token number for the literal string token from the
	3352	@code{yytname} table (@pxref{Calling Convention}).
	3353
	3354	@strong{Warning}: literal string tokens do not work in Yacc.
	3355
	3356	By convention, a literal string token is used only to represent a token
	3357	that consists of that particular string. Thus, you should use the token
	3358	type @code{"<="} to represent the string @samp{<=} as a token. Bison
	3359	does not enforce this convention, but if you depart from it, people who
	3360	read your program will be confused.
	3361
	3362	All the escape sequences used in string literals in C can be used in
	3363	Bison as well, except that you must not use a null character within a
	3364	string literal. Also, unlike Standard C, trigraphs have no special
	3365	meaning in Bison string literals, nor is backslash-newline allowed. A
	3366	literal string token must contain two or more characters; for a token
	3367	containing just one character, use a character token (see above).
	3368	@end itemize
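
As a rough sketch of the table lookup mentioned above for literal string
tokens, the following C function searches a table of token names for an
entry spelled with its surrounding double quotes. The table
@code{sample_names} and the function @code{find_string_token} are
hypothetical stand-ins written for this illustration; in a real parser the
table would be @code{yytname}, and @ref{Calling Convention} describes how
the result relates to what @code{yylex} returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a token-name table: literal string tokens
   are stored with their double quotes, named tokens without. */
static const char *const sample_names[] = {
    "$end", "error", "NUM", "\"<=\"", "\">=\"", "\"==\""
};

/* Return the index of the entry that is exactly TEXT wrapped in double
   quotes, or -1 if there is no such entry. */
int find_string_token(const char *const *names, int count, const char *text)
{
    assert(names != NULL && text != NULL);
    size_t len = strlen(text);
    for (int i = 0; i < count; i++) {
        const char *name = names[i];
        if (name != NULL
            && name[0] == '"'                      /* opening quote   */
            && strncmp(name + 1, text, len) == 0   /* same characters */
            && name[len + 1] == '"'                /* closing quote   */
            && name[len + 2] == '\0')              /* nothing after   */
            return i;
    }
    return -1;
}
```

The two trailing checks matter: without them a search for a prefix of a
longer token would match the longer entry.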
	3369
	3370	How you choose to write a terminal symbol has no effect on its
	3371	grammatical meaning. That depends only on where it appears in rules and
	3372	on when the lexical analyzer function @code{yylex} returns that symbol.
	3373
	3374	The value returned by @code{yylex} is always one of the terminal
	3375	symbols, except that a zero or negative value signifies end-of-input.
	3376	Whichever way you write the token type in the grammar rules, you write
	3377	it the same way in the definition of @code{yylex}. The numeric code
	3378	for a character token type is simply the positive numeric code of the
	3379	character, so @code{yylex} can use the identical value to generate the
	3380	requisite code, though you may need to convert it to @code{unsigned
	3381	char} to avoid sign-extension on hosts where @code{char} is signed.
	3382	Each named token type becomes a C macro in the parser implementation
	3383	file, so @code{yylex} can use the name to stand for the code. (This
	3384	is why periods don't make sense in terminal symbols.) @xref{Calling
	3385	Convention, ,Calling Convention for @code{yylex}}.
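
The following minimal sketch shows a @code{yylex} that returns every input
character as its own character token. The buffer handling
(@code{input_cursor}, @code{scanner_set_input}) is hypothetical and exists
only to make the fragment self-contained; the point is the conversion
through @code{unsigned char} and the zero returned at end of input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner state: a cursor into a NUL-terminated buffer. */
static const char *input_cursor = "";

void scanner_set_input(const char *text)
{
    assert(text != NULL);
    input_cursor = text;
}

/* Sketch of a lexical analyzer that treats each character as a token. */
int yylex(void)
{
    if (*input_cursor == '\0')
        return 0;                 /* zero signifies end-of-input */

    /* Convert through unsigned char so that characters with the high
       bit set stay positive even where plain char is signed. */
    return (unsigned char) *input_cursor++;
}
```

Without the cast, a byte such as 0xE9 would be returned as a negative
number on hosts where @code{char} is signed, and the parser would take it
for end-of-input.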
	3386
	3387	If @code{yylex} is defined in a separate file, you need to arrange for the
	3388	token-type macro definitions to be available there. Use the @samp{-d}
	3389	option when you run Bison, so that it will write these macro definitions
	3390	into a separate header file @file{@var{name}.tab.h} which you can include
	3391	in the other source files that need it. @xref{Invocation, ,Invoking Bison}.
	3392
	3393	If you want to write a grammar that is portable to any Standard C
	3394	host, you must use only nonnull character tokens taken from the basic
	3395	execution character set of Standard C@. This set consists of the ten
	3396	digits, the 52 lower- and upper-case English letters, and the
	3397	characters in the following C-language string:
	3398
	3399	@example
	3400	"\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_@{|@}~"
	3401	@end example
	3402
	3403	The @code{yylex} function and Bison must use a consistent character set
	3404	and encoding for character tokens. For example, if you run Bison in an
	3405	ASCII environment, but then compile and run the resulting
	3406	program in an environment that uses an incompatible character set like
	3407	EBCDIC, the resulting program may not work because the tables
	3408	generated by Bison will assume ASCII numeric values for
	3409	character tokens. It is standard practice for software distributions to
	3410	contain C source files that were generated by Bison in an
	3411	ASCII environment, so installers on platforms that are
	3412	incompatible with ASCII must rebuild those files before
	3413	compiling them.
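
To illustrate the dependency, here is a small sketch of a table indexed by
character code and filled in the way a tool running in an ASCII
environment would fill it. The names @code{host_uses_ascii_codes} and
@code{kind_of_char_token} are hypothetical, and the table is far simpler
than anything Bison generates; it only shows why the lookup goes wrong when
the compiling host assigns a different code to @samp{+}.

```c
#include <assert.h>

/* Return 1 if a few basic characters have their ASCII codes here. */
int host_uses_ascii_codes(void)
{
    return '+' == 43 && '0' == 48 && 'A' == 65 && 'a' == 97 && '\n' == 10;
}

enum { KIND_OTHER = 0, KIND_PLUS = 1 };

/* Hypothetical table built under the assumption that '+' is code 43.
   On a host where '+' has another code, slot 43 describes the wrong
   character and the lookup below silently gives the wrong answer. */
int kind_of_char_token(int code)
{
    static const unsigned char table[256] = { [43] = KIND_PLUS };
    assert(code >= 0 && code < 256);
    return table[code];
}
```

A build system could run a check like @code{host_uses_ascii_codes} to
decide whether pregenerated parser files are usable or must be rebuilt.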
	3414
	3415	The symbol @code{error} is a terminal symbol reserved for error recovery
	3416	(@pxref{Error Recovery}); you shouldn't use it for any other purpose.
	3417	In particular, @code{yylex} should never return this value. The default
	3418	value of the error token is 256, unless you explicitly assigned 256 to
	3419	one of your tokens with a @code{%token} declaration.
	3420
	3421	@node Rules
	3422	@section Grammar Rules
	3423
	3424	A Bison grammar is a list of rules.
	3425
	3426	@menu
	3427	* Rules Syntax:: Syntax of the rules.
	3428	* Empty Rules:: Symbols that can match the empty string.
	3429	* Recursion:: Writing recursive rules.
	3430	@end menu
	3431
	3432	@node Rules Syntax
	3433	@subsection Syntax of Grammar Rules
	3434	@cindex rule syntax
	3435	@cindex grammar rule syntax
	3436	@cindex syntax of grammar rules
	3437
	3438	A Bison grammar rule has the following general form:
	3439
	3440	@example
	3441	@var{result}: @var{components}@dots{};
	3442	@end example
	3443
	3444	@noindent
	3445	where @var{result} is the nonterminal symbol that this rule describes,
	3446	and @var{components} are various terminal and nonterminal symbols that
	3447	are put together by this rule (@pxref{Symbols}).
	3448
	3449	For example,
	3450
	3451	@example
	3452	exp: exp '+' exp;
	3453	@end example
	3454
	3455	@noindent
	3456	says that two groupings of type @code{exp}, with a @samp{+} token in between,
	3457	can be combined into a larger grouping of type @code{exp}.
	3458
	3459	White space in rules is significant only to separate symbols. You can add
	3460	extra white space as you wish.
	3461
	3462	Scattered among the components can be @var{actions} that determine
	3463	the semantics of the rule. An action looks like this:
	3464
	3465	@example
	3466	@{@var{C statements}@}
	3467	@end example
	3468
	3469	@noindent
	3470	@cindex braced code
	3471	This is an example of @dfn{braced code}, that is, C code surrounded by
	3472	braces, much like a compound statement in C@. Braced code can contain
	3473	any sequence of C tokens, so long as its braces are balanced. Bison
	3474	does not check the braced code for correctness directly; it merely
	3475	copies the code to the parser implementation file, where the C
	3476	compiler can check it.
	3477
	3478	Within braced code, the balanced-brace count is not affected by braces
	3479	within comments, string literals, or character constants, but it is
	3480	affected by the C digraphs @samp{<%} and @samp{%>} that represent
	3481	braces. At the top level braced code must be terminated by @samp{@}}
	3482	and not by a digraph. Bison does not look for trigraphs, so if braced
	3483	code uses trigraphs you should ensure that they do not affect the
	3484	nesting of braces or the boundaries of comments, string literals, or
	3485	character constants.
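
The brace-counting rules above can be sketched in C as follows. This is an
illustration only, not Bison's scanner: the function name
@code{braced_code_end} is hypothetical, trigraphs are ignored just as the
text describes, and treating @samp{//} as a comment introducer is an
assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* TEXT starts with '{'.  Return the index just past the '}' that closes
   it, or 0 if the braces never balance or the top level is closed by a
   digraph instead of '}'. */
size_t braced_code_end(const char *text)
{
    assert(text != NULL && text[0] == '{');
    int depth = 0;
    size_t i = 0;
    while (text[i] != '\0') {
        char c = text[i];
        if (c == '"' || c == '\'') {
            /* Skip a string literal or character constant. */
            char quote = c;
            i++;
            while (text[i] != '\0' && text[i] != quote) {
                if (text[i] == '\\' && text[i + 1] != '\0')
                    i++;                    /* skip the escaped character */
                i++;
            }
            if (text[i] == '\0')
                return 0;
            i++;                            /* closing quote */
        } else if (c == '/' && text[i + 1] == '*') {
            /* Skip a block comment. */
            i += 2;
            while (text[i] != '\0' && !(text[i] == '*' && text[i + 1] == '/'))
                i++;
            if (text[i] == '\0')
                return 0;
            i += 2;
        } else if (c == '/' && text[i + 1] == '/') {
            /* Skip a line comment (assumption of this sketch). */
            while (text[i] != '\0' && text[i] != '\n')
                i++;
        } else if (c == '{' || (c == '<' && text[i + 1] == '%')) {
            depth++;                        /* brace or digraph opens */
            i += (c == '{') ? 1 : 2;
        } else if (c == '}' || (c == '%' && text[i + 1] == '>')) {
            depth--;                        /* brace or digraph closes */
            i += (c == '}') ? 1 : 2;
            if (depth == 0)
                return (c == '}') ? i : 0;  /* top level needs '}' */
        } else {
            i++;
        }
    }
    return 0;
}
```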
	3486
	3487	Usually there is only one action and it follows the components.
	3488	@xref{Actions}.
	3489
	3490	@findex |
	3491	Multiple rules for the same @var{result} can be written separately or can
	3492	be joined with the vertical-bar character @samp{|} as follows:
	3493
	3494	@example
	3495	@group
	3496	@var{result}:
	3497	@var{rule1-components}@dots{}
	3498	| @var{rule2-components}@dots{}
	3499	@dots{}
	3500	;
	3501	@end group
	3502	@end example
	3503
	3504	@noindent
	3505	They are still considered distinct rules even when joined in this way.
	3506
	3507	@node Empty Rules
	3508	@subsection Empty Rules
	3509	@cindex empty rule
	3510	@cindex rule, empty
	3511	@findex %empty
	3512
	3513	A rule is said to be @dfn{empty} if its right-hand side (@var{components})
	3514	is empty. It means that @var{result} can match the empty string. For
	3515	example, here is how to define an optional semicolon:
	3516
	3517	@example
	3518	semicolon.opt: | ";";
	3519	@end example
	3520
	3521	@noindent
	3522	It is easy to overlook an empty rule, especially when @code{|} is used. The
	3523	@code{%empty} directive lets you make explicit that a rule is empty on
	3524	purpose:
	3525
	3526	@example
	3527	@group
	3528	semicolon.opt:
	3529	%empty
	3530	| ";"
	3531	;
	3532	@end group
	3533	@end example
	3534
	3535	Flagging a non-empty rule with @code{%empty} is an error. If run with
	3536	@option{-Wempty-rule}, @command{bison} will report empty rules without
	3537	@code{%empty}. Using @code{%empty} enables this warning, unless
	3538	@option{-Wno-empty-rule} was specified.
	3539
	3540	The @code{%empty} directive is a Bison extension; it does not work with
	3541	Yacc. To remain compatible with POSIX Yacc, it is customary to write a
	3542	comment @samp{/* empty */} in each rule with no components:
	3543
	3544	@example
	3545	@group
	3546	semicolon.opt:
	3547	/* empty */
	3548	| ";"
	3549	;
	3550	@end group
	3551	@end example
	3552
	3553
	3554	@node Recursion
	3555	@subsection Recursive Rules
	3556	@cindex recursive rule
	3557	@cindex rule, recursive
	3558
	3559	A rule is called @dfn{recursive} when its @var{result} nonterminal
	3560	appears also on its right hand side. Nearly all Bison grammars need to
	3561	use recursion, because that is the only way to define a sequence of any
	3562	number of a particular thing. Consider this recursive definition of a
	3563	comma-separated sequence of one or more expressions:
	3564
	3565	@example
	3566	@group
	3567	expseq1:
	3568	exp
	3569	| expseq1 ',' exp
	3570	;
	3571	@end group
	3572	@end example
	3573
	3574	@cindex left recursion
	3575	@cindex right recursion
	3576	@noindent
	3577	Since the recursive use of @code{expseq1} is the leftmost symbol in the
	3578	right hand side, we call this @dfn{left recursion}. By contrast, here
	3579	the same construct is defined using @dfn{right recursion}:
	3580
	3581	@example
	3582	@group
	3583	expseq1:
	3584	exp
	3585	| exp ',' expseq1
	3586	;
	3587	@end group
	3588	@end example
	3589
	3590	@noindent
	3591	Any kind of sequence can be defined using either left recursion or right
	3592	recursion, but you should always use left recursion, because it can
	3593	parse a sequence of any number of elements with bounded stack space.
	3594	Right recursion uses up space on the Bison stack in proportion to the
	3595	number of elements in the sequence, because all the elements must be
	3596	shifted onto the stack before the rule can be applied even once.
	3597	@xref{Algorithm, ,The Bison Parser Algorithm}, for further explanation
	3598	of this.
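
The difference in stack usage can be made concrete with a simple counting
model. This sketch is not Bison's parser: it treats each @code{exp} as a
single stack entry and only counts how many entries are on the stack after
each shift, and the function names @code{peak_depth_left} and
@code{peak_depth_right} are hypothetical.

```c
#include <assert.h>

static int max_int(int a, int b) { return a > b ? a : b; }

/* Peak depth for  expseq1: exp | expseq1 ',' exp ;  (left recursion).
   Each element is reduced as soon as it has been shifted. */
int peak_depth_left(int n)
{
    assert(n >= 1);
    int depth = 0, peak = 0;
    for (int i = 0; i < n; i++) {
        if (i > 0)
            peak = max_int(peak, ++depth);  /* shift ','             */
        peak = max_int(peak, ++depth);      /* shift exp             */
        depth = 1;                          /* reduce to one expseq1 */
    }
    return peak;
}

/* Peak depth for  expseq1: exp | exp ',' expseq1 ;  (right recursion).
   Nothing can be reduced until the last element has been shifted. */
int peak_depth_right(int n)
{
    assert(n >= 1);
    int depth = 0, peak = 0;
    for (int i = 0; i < n; i++) {
        if (i > 0)
            peak = max_int(peak, ++depth);  /* shift ',' */
        peak = max_int(peak, ++depth);      /* shift exp */
    }
    /* Only now do the reductions run, from the right end: the last exp
       becomes an expseq1 (depth unchanged), then each
       exp ',' expseq1 -> expseq1  step removes two entries. */
    while (depth > 1)
        depth -= 2;
    assert(depth == 1);
    return peak;
}
```

In this model the left-recursive form never needs more than three
entries, while the right-recursive form needs 2@var{n}-1 entries for
@var{n} elements.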
	3599
	3600	@cindex mutual recursion
	3601	@dfn{Indirect} or @dfn{mutual} recursion occurs when the result of the
	3602	rule does not appear directly on its right hand side, but does appear
	3603	in rules for other nonterminals which do appear on its right hand
	3604	side.
	3605
	3606	For example:
	3607
	3608	@example
	3609	@group
	3610	expr:
	3611	primary
	3612	| primary '+' primary
	3613	;
	3614	@end group
	3615
	3616	@group
	3617	primary:
	3618	constant
	3619	| '(' expr ')'
	3620	;
	3621	@end group
	3622	@end example
	3623
	3624	@noindent
	3625	defines two mutually-recursive nonterminals, since each refers to the
	3626	other.
	3627
	3628	@node Semantics
	3629	@section Defining Language Semantics
	3630	@cindex defining language semantics
	3631	@cindex language semantics, defining
	3632
	3633	The grammar rules for a language determine only the syntax. The semantics
	3634	are determined by the semantic values associated with various tokens and
	3635	groupings, and by the actions taken when various groupings are recognized.
	3636
	3637	For example, the calculator calculates properly because the value
	3638	associated with each expression is the proper number; it adds properly
	3639	because the action for the grouping @w{@samp{@var{x} + @var{y}}} is to add
	3640	the numbers associated with @var{x} and @var{y}.
	3641
	3642	@menu
	3643	* Value Type:: Specifying one data type for all semantic values.
	3644	* Multiple Types:: Specifying several alternative data types.
	3645	* Type Generation:: Generating the semantic value type.
	3646	* Union Decl:: Declaring the set of all semantic value types.
	3647	* Structured Value Type:: Providing a structured semantic value type.
	3648	* Actions:: An action is the semantic definition of a grammar rule.
	3649	* Action Types:: Specifying data types for actions to operate on.
	3650	* Mid-Rule Actions:: Most actions go at the end of a rule.
	3651	This says when, why and how to use the exceptional
	3652	action in the middle of a rule.
	3653	@end menu
	3654
	3655	@node Value Type
	3656	@subsection Data Types of Semantic Values
	3657	@cindex semantic value type
	3658	@cindex value type, semantic
	3659	@cindex data types of semantic values
	3660	@cindex default data type
	3661
	3662	In a simple program it may be sufficient to use the same data type for
	3663	the semantic values of all language constructs. This was true in the
	3664	RPN and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish
	3665	Notation Calculator}).
	3666
	3667	Bison normally uses the type @code{int} for semantic values if your
	3668	program uses the same data type for all language constructs. To
	3669	specify some other type, define the @code{%define} variable
	3670	@code{api.value.type} like this:
	3671
	3672	@example
	3673	%define api.value.type @{double@}
	3674	@end example
	3675
	3676	@noindent
	3677	or
	3678
	3679	@example
	3680	%define api.value.type @{struct semantic_type@}
	3681	@end example
	3682
	3683	The value of @code{api.value.type} should be a type name that does not
	3684	contain parentheses or square brackets.
	3685
	3686	Alternatively, instead of relying on Bison's @code{%define} support, you may
	3687	rely on the C/C++ preprocessor and define @code{YYSTYPE} as a macro, like
	3688	this:
	3689
	3690	@example
	3691	#define YYSTYPE double
	3692	@end example
	3693
	3694	@noindent
	3695	This macro definition must go in the prologue of the grammar file
	3696	(@pxref{Grammar Outline, ,Outline of a Bison Grammar}). If compatibility
	3697	with POSIX Yacc matters to you, use this. Note however that Bison cannot
	3698	know @code{YYSTYPE}'s value, not even whether it is defined, so there are
	3699	services it cannot provide. Besides, this works only for languages that have
	3700	a preprocessor.
	3701
	3702	@node Multiple Types
	3703	@subsection More Than One Value Type
	3704
	3705	In most programs, you will need different data types for different kinds
	3706	of tokens and groupings. For example, a numeric constant may need type
	3707	@code{int} or @code{long int}, while a string constant needs type
	3708	@code{char *}, and an identifier might need a pointer to an entry in the
	3709	symbol table.
	3710
	3711	To use more than one data type for semantic values in one parser, Bison
	3712	requires you to do two things:
	3713
	3714	@itemize @bullet
	3715	@item
	3716	Specify the entire collection of possible data types. There are several
	3717	options:
	3718	@itemize @bullet
	3719	@item
	3720	let Bison compute the union type from the tags you assign to symbols;
	3721
	3722	@item
	3723	use the @code{%union} Bison declaration (@pxref{Union Decl, ,The Union
	3724	Declaration});
	3725
	3726	@item
	3727	define the @code{%define} variable @code{api.value.type} to be a union type
	3728	whose members are the type tags (@pxref{Structured Value Type,, Providing a
	3729	Structured Semantic Value Type});
	3730
	3731	@item
	3732	use a @code{typedef} or a @code{#define} to define @code{YYSTYPE} to be a
	3733	union type whose member names are the type tags.
	3734	@end itemize
	3735
	3736	@item
	3737	Choose one of those types for each symbol (terminal or nonterminal) for
	3738	which semantic values are used. This is done for tokens with the
	3739	@code{%token} Bison declaration (@pxref{Token Decl, ,Token Type Names})
	3740	and for groupings with the @code{%type} Bison declaration (@pxref{Type
	3741	Decl, ,Nonterminal Symbols}).
	3742	@end itemize
	3743
	3744	@node Type Generation
	3745	@subsection Generating the Semantic Value Type
	3746	@cindex declaring value types
	3747	@cindex value types, declaring
	3748	@findex %define api.value.type union
	3749
	3750	The special value @code{union} of the @code{%define} variable
	3751	@code{api.value.type} instructs Bison that the tags used with the
	3752	@code{%token} and @code{%type} directives are genuine types, not names of
	3753	members of @code{YYSTYPE}.
	3754
	3755	For example:
	3756
	3757	@example
	3758	%define api.value.type union
	3759	%token <int> INT "integer"
	3760	%token <int> 'n'
	3761	%type <int> expr
	3762	%token <char const *> ID "identifier"
	3763	@end example
	3764
	3765	@noindent
	3766	generates an appropriate value of @code{YYSTYPE} to support each symbol
	3767	type. The name of the member of @code{YYSTYPE} for tokens that have a
	3768	declared identifier @var{id} (such as @code{INT} and @code{ID} above, but
	3769	not @code{'n'}) is @code{@var{id}}. The other symbols have unspecified
	3770	names on which you should not depend; instead, rely on C casts to access
	3771	the semantic value with the appropriate type:
	3772
	3773	@example
	3774	/* For an "integer". */
	3775	yylval.INT = 42;
	3776	return INT;
	3777
	3778	/* For an 'n', also declared as int. */
	3779	*((int *) &yylval) = 42;
	3780	return 'n';
	3781
	3782	/* For an "identifier". */
	3783	yylval.ID = "42";
	3784	return ID;
	3785	@end example
	3786
	3787	If the @code{%define} variable @code{api.token.prefix} is defined
	3788	(@pxref{%define Summary,,api.token.prefix}), then it is also used to prefix
	3789	the union member names. For instance, with @samp{%define api.token.prefix
	3790	@{TOK_@}}:
	3791
	3792	@example
	3793	/* For an "integer". */
	3794	yylval.TOK_INT = 42;
	3795	return TOK_INT;
	3796	@end example
	3797
	3798	This Bison extension cannot work if @code{%yacc} (or
	3799	@option{-y}/@option{--yacc}) is enabled, as POSIX mandates that Yacc
	3800	generate tokens as macros (e.g., @samp{#define INT 258}, or @samp{#define
	3801	TOK_INT 258}).
	3802
	3803	This feature is new, and user feedback would be most welcome.
	3804
	3805	A similar feature is provided for C++; in addition, it overcomes the C++
	3806	restriction that forbids non-trivial objects as members of a @code{union}:
	3807	@samp{%define api.value.type variant}, see @ref{C++ Variants}.
	3808
	3809	@node Union Decl
	3810	@subsection The Union Declaration
	3811	@cindex declaring value types
	3812	@cindex value types, declaring
	3813	@findex %union
	3814
	3815	The @code{%union} declaration specifies the entire collection of possible
	3816	data types for semantic values. The keyword @code{%union} is followed by
	3817	braced code containing the same thing that goes inside a @code{union} in C@.
	3818
	3819	For example:
	3820
	3821	@example
	3822	@group
	3823	%union @{
	3824	double val;
	3825	symrec *tptr;
	3826	@}
	3827	@end group
	3828	@end example
	3829
	3830	@noindent
	3831	This says that the two alternative types are @code{double} and @code{symrec
	3832	*}. They are given names @code{val} and @code{tptr}; these names are used
	3833	in the @code{%token} and @code{%type} declarations to pick one of the types
	3834	for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
	3835
	3836	As an extension to POSIX, a tag is allowed after the @code{%union}. For
	3837	example:
	3838
	3839	@example
	3840	@group
	3841	%union value @{
	3842	double val;
	3843	symrec *tptr;
	3844	@}
	3845	@end group
	3846	@end example
	3847
	3848	@noindent
	3849	specifies the union tag @code{value}, so the corresponding C type is
	3850	@code{union value}. If you do not specify a tag, it defaults to
	3851	@code{YYSTYPE}.
	3852
	3853	As another extension to POSIX, you may specify multiple @code{%union}
	3854	declarations; their contents are concatenated. However, only the first
	3855	@code{%union} declaration can specify a tag.
	3856
	3857	Note that, unlike making a @code{union} declaration in C, you need not write
	3858	a semicolon after the closing brace.
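
In C terms, the tagged declaration shown above corresponds roughly to the
hand-written definitions below. This is a sketch rather than Bison's
literal output: the contents of @code{symrec} and the helper functions
@code{make_number} and @code{make_symbol} are hypothetical, added only so
that the fragment is complete.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol-table record; the union only needs a pointer. */
typedef struct symrec {
    const char *name;
    double value;
} symrec;

/* The collection of types described by
   %union value { double val; symrec *tptr; }  */
union value {
    double val;      /* chosen with the tag <val>  */
    symrec *tptr;    /* chosen with the tag <tptr> */
};
typedef union value YYSTYPE;

/* Build a semantic value holding a number, as a scanner might. */
YYSTYPE make_number(double number)
{
    YYSTYPE v;
    v.val = number;
    return v;
}

/* Build a semantic value holding a symbol-table pointer. */
YYSTYPE make_symbol(symrec *record)
{
    assert(record != NULL);
    YYSTYPE v;
    v.tptr = record;
    return v;
}
```

Note the semicolon after the closing brace of the C @code{union}, which
the @code{%union} declaration itself does not need.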
	3859
	3860	@node Structured Value Type
	3861	@subsection Providing a Structured Semantic Value Type
	3862	@cindex declaring value types
	3863	@cindex value types, declaring
	3864	@findex %union
	3865
	3866	Instead of @code{%union}, you can define and use your own union type
	3867	@code{YYSTYPE} if your grammar contains at least one @samp{<@var{type}>}
	3868	tag. For example, you can put the following into a header file
	3869	@file{parser.h}:
	3870
	3871	@example
	3872	@group
	3873	union YYSTYPE @{
	3874	double val;
	3875	symrec *tptr;
	3876	@};
	3877	@end group
	3878	@end example
	3879
	3880	@noindent
	3881	and then your grammar can use the following instead of @code{%union}:
	3882
	3883	@example
	3884	@group
	3885	%@{
	3886	#include "parser.h"
	3887	%@}
	3888	%define api.value.type @{union YYSTYPE@}
	3889	%type <val> expr
	3890	%token <tptr> ID
	3891	@end group
	3892	@end example
	3893
	3894	Actually, you may also provide a @code{struct} rather than a @code{union},
	3895	which may be handy if you want to track information for every symbol (such
	3896	as preceding comments).
	3897
	3898	The type you provide may even be structured and include pointers, in which
	3899	case the type tags you provide may be composite, with @samp{.} and @samp{->}
	3900	operators.
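
To make the idea of composite tags concrete, here is a hedged C sketch.
The struct layout, the member names and the @code{SLOT} macro are all
invented for this illustration. The macro imitates the effect of a type
tag by appending the tag text to an expression of the value type, which is
why tags containing @samp{.} or @samp{->} can work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-symbol information tracked for every value. */
struct location_info {
    int line;
};

/* Hypothetical structured semantic value type. */
struct semantic_type {
    struct location_info *where;  /* reached with a tag like where->line */
    struct {
        int ival;                 /* reached with a tag like num.ival */
        double dval;              /* reached with a tag like num.dval */
    } num;
};

/* Append TAG to VALUE, as selecting a member by tag would. */
#define SLOT(value, tag) ((value).tag)

/* Read the source line recorded for a semantic value. */
int line_of(struct semantic_type value)
{
    assert(value.where != NULL);
    return SLOT(value, where->line);
}
```

With such a type, a declaration along the lines of
@samp{%type <num.ival> expr} would select the nested integer member.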
	3901
	3902	@node Actions
	3903	@subsection Actions
	3904	@cindex action
	3905	@vindex $$
	3906	@vindex $@var{n}
	3907	@vindex $@var{name}
	3908	@vindex $[@var{name}]
	3909
	3910	An action accompanies a syntactic rule and contains C code to be executed
	3911	each time an instance of that rule is recognized. The task of most actions
	3912	is to compute a semantic value for the grouping built by the rule from the
	3913	semantic values associated with tokens or smaller groupings.
	3914
	3915	An action consists of braced code containing C statements, and can be
	3916	placed at any position in the rule;
	3917	it is executed at that position. Most rules have just one action at the
	3918	end of the rule, following all the components. Actions in the middle of
	3919	a rule are tricky and used only for special purposes (@pxref{Mid-Rule
	3920	Actions, ,Actions in Mid-Rule}).
	3921
	3922	The C code in an action can refer to the semantic values of the
	3923	components matched by the rule with the construct @code{$@var{n}},
	3924	which stands for the value of the @var{n}th component. The semantic
	3925	value for the grouping being constructed is @code{$$}. In addition,
	3926	the semantic values of symbols can be accessed with the named
	3927	references construct @code{$@var{name}} or @code{$[@var{name}]}.
	3928	Bison translates both of these constructs into expressions of the
	3929	appropriate type when it copies the actions into the parser
	3930	implementation file. @code{$$} (or @code{$@var{name}}, when it stands
	3931	for the current grouping) is translated to a modifiable lvalue, so it
	3932	can be assigned to.
	3933
	3934	Here is a typical example:
	3935
	3936	@example
	3937	@group
	3938	exp:
	3939	@dots{}
	3940	| exp '+' exp @{ $$ = $1 + $3; @}
	3941	@end group
	3942	@end example
	3943
	3944	Or, in terms of named references:
	3945
	3946	@example
	3947	@group
	3948	exp[result]:
	3949	@dots{}
	3950	| exp[left] '+' exp[right] @{ $result = $left + $right; @}
	3951	@end group
	3952	@end example
	3953
	3954	@noindent
	3955	This rule constructs an @code{exp} from two smaller @code{exp} groupings
	3956	connected by a plus-sign token. In the action, @code{$1} and @code{$3}
	3957	(@code{$left} and @code{$right})
	3958	refer to the semantic values of the two component @code{exp} groupings,
	3959	which are the first and third symbols on the right hand side of the rule.
	3960	The sum is stored into @code{$$} (@code{$result}) so that it becomes the
	3961	semantic value of
	3962	the addition-expression just recognized by the rule. If there were a
	3963	useful semantic value associated with the @samp{+} token, it could be
	3964	referred to as @code{$2}.
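
The effect of this action can be pictured with a small hand-written model
of a value stack. It is a sketch under simplifying assumptions, not the
code Bison generates: the names @code{value_stack}, @code{push_value} and
@code{reduce_addition} are hypothetical, every value is a @code{double},
and only the addition rule is modeled.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 64

/* Hypothetical value stack: one semantic value per symbol. */
struct value_stack {
    double slot[STACK_MAX];
    int count;
};

void push_value(struct value_stack *s, double v)
{
    assert(s != NULL && s->count < STACK_MAX);
    s->slot[s->count++] = v;
}

/* Model of reducing  exp: exp '+' exp  with the action  $$ = $1 + $3;  */
void reduce_addition(struct value_stack *s)
{
    assert(s != NULL && s->count >= 3);
    double *rhs = &s->slot[s->count - 3];  /* rhs[0] is $1, rhs[2] is $3 */
    double result = rhs[0] + rhs[2];       /* what the action computes   */
    s->count -= 3;                         /* pop the three components   */
    push_value(s, result);                 /* $$ is the value of the exp */
}
```

The middle slot, corresponding to @code{$2}, is present on the stack even
though this action never reads it.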
	3965
	3966	@xref{Named References}, for more information about using the named
	3967	references construct.
	3968
	3969	Note that the vertical-bar character @samp{|} is really a rule
	3970	separator, and actions are attached to a single rule. This differs
	3971	from tools like Flex, for which @samp{|} stands for either
	3972	``or'', or ``the same action as that of the next rule''. In the
	3973	following example, the action is triggered only when @samp{b} is found:
	3974
	3975	@example
	3976	a-or-b: 'a'|'b' @{ a_or_b_found = 1; @};
	3977	@end example
	3978
	3979	@cindex default action
	3980	If you don't specify an action for a rule, Bison supplies a default:
	3981	@w{@code{$$ = $1}.} Thus, the value of the first symbol in the rule
	3982	becomes the value of the whole rule. Of course, the default action is
	3983	valid only if the two data types match. There is no meaningful default
	3984	action for an empty rule; every empty rule must have an explicit action
	3985	unless the rule's value does not matter.
	3986
	3987	@code{$@var{n}} with @var{n} zero or negative is allowed for reference
	3988	to tokens and groupings on the stack @emph{before} those that match the
	3989	current rule. This is a very risky practice, and to use it reliably
	3990	you must be certain of the context in which the rule is applied. Here
	3991	is a case in which you can use this reliably:
	3992
	3993	@example
	3994	@group
	3995	foo:
	3996	expr bar '+' expr @{ @dots{} @}
	3997	| expr bar '-' expr @{ @dots{} @}
	3998	;
	3999	@end group
	4000
	4001	@group
	4002	bar:
	4003	%empty @{ previous_expr = $0; @}
	4004	;
	4005	@end group
	4006	@end example
	4007
	4008	As long as @code{bar} is used only in the fashion shown here, @code{$0}
	4009	always refers to the @code{expr} which precedes @code{bar} in the
	4010	definition of @code{foo}.
	4011
	4012	@vindex yylval
	4013	It is also possible to access the semantic value of the lookahead token, if
	4014	any, from a semantic action.
	4015	This semantic value is stored in @code{yylval}.
	4016	@xref{Action Features, ,Special Features for Use in Actions}.
	4017
	4018	@node Action Types
	4019	@subsection Data Types of Values in Actions
	4020	@cindex action data types
	4021	@cindex data types in actions
	4022
	4023	If you have chosen a single data type for semantic values, the @code{$$}
	4024	and @code{$@var{n}} constructs always have that data type.
	4025
	4026	If you have used @code{%union} to specify a variety of data types, then you
	4027	must declare a choice among these types for each terminal or nonterminal
	4028	symbol that can have a semantic value. Then each time you use @code{$$} or
	4029	@code{$@var{n}}, its data type is determined by which symbol it refers to
	4030	in the rule. In this example,
	4031
	4032	@example
	4033	@group
	4034	exp:
	4035	@dots{}
	4036	| exp '+' exp @{ $$ = $1 + $3; @}
	4037	@end group
	4038	@end example
	4039
	4040	@noindent
	4041	@code{$1} and @code{$3} refer to instances of @code{exp}, so they all
	4042	have the data type declared for the nonterminal symbol @code{exp}. If
	4043	@code{$2} were used, it would have the data type declared for the
	4044	terminal symbol @code{'+'}, whatever that might be.
	4045
	4046	Alternatively, you can specify the data type when you refer to the value,
	4047	by inserting @samp{<@var{type}>} after the @samp{$} at the beginning of the
	4048	reference. For example, if you have defined types as shown here:
	4049
	4050	@example
	4051	@group
	4052	%union @{
	4053	int itype;
	4054	double dtype;
	4055	@}
	4056	@end group
	4057	@end example
	4058
	4059	@noindent
	4060	then you can write @code{$<itype>1} to refer to the first subunit of the
	4061	rule as an integer, or @code{$<dtype>1} to refer to it as a double.
	4062
	4063	@node Mid-Rule Actions
	4064	@subsection Actions in Mid-Rule
	4065	@cindex actions in mid-rule
	4066	@cindex mid-rule actions
	4067
	4068	Occasionally it is useful to put an action in the middle of a rule.
	4069	These actions are written just like usual end-of-rule actions, but they
	4070	are executed before the parser even recognizes the following components.
	4071
	4072	@menu
	4073	* Using Mid-Rule Actions:: Putting an action in the middle of a rule.
	4074	* Mid-Rule Action Translation:: How mid-rule actions are actually processed.
	4075	* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts.
	4076	@end menu
	4077
	4078	@node Using Mid-Rule Actions
	4079	@subsubsection Using Mid-Rule Actions
	4080
	4081	A mid-rule action may refer to the components preceding it using
	4082	@code{$@var{n}}, but it may not refer to subsequent components because
	4083	it is run before they are parsed.
	4084
	4085	The mid-rule action itself counts as one of the components of the rule.
	4086	This makes a difference when there is another action later in the same rule
	4087	(and usually there is another at the end): you have to count the actions
	4088	along with the symbols when working out which number @var{n} to use in
	4089	@code{$@var{n}}.
	4090
	4091	The mid-rule action can also have a semantic value. The action can set
	4092	its value with an assignment to @code{$$}, and actions later in the rule
	4093	can refer to the value using @code{$@var{n}}. Since there is no symbol
	4094	to name the action, there is no way to declare a data type for the value
	4095	in advance, so you must use the @samp{$<@dots{}>@var{n}} construct to
	4096	specify a data type each time you refer to this value.
	4097
	4098	There is no way to set the value of the entire rule with a mid-rule
	4099	action, because assignments to @code{$$} do not have that effect. The
	4100	only way to set the value for the entire rule is with an ordinary action
	4101	at the end of the rule.
	4102
	4103	Here is an example from a hypothetical compiler, handling a @code{let}
	4104	statement that looks like @samp{let (@var{variable}) @var{statement}} and
	4105	serves to create a variable named @var{variable} temporarily for the
	4106	duration of @var{statement}. To parse this construct, we must put
	4107	@var{variable} into the symbol table while @var{statement} is parsed, then
	4108	remove it afterward. Here is how it is done:
	4109
	4110	@example
	4111	@group
	4112	stmt:
	4113	"let" '(' var ')'
	4114	@{
	4115	$<context>$ = push_context ();
	4116	declare_variable ($3);
	4117	@}
	4118	stmt
	4119	@{
	4120	$$ = $6;
	4121	pop_context ($<context>5);
	4122	@}
	4123	@end group
	4124	@end example
	4125
	4126	@noindent
	4127	As soon as @samp{let (@var{variable})} has been recognized, the first
	4128	action is run. It saves a copy of the current semantic context (the
	4129	list of accessible variables) as its semantic value, using alternative
	4130	@code{context} in the data-type union. Then it calls
	4131	@code{declare_variable} to add the new variable to that list. Once the
	4132	first action is finished, the embedded statement @code{stmt} can be
	4133	parsed.
	4134
	4135	Note that the mid-rule action is component number 5, so the @samp{stmt} is
	4136	component number 6. Named references can be used to improve the readability
	4137	and maintainability (@pxref{Named References}):
	4138
	4139	@example
	4140	@group
	4141	stmt:
	4142	"let" '(' var ')'
	4143	@{
	4144	$<context>let = push_context ();
	4145	declare_variable ($3);
	4146	@}[let]
	4147	stmt
	4148	@{
	4149	$$ = $6;
	4150	pop_context ($<context>let);
	4151	@}
	4152	@end group
	4153	@end example
	4154
	4155	After the embedded statement is parsed, its semantic value becomes the
	4156	value of the entire @code{let}-statement. Then the semantic value from the
	4157	earlier action is used to restore the prior list of variables. This
	4158	removes the temporary @code{let}-variable from the list so that it won't
	4159	appear to exist while the rest of the program is parsed.
	4160
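The helper functions used in this example are left to the grammar author.
One possible shape for them is sketched below, assuming that a context is
simply the head of a linked list of visible variables and that
@code{declare_variable} receives the variable name as a string. The types
@code{variable} and @code{context} and the function @code{is_declared} are
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol table: a linked list of visible variable names.
   A context is the list head at the time it was saved.  */
typedef struct variable
{
  const char *name;
  struct variable *next;
} variable;

typedef variable *context;

static variable *visible = NULL;

/* Save the current list of accessible variables.  */
context
push_context (void)
{
  return visible;
}

/* Add NAME to the front of the list of accessible variables.  */
void
declare_variable (const char *name)
{
  variable *v = malloc (sizeof *v);
  assert (v != NULL);
  v->name = name;
  v->next = visible;
  visible = v;
}

/* Restore SAVED, freeing every variable declared since it was taken.  */
void
pop_context (context saved)
{
  while (visible != saved)
    {
      variable *v = visible;
      assert (v != NULL);
      visible = v->next;
      free (v);
    }
}

/* Return 1 if NAME is currently accessible, 0 otherwise.  */
int
is_declared (const char *name)
{
  for (variable *v = visible; v != NULL; v = v->next)
    if (strcmp (v->name, name) == 0)
      return 1;
  return 0;
}
```

With this layout, the value stored by the mid-rule action is enough for the
final action to undo every declaration made inside the embedded statement.
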
	4161	@findex %destructor
	4162	@cindex discarded symbols, mid-rule actions
	4163	@cindex error recovery, mid-rule actions
	4164	In the above example, if the parser initiates error recovery (@pxref{Error
	4165	Recovery}) while parsing the tokens in the embedded statement @code{stmt},
	4166	it might discard the previous semantic context @code{$<context>5} without
	4167	restoring it.
	4168	Thus, @code{$<context>5} needs a destructor (@pxref{Destructor Decl, , Freeing
	4169	Discarded Symbols}).
	4170	However, Bison currently provides no means to declare a destructor specific to
	4171	a particular mid-rule action's semantic value.
	4172
	4173	One solution is to bury the mid-rule action inside a nonterminal symbol and to
	4174	declare a destructor for that symbol:
	4175
	4176	@example
	4177	@group
	4178	%type <context> let
	4179	%destructor @{ pop_context ($$); @} let
	4180	@end group
	4181
	4182	%%
	4183
	4184	@group
	4185	stmt:
	4186	let stmt
	4187	@{
	4188	$$ = $2;
	4189	pop_context ($let);
	4190	@};
	4191	@end group
	4192
	4193	@group
	4194	let:
	4195	"let" '(' var ')'
	4196	@{
	4197	$let = push_context ();
	4198	declare_variable ($3);
	4199	@};
	4200
	4201	@end group
	4202	@end example
	4203
	4204	@noindent
	4205	Note that the action is now at the end of its rule.
	4206	Any mid-rule action can be converted to an end-of-rule action in this way, and
	4207	this is what Bison actually does to implement mid-rule actions.
	4208
	4209	@node Mid-Rule Action Translation
	4210	@subsubsection Mid-Rule Action Translation
	4211	@vindex $@@@var{n}
	4212	@vindex @@@var{n}
	4213
	4214	As hinted earlier, mid-rule actions are actually transformed into regular
	4215	rules and actions. The various reports generated by Bison (textual,
	4216	graphical, etc., see @ref{Understanding, , Understanding Your Parser})
	4217	reveal this translation, best explained by means of an example. The
	4218	following rule:
	4219
	4220	@example
	4221	exp: @{ a(); @} "b" @{ c(); @} @{ d(); @} "e" @{ f(); @};
	4222	@end example
	4223
	4224	@noindent
	4225	is translated into:
	4226
	4227	@example
	4228	$@@1: %empty @{ a(); @};
	4229	$@@2: %empty @{ c(); @};
	4230	$@@3: %empty @{ d(); @};
	4231	exp: $@@1 "b" $@@2 $@@3 "e" @{ f(); @};
	4232	@end example
	4233
	4234	@noindent
	4235	with new nonterminal symbols @code{$@@@var{n}}, where @var{n} is a number.
	4236
	4237	A mid-rule action is expected to generate a value if it uses @code{$$}, or
	4238	the (final) action uses @code{$@var{n}} where @var{n} denotes the mid-rule
	4239	action. In that case its nonterminal is instead named @code{@@@var{n}}:
	4240
	4241	@example
	4242	exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @};
	4243	@end example
	4244
	4245	@noindent
	4246	is translated into
	4247
	4248	@example
	4249	@@1: %empty @{ a(); @};
	4250	@@2: %empty @{ $$ = c(); @};
	4251	$@@3: %empty @{ d(); @};
	4252	exp: @@1 "b" @@2 $@@3 "e" @{ f = $1; @};
	4253	@end example
	4254
	4255	There are probably two errors in the above example: the first mid-rule
	4256	action does not generate a value (it does not use @code{$$} although the
	4257	final action uses it), and the value of the second one is not used (the
	4258	final action does not use @code{$3}). Bison reports these errors when the
	4259	@code{midrule-value} warnings are enabled (@pxref{Invocation, ,Invoking
	4260	Bison}):
	4261
	4262	@example
	4263	$ bison -fcaret -Wmidrule-value mid.y
	4264	@group
	4265	mid.y:2.6-13: warning: unset value: $$
	4266	exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @};
	4267	^^^^^^^^
	4268	@end group
	4269	@group
	4270	mid.y:2.19-31: warning: unused value: $3
	4271	exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @};
	4272	^^^^^^^^^^^^^
	4273	@end group
	4274	@end example
	4275
	4276
	4277	@node Mid-Rule Conflicts
	4278	@subsubsection Conflicts due to Mid-Rule Actions
	4279	Taking action before a rule is completely recognized often leads to
	4280	conflicts since the parser must commit to a parse in order to execute the
	4281	action. For example, the following two rules, without mid-rule actions,
	4282	can coexist in a working parser because the parser can shift the open-brace
	4283	token and look at what follows before deciding whether there is a
	4284	declaration or not:
	4285
	4286	@example
	4287	@group
	4288	compound:
	4289	'@{' declarations statements '@}'
	4290	| '@{' statements '@}'
	4291	;
	4292	@end group
	4293	@end example
	4294
	4295	@noindent
	4296	But when we add a mid-rule action as follows, the rules become nonfunctional:
	4297
	4298	@example
	4299	@group
	4300	compound:
	4301	@{ prepare_for_local_variables (); @}
	4302	'@{' declarations statements '@}'
	4303	@end group
	4304	@group
	4305	| '@{' statements '@}'
	4306	;
	4307	@end group
	4308	@end example
	4309
	4310	@noindent
	4311	Now the parser is forced to decide whether to run the mid-rule action
	4312	when it has read no farther than the open-brace. In other words, it
	4313	must commit to using one rule or the other, without sufficient
	4314	information to do it correctly. (The open-brace token is what is called
	4315	the @dfn{lookahead} token at this time, since the parser is still
	4316	deciding what to do about it. @xref{Lookahead, ,Lookahead Tokens}.)
	4317
	4318	You might think that you could correct the problem by putting identical
	4319	actions into the two rules, like this:
	4320
	4321	@example
	4322	@group
	4323	compound:
	4324	@{ prepare_for_local_variables (); @}
	4325	'@{' declarations statements '@}'
	4326	| @{ prepare_for_local_variables (); @}
	4327	'@{' statements '@}'
	4328	;
	4329	@end group
	4330	@end example
	4331
	4332	@noindent
	4333	But this does not help, because Bison does not realize that the two actions
	4334	are identical. (Bison never tries to understand the C code in an action.)
	4335
	4336	If the grammar is such that a declaration can be distinguished from a
	4337	statement by the first token (which is true in C), then one solution which
	4338	does work is to put the action after the open-brace, like this:
	4339
	4340	@example
	4341	@group
	4342	compound:
	4343	'@{' @{ prepare_for_local_variables (); @}
	4344	declarations statements '@}'
	4345	| '@{' statements '@}'
	4346	;
	4347	@end group
	4348	@end example
	4349
	4350	@noindent
	4351	Now the first token of the following declaration or statement,
	4352	which would in any case tell Bison which rule to use, can still do so.
	4353
	4354	Another solution is to bury the action inside a nonterminal symbol which
	4355	serves as a subroutine:
	4356
	4357	@example
	4358	@group
	4359	subroutine:
	4360	%empty @{ prepare_for_local_variables (); @}
	4361	;
	4362	@end group
	4363
	4364	@group
	4365	compound:
	4366	subroutine '@{' declarations statements '@}'
	4367	| subroutine '@{' statements '@}'
	4368	;
	4369	@end group
	4370	@end example
	4371
	4372	@noindent
	4373	Now Bison can execute the action in the rule for @code{subroutine} without
	4374	deciding which rule for @code{compound} it will eventually use.
	4375
	4376
	4377	@node Tracking Locations
	4378	@section Tracking Locations
	4379	@cindex location
	4380	@cindex textual location
	4381	@cindex location, textual
	4382
	4383	Though grammar rules and semantic actions are enough to write a fully
	4384	functional parser, it can be useful to process some additional information,
	4385	especially symbol locations.
	4386
	4387	The way locations are handled is defined by providing a data type, and
	4388	actions to take when rules are matched.
	4389
	4390	@menu
	4391	* Location Type:: Specifying a data type for locations.
	4392	* Actions and Locations:: Using locations in actions.
	4393	* Location Default Action:: Defining a general way to compute locations.
	4394	@end menu
	4395
	4396	@node Location Type
	4397	@subsection Data Type of Locations
	4398	@cindex data type of locations
	4399	@cindex default location type
	4400
	4401	Defining a data type for locations is much simpler than for semantic values,
	4402	since all tokens and groupings always use the same type.
	4403
	4404	You can specify the type of locations by defining a macro called
	4405	@code{YYLTYPE}, just as you can specify the semantic value type by
	4406	defining a @code{YYSTYPE} macro (@pxref{Value Type}).
	4407	When @code{YYLTYPE} is not defined, Bison uses a default structure type with
	4408	four members:
	4409
	4410	@example
	4411	typedef struct YYLTYPE
	4412	@{
	4413	int first_line;
	4414	int first_column;
	4415	int last_line;
	4416	int last_column;
	4417	@} YYLTYPE;
	4418	@end example
	4419
	4420	When @code{YYLTYPE} is not defined, at the beginning of the parsing, Bison
	4421	initializes all these fields to 1 for @code{yylloc}. To initialize
	4422	@code{yylloc} with a custom location type (or to choose a different
	4423	initialization), use the @code{%initial-action} directive. @xref{Initial
	4424	Action Decl, , Performing Actions before Parsing}.
	4425
	4426	@node Actions and Locations
	4427	@subsection Actions and Locations
	4428	@cindex location actions
	4429	@cindex actions, location
	4430	@vindex @@$
	4431	@vindex @@@var{n}
	4432	@vindex @@@var{name}
	4433	@vindex @@[@var{name}]
	4434
	4435	Actions are not only useful for defining language semantics, but also for
	4436	describing the behavior of the output parser with locations.
	4437
	4438	The most obvious way to build locations of syntactic groupings is very
	4439	similar to the way semantic values are computed. In a given rule, several
	4440	constructs can be used to access the locations of the elements being matched.
	4441	The location of the @var{n}th component of the right hand side is
	4442	@code{@@@var{n}}, while the location of the left hand side grouping is
	4443	@code{@@$}.
	4444
	4445	In addition, the named references construct @code{@@@var{name}} and
	4446	@code{@@[@var{name}]} may also be used to address the symbol locations.
	4447	@xref{Named References}, for more information about using the named
	4448	references construct.
	4449
	4450	Here is a basic example using the default data type for locations:
	4451
	4452	@example
	4453	@group
	4454	exp:
	4455	@dots{}
	4456	| exp '/' exp
	4457	@{
	4458	@@$.first_column = @@1.first_column;
	4459	@@$.first_line = @@1.first_line;
	4460	@@$.last_column = @@3.last_column;
	4461	@@$.last_line = @@3.last_line;
	4462	if ($3)
	4463	$$ = $1 / $3;
	4464	else
	4465	@{
	4466	$$ = 1;
	4467	fprintf (stderr, "%d.%d-%d.%d: division by zero",
	4468	@@3.first_line, @@3.first_column,
	4469	@@3.last_line, @@3.last_column);
	4470	@}
	4471	@}
	4472	@end group
	4473	@end example
	4474
	4475	As for semantic values, there is a default action for locations that is
	4476	run each time a rule is matched. It sets the beginning of @code{@@$} to the
	4477	beginning of the first symbol, and the end of @code{@@$} to the end of the
	4478	last symbol.
	4479
	4480	With this default action, location tracking can be fully automatic. The
	4481	example above can simply be rewritten this way:
	4482
	4483	@example
	4484	@group
	4485	exp:
	4486	@dots{}
	4487	| exp '/' exp
	4488	@{
	4489	if ($3)
	4490	$$ = $1 / $3;
	4491	else
	4492	@{
	4493	$$ = 1;
	4494	fprintf (stderr, "%d.%d-%d.%d: division by zero",
	4495	@@3.first_line, @@3.first_column,
	4496	@@3.last_line, @@3.last_column);
	4497	@}
	4498	@}
	4499	@end group
	4500	@end example
	4501
	4502	@vindex yylloc
	4503	It is also possible to access the location of the lookahead token, if any,
	4504	from a semantic action.
	4505	This location is stored in @code{yylloc}.
	4506	@xref{Action Features, ,Special Features for Use in Actions}.
	4507
	4508	@node Location Default Action
	4509	@subsection Default Action for Locations
	4510	@vindex YYLLOC_DEFAULT
	4511	@cindex GLR parsers and @code{YYLLOC_DEFAULT}
	4512
	4513	Actually, actions are not the best place to compute locations. Since
	4514	locations are much more general than semantic values, there is room in
	4515	the output parser to redefine the default action to take for each
	4516	rule. The @code{YYLLOC_DEFAULT} macro is invoked each time a rule is
	4517	matched, before the associated action is run. It is also invoked
	4518	while processing a syntax error, to compute the error's location.
	4519	Before reporting an unresolvable syntactic ambiguity, a GLR
	4520	parser invokes @code{YYLLOC_DEFAULT} recursively to compute the location
	4521	of that ambiguity.
	4522
	4523	Most of the time, this macro is general enough to eliminate
	4524	location-specific code from semantic actions.
	4525
	4526	The @code{YYLLOC_DEFAULT} macro takes three parameters. The first one is
	4527	the location of the grouping (the result of the computation). When a
	4528	rule is matched, the second parameter identifies locations of
	4529	all right hand side elements of the rule being matched, and the third
	4530	parameter is the size of the rule's right hand side.
	4531	When a GLR parser reports an ambiguity, which of multiple candidate
	4532	right hand sides it passes to @code{YYLLOC_DEFAULT} is undefined.
	4533	When processing a syntax error, the second parameter identifies locations
	4534	of the symbols that were discarded during error processing, and the third
	4535	parameter is the number of discarded symbols.
	4536
	4537	By default, @code{YYLLOC_DEFAULT} is defined this way:
	4538
	4539	@example
	4540	@group
	4541	# define YYLLOC_DEFAULT(Cur, Rhs, N) \
	4542	do \
	4543	if (N) \
	4544	@{ \
	4545	(Cur).first_line = YYRHSLOC(Rhs, 1).first_line; \
	4546	(Cur).first_column = YYRHSLOC(Rhs, 1).first_column; \
	4547	(Cur).last_line = YYRHSLOC(Rhs, N).last_line; \
	4548	(Cur).last_column = YYRHSLOC(Rhs, N).last_column; \
	4549	@} \
	4550	else \
	4551	@{ \
	4552	(Cur).first_line = (Cur).last_line = \
	4553	YYRHSLOC(Rhs, 0).last_line; \
	4554	(Cur).first_column = (Cur).last_column = \
	4555	YYRHSLOC(Rhs, 0).last_column; \
	4556	@} \
	4557	while (0)
	4558	@end group
	4559	@end example
	4560
	4561	@noindent
	4562	where @code{YYRHSLOC (rhs, k)} is the location of the @var{k}th symbol
	4563	in @var{rhs} when @var{k} is positive, and the location of the symbol
	4564	just before the reduction when @var{k} and @var{n} are both zero.
	4565
	4566	When defining @code{YYLLOC_DEFAULT}, you should consider that:
	4567
	4568	@itemize @bullet
	4569	@item
	4570	All arguments are free of side-effects. However, only the first one (the
	4571	result) should be modified by @code{YYLLOC_DEFAULT}.
	4572
	4573	@item
	4574	For consistency with semantic actions, valid indexes within the
	4575	right hand side range from 1 to @var{n}. When @var{n} is zero, only 0 is a
	4576	valid index, and it refers to the symbol just before the reduction.
	4577	During error processing @var{n} is always positive.
	4578
	4579	@item
	4580	Your macro should parenthesize its arguments, if need be, since the
	4581	actual arguments may not be surrounded by parentheses. Also, your
	4582	macro should expand to something that can be used as a single
	4583	statement when it is followed by a semicolon.
	4584	@end itemize
	4585
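The behavior of the default definition can be tried outside a parser. The
following C sketch repeats the default structure type and macro shown above
and wraps them in a hypothetical function, @code{reduce_location}. It
assumes that @code{YYRHSLOC} is plain array indexing and that element 0 of
the array holds the symbol just before the reduction; a generated parser
arranges its location stack itself.

```c
#include <assert.h>

typedef struct YYLTYPE
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

/* Assumption: RHS[1..N] are the reduced symbols and RHS[0] is the
   symbol just before the reduction.  */
#define YYRHSLOC(Rhs, K) ((Rhs)[K])

#define YYLLOC_DEFAULT(Cur, Rhs, N)                         \
  do                                                        \
    if (N)                                                  \
      {                                                     \
        (Cur).first_line = YYRHSLOC (Rhs, 1).first_line;    \
        (Cur).first_column = YYRHSLOC (Rhs, 1).first_column; \
        (Cur).last_line = YYRHSLOC (Rhs, N).last_line;      \
        (Cur).last_column = YYRHSLOC (Rhs, N).last_column;  \
      }                                                     \
    else                                                    \
      {                                                     \
        (Cur).first_line = (Cur).last_line =                \
          YYRHSLOC (Rhs, 0).last_line;                      \
        (Cur).first_column = (Cur).last_column =            \
          YYRHSLOC (Rhs, 0).last_column;                    \
      }                                                     \
  while (0)

/* Hypothetical wrapper: compute the location of a grouping made of
   the N symbols STACK[1..N].  */
YYLTYPE
reduce_location (const YYLTYPE *stack, int n)
{
  YYLTYPE result;
  assert (n >= 0);
  YYLLOC_DEFAULT (result, stack, n);
  return result;
}
```

For an empty right hand side (@var{n} equal to zero), the result is an empty
range placed at the end of the preceding symbol.
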
	4586	@node Named References
	4587	@section Named References
	4588	@cindex named references
	4589
	4590	As described in the preceding sections, the traditional way to refer to any
	4591	semantic value or location is a @dfn{positional reference}, which takes the
	4592	form @code{$@var{n}}, @code{$$}, @code{@@@var{n}}, or @code{@@$}. However,
	4593	such a reference is not very descriptive. Moreover, if you later decide to
	4594	insert or remove symbols in the right-hand side of a grammar rule, the need
	4595	to renumber such references can be tedious and error-prone.
	4596
	4597	To avoid these issues, you can also refer to a semantic value or location
	4598	using a @dfn{named reference}. First of all, original symbol names may be
	4599	used as named references. For example:
	4600
	4601	@example
	4602	@group
	4603	invocation: op '(' args ')'
	4604	@{ $invocation = new_invocation ($op, $args, @@invocation); @}
	4605	@end group
	4606	@end example
	4607
	4608	@noindent
	4609	Positional and named references can be mixed arbitrarily. For example:
	4610
	4611	@example
	4612	@group
	4613	invocation: op '(' args ')'
	4614	@{ $$ = new_invocation ($op, $args, @@$); @}
	4615	@end group
	4616	@end example
	4617
	4618	@noindent
	4619	However, sometimes regular symbol names are not sufficient due to
	4620	ambiguities:
	4621
	4622	@example
	4623	@group
	4624	exp: exp '/' exp
	4625	@{ $exp = $exp / $exp; @} // $exp is ambiguous.
	4626
	4627	exp: exp '/' exp
	4628	@{ $$ = $1 / $exp; @} // One usage is ambiguous.
	4629
	4630	exp: exp '/' exp
	4631	@{ $$ = $1 / $3; @} // No error.
	4632	@end group
	4633	@end example
	4634
	4635	@noindent
	4636	When ambiguity occurs, explicitly declared names may be used for values and
	4637	locations. Explicit names are declared as a bracketed name after a symbol
	4638	appearance in rule definitions. For example:
	4639	@example
	4640	@group
	4641	exp[result]: exp[left] '/' exp[right]
	4642	@{ $result = $left / $right; @}
	4643	@end group
	4644	@end example
	4645
	4646	@noindent
	4647	In order to access a semantic value generated by a mid-rule action, an
	4648	explicit name may also be declared by putting a bracketed name after the
	4649	closing brace of the mid-rule action code:
	4650	@example
	4651	@group
	4652	exp[res]: exp[x] '+' @{$left = $x;@}[left] exp[right]
	4653	@{ $res = $left + $right; @}
	4654	@end group
	4655	@end example
	4656
	4657	@noindent
	4658
	4659	In references, in order to specify names containing dots and dashes, an explicit
	4660	bracketed syntax @code{$[name]} and @code{@@[name]} must be used:
	4661	@example
	4662	@group
	4663	if-stmt: "if" '(' expr ')' "then" then.stmt ';'
	4664	@{ $[if-stmt] = new_if_stmt ($expr, $[then.stmt]); @}
	4665	@end group
	4666	@end example
	4667
	4668	Named references are often followed by a dot, a dash, or another C
	4669	punctuation mark or operator. By default, Bison will read
	4670	@samp{$name.suffix} as a reference to symbol value @code{$name} followed by
	4671	@samp{.suffix}, i.e., an access to the @code{suffix} field of the semantic
	4672	value. In order to force Bison to recognize @samp{name.suffix} in its
	4673	entirety as the name of a semantic value, the bracketed syntax
	4674	@samp{$[name.suffix]} must be used.
	4675
	4676	The named references feature is experimental. More user feedback will help
	4677	to stabilize it.
	4678
	4679	@node Declarations
	4680	@section Bison Declarations
	4681	@cindex declarations, Bison
	4682	@cindex Bison declarations
	4683
	4684	The @dfn{Bison declarations} section of a Bison grammar defines the symbols
	4685	used in formulating the grammar and the data types of semantic values.
	4686	@xref{Symbols}.
	4687
	4688	All token type names (but not single-character literal tokens such as
	4689	@code{'+'} and @code{'*'}) must be declared. Nonterminal symbols must be
	4690	declared if you need to specify which data type to use for the semantic
	4691	value (@pxref{Multiple Types, ,More Than One Value Type}).
	4692
	4693	The first rule in the grammar file also specifies the start symbol, by
	4694	default. If you want some other symbol to be the start symbol, you
	4695	must declare it explicitly (@pxref{Language and Grammar, ,Languages
	4696	and Context-Free Grammars}).
	4697
	4698	@menu
	4699	* Require Decl:: Requiring a Bison version.
	4700	* Token Decl:: Declaring terminal symbols.
	4701	* Precedence Decl:: Declaring terminals with precedence and associativity.
	4702	* Type Decl:: Declaring the choice of type for a nonterminal symbol.
	4703	* Initial Action Decl:: Code run before parsing starts.
	4704	* Destructor Decl:: Declaring how symbols are freed.
	4705	* Printer Decl:: Declaring how symbol values are displayed.
	4706	* Expect Decl:: Suppressing warnings about parsing conflicts.
	4707	* Start Decl:: Specifying the start symbol.
	4708	* Pure Decl:: Requesting a reentrant parser.
	4709	* Push Decl:: Requesting a push parser.
	4710	* Decl Summary:: Table of all Bison declarations.
	4711	* %define Summary:: Defining variables to adjust Bison's behavior.
	4712	* %code Summary:: Inserting code into the parser source.
	4713	@end menu
	4714
	4715	@node Require Decl
	4716	@subsection Require a Version of Bison
	4717	@cindex version requirement
	4718	@cindex requiring a version of Bison
	4719	@findex %require
	4720
	4721	You may require a minimum version of Bison to process the grammar. If
	4722	the requirement is not met, @command{bison} exits with an error (exit
	4723	status 63).
	4724
	4725	@example
	4726	%require "@var{version}"
	4727	@end example
	4728
	4729	@node Token Decl
	4730	@subsection Token Type Names
	4731	@cindex declaring token type names
	4732	@cindex token type names, declaring
	4733	@cindex declaring literal string tokens
	4734	@findex %token
	4735
	4736	The basic way to declare a token type name (terminal symbol) is as follows:
	4737
	4738	@example
	4739	%token @var{name}
	4740	@end example
	4741
	4742	Bison will convert this into a @code{#define} directive in
	4743	the parser, so that the function @code{yylex} (if it is in this file)
	4744	can use the name @var{name} to stand for this token type's code.
	4745
	4746	Alternatively, you can use @code{%left}, @code{%right},
	4747	@code{%precedence}, or
	4748	@code{%nonassoc} instead of @code{%token}, if you wish to specify
	4749	associativity and precedence. @xref{Precedence Decl, ,Operator
	4750	Precedence}.
	4751
	4752	You can explicitly specify the numeric code for a token type by appending
	4753	a nonnegative decimal or hexadecimal integer value in the field immediately
	4754	following the token name:
	4755
	4756	@example
	4757	%token NUM 300
	4758	%token XNUM 0x12d // a GNU extension
	4759	@end example
	4760
	4761	@noindent
	4762	It is generally best, however, to let Bison choose the numeric codes for
	4763	all token types. Bison will automatically select codes that don't conflict
	4764	with each other or with normal characters.
	4765
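To show how a scanner uses such a name, here is a minimal hand-written
@code{yylex} sketch. It assumes the declaration @samp{%token NUM 300} from
above, defines @code{NUM} by hand in place of the definition Bison would
emit, and reads from a hypothetical global string named @code{input}; the
semantic value is assumed to be a plain @code{int}.

```c
#include <assert.h>
#include <ctype.h>

/* Assumed here; in a real parser Bison emits this definition.  */
#define NUM 300

/* Stand-ins for the parser's globals (hypothetical in this sketch).  */
int yylval;
const char *input = "";

/* Return the next token code: NUM for an integer (value in yylval),
   0 at end of input, or the character itself for anything else.  */
int
yylex (void)
{
  assert (input != 0);
  while (*input == ' ' || *input == '\t')
    input++;
  if (*input == '\0')
    return 0;
  if (isdigit ((unsigned char) *input))
    {
      yylval = 0;
      while (isdigit ((unsigned char) *input))
        yylval = yylval * 10 + (*input++ - '0');
      return NUM;
    }
  return (unsigned char) *input++;
}
```

Because single-character tokens are returned as their own character codes,
a named token needs a code outside that range, which is one reason to let
Bison pick the numbers.
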
	4766	In the event that the stack type is a union, you must augment the
	4767	@code{%token} or other token declaration to include the data type
	4768	alternative delimited by angle-brackets (@pxref{Multiple Types, ,More
	4769	Than One Value Type}).
	4770
	4771	For example:
	4772
	4773	@example
	4774	@group
	4775	%union @{ /* define stack type */
	4776	double val;
	4777	symrec *tptr;
	4778	@}
	4779	%token <val> NUM /* define token NUM and its type */
	4780	@end group
	4781	@end example
	4782
	4783	You can associate a literal string token with a token type name by
	4784	writing the literal string at the end of a @code{%token}
	4785	declaration which declares the name. For example:
	4786
	4787	@example
	4788	%token arrow "=>"
	4789	@end example
	4790
	4791	@noindent
	4792	Similarly, a grammar for the C language might specify these names with
	4793	equivalent literal string tokens:
	4794
	4795	@example
	4796	%token <operator> OR "||"
	4797	%token <operator> LE 134 "<="
	4798	%left OR "<="
	4799	@end example
	4800
	4801	@noindent
	4802	Once you equate the literal string and the token name, you can use them
	4803	interchangeably in further declarations or the grammar rules. The
	4804	@code{yylex} function can use the token name or the literal string to
	4805	obtain the token type code number (@pxref{Calling Convention}).
	4806	Syntax error messages passed to @code{yyerror} from the parser will reference
	4807	the literal string instead of the token name.
	4808
	4809	The token numbered as 0 corresponds to end of file; the following line
	4810	allows for nicer error messages referring to ``end of file'' instead
	4811	of ``$end'':
	4812
	4813	@example
	4814	%token END 0 "end of file"
	4815	@end example
	4816
	4817	@node Precedence Decl
	4818	@subsection Operator Precedence
	4819	@cindex precedence declarations
	4820	@cindex declaring operator precedence
	4821	@cindex operator precedence, declaring
	4822
	4823	Use the @code{%left}, @code{%right}, @code{%nonassoc}, or
	4824	@code{%precedence} declaration to
	4825	declare a token and specify its precedence and associativity, all at
	4826	once. These are called @dfn{precedence declarations}.
	4827	@xref{Precedence, ,Operator Precedence}, for general information on
	4828	operator precedence.
	4829
	4830	The syntax of a precedence declaration is nearly the same as that of
	4831	@code{%token}: either
	4832
	4833	@example
	4834	%left @var{symbols}@dots{}
	4835	@end example
	4836
	4837	@noindent
	4838	or
	4839
	4840	@example
	4841	%left <@var{type}> @var{symbols}@dots{}
	4842	@end example
	4843
	4844	And indeed any of these declarations serves the purposes of @code{%token}.
	4845	But in addition, they specify the associativity and relative precedence for
	4846	all the @var{symbols}:
	4847
	4848	@itemize @bullet
	4849	@item
	4850	The associativity of an operator @var{op} determines how repeated uses
	4851	of the operator nest: whether @samp{@var{x} @var{op} @var{y} @var{op}
	4852	@var{z}} is parsed by grouping @var{x} with @var{y} first or by
	4853	grouping @var{y} with @var{z} first. @code{%left} specifies
	4854	left-associativity (grouping @var{x} with @var{y} first) and
	4855	@code{%right} specifies right-associativity (grouping @var{y} with
	4856	@var{z} first). @code{%nonassoc} specifies no associativity, which
	4857	means that @samp{@var{x} @var{op} @var{y} @var{op} @var{z}} is
	4858	considered a syntax error.
	4859
	4860	@code{%precedence} gives only precedence to the @var{symbols}, and
	4861	defines no associativity at all. Use this to define precedence only,
	4862	and leave any potential conflict due to associativity enabled.
	4863
	4864	@item
	4865	The precedence of an operator determines how it nests with other operators.
	4866	All the tokens declared in a single precedence declaration have equal
	4867	precedence and nest together according to their associativity.
	4868	When two tokens declared in different precedence declarations associate,
	4869	the one declared later has the higher precedence and is grouped first.
	4870	@end itemize
	4871
	4872	For backward compatibility, there is a confusing difference between the
	4873	argument lists of @code{%token} and precedence declarations.
	4874	Only a @code{%token} can associate a literal string with a token type name.
	4875	A precedence declaration always interprets a literal string as a reference to a
	4876	separate token.
	4877	For example:
	4878
	4879	@example
	4880	%left OR "<=" // Does not declare an alias.
	4881	%left OR 134 "<=" 135 // Declares 134 for OR and 135 for "<=".
	4882	@end example
	4883
	4884	@node Type Decl
	4885	@subsection Nonterminal Symbols
	4886	@cindex declaring value types, nonterminals
	4887	@cindex value types, nonterminals, declaring
	4888	@findex %type
	4889
	4890	@noindent
	4891	When you use @code{%union} to specify multiple value types, you must
	4892	declare the value type of each nonterminal symbol for which values are
	4893	used. This is done with a @code{%type} declaration, like this:
	4894
	4895	@example
	4896	%type <@var{type}> @var{nonterminal}@dots{}
	4897	@end example
	4898
	4899	@noindent
	4900	Here @var{nonterminal} is the name of a nonterminal symbol, and
	4901	@var{type} is the name given in the @code{%union} to the alternative
	4902	that you want (@pxref{Union Decl, ,The Union Declaration}). You
	4903	can give any number of nonterminal symbols in the same @code{%type}
	4904	declaration, if they have the same value type. Use spaces to separate
	4905	the symbol names.
	4906
	4907	You can also declare the value type of a terminal symbol. To do this,
	4908	use the same @code{<@var{type}>} construction in a declaration for the
	4909	terminal symbol. All kinds of token declarations allow
	4910	@code{<@var{type}>}.
	4911
	4912	@node Initial Action Decl
	4913	@subsection Performing Actions before Parsing
	4914	@findex %initial-action
	4915
	4916	Sometimes your parser needs to perform some initializations before
	4917	parsing. The @code{%initial-action} directive allows for such arbitrary
	4918	code.
	4919
	4920	@deffn {Directive} %initial-action @{ @var{code} @}
	4921	@findex %initial-action
	4922	Declare that the braced @var{code} must be invoked before parsing each time
	4923	@code{yyparse} is called. The @var{code} may use @code{$$} (or
	4924	@code{$<@var{tag}>$}) and @code{@@$}, the initial value and location of the
	4925	lookahead, as well as the @code{%parse-param} arguments.
	4926	@end deffn
	4927
	4928	For instance, if your locations use a file name, you may use
	4929
	4930	@example
	4931	%parse-param @{ char const *file_name @};
	4932	%initial-action
	4933	@{
	4934	  @@$.initialize (file_name);
	4935	@};
	4936	@end example
	4937
	4938
	4939	@node Destructor Decl
	4940	@subsection Freeing Discarded Symbols
	4941	@cindex freeing discarded symbols
	4942	@findex %destructor
	4943	@findex <*>
	4944	@findex <>
	4945	During error recovery (@pxref{Error Recovery}), symbols already pushed
	4946	on the stack and tokens coming from the rest of the file are discarded
	4947	until the parser falls on its feet. If the parser runs out of memory,
	4948	or if it returns via @code{YYABORT} or @code{YYACCEPT}, all the
	4949	symbols on the stack must be discarded. Even if the parser succeeds, it
	4950	must discard the start symbol.
	4951
	4952	When discarded symbols convey heap-based information, this memory is
	4953	lost. While this behavior can be tolerable for batch parsers, such as
	4954	in traditional compilers, it is unacceptable for programs like shells or
	4955	protocol implementations that may parse and execute indefinitely.
	4956
	4957	The @code{%destructor} directive defines code that is called when a
	4958	symbol is automatically discarded.
	4959
	4960	@deffn {Directive} %destructor @{ @var{code} @} @var{symbols}
	4961	@findex %destructor
	4962	Invoke the braced @var{code} whenever the parser discards one of the
	4963	@var{symbols}. Within @var{code}, @code{$$} (or @code{$<@var{tag}>$})
	4964	designates the semantic value associated with the discarded symbol, and
	4965	@code{@@$} designates its location. The additional parser parameters are
	4966	also available (@pxref{Parser Function, , The Parser Function
	4967	@code{yyparse}}).
	4968
	4969	When a symbol is listed among @var{symbols}, its @code{%destructor} is called a
	4970	per-symbol @code{%destructor}.
	4971	You may also define a per-type @code{%destructor} by listing a semantic type
	4972	tag among @var{symbols}.
	4973	In that case, the parser will invoke this @var{code} whenever it discards any
	4974	grammar symbol that has that semantic type tag unless that symbol has its own
	4975	per-symbol @code{%destructor}.
	4976
	4977	Finally, you can define two different kinds of default @code{%destructor}s.
	4978	(These default forms are experimental.
	4979	More user feedback will help to determine whether they should become permanent
	4980	features.)
	4981	You can place each of @code{<*>} and @code{<>} in the @var{symbols} list of
	4982	exactly one @code{%destructor} declaration in your grammar file.
	4983	The parser will invoke the @var{code} associated with one of these whenever it
	4984	discards any user-defined grammar symbol that has no per-symbol and no per-type
	4985	@code{%destructor}.
	4986	The parser uses the @var{code} for @code{<*>} in the case of such a grammar
	4987	symbol for which you have formally declared a semantic type tag (@code{%type}
	4988	counts as such a declaration, but @code{$<tag>$} does not).
	4989	The parser uses the @var{code} for @code{<>} in the case of such a grammar
	4990	symbol that has no declared semantic type tag.
	4991	@end deffn
	4992
	4993	@noindent
	4994	For example:
	4995
	4996	@example
	4997	%union @{ char *string; @}
	4998	%token <string> STRING1 STRING2
	4999	%type <string> string1 string2
	5000	%union @{ char character; @}
	5001	%token <character> CHR
	5002	%type <character> chr
	5003	%token TAGLESS
	5004
	5005	%destructor @{ @} <character>
	5006	%destructor @{ free ($$); @} <*>
	5007	%destructor @{ free ($$); printf ("%d", @@$.first_line); @} STRING1 string1
	5008	%destructor @{ printf ("Discarding tagless symbol.\n"); @} <>
	5009	@end example
	5010
	5011	@noindent
	5012	guarantees that, when the parser discards any user-defined symbol that has a
	5013	semantic type tag other than @code{<character>}, it passes its semantic value
	5014	to @code{free} by default.
	5015	However, when the parser discards a @code{STRING1} or a @code{string1}, it also
	5016	prints its line number to @code{stdout}.
	5017	It performs only the per-symbol @code{%destructor} in this case, so it invokes
	5018	@code{free} only once.
	5019	Finally, the parser merely prints a message whenever it discards any symbol,
	5020	such as @code{TAGLESS}, that has no semantic type tag.
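
The selection order described above can be modeled in plain C.  This is
only an illustrative sketch: the names @code{select_destructor},
@code{symbol_info} and @code{grammar_info} are hypothetical and are not
part of the code that Bison generates.  It assumes that a per-symbol
@code{%destructor} wins over a per-type one, and that the two default
forms apply only to user-defined symbols.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: which %destructor applies to a discarded symbol.  */
enum destructor_kind
{
  DTOR_NONE,            /* no %destructor applies */
  DTOR_PER_SYMBOL,      /* the symbol is listed by name */
  DTOR_PER_TYPE,        /* the symbol's <tag> is listed */
  DTOR_TYPED_DEFAULT,   /* <*> */
  DTOR_UNTYPED_DEFAULT  /* <> */
};

struct symbol_info
{
  bool user_defined;        /* false for $accept, $end, error, ... */
  const char *declared_tag; /* NULL when the symbol is tagless */
  bool has_per_symbol;      /* some %destructor names this symbol */
  bool has_per_type;        /* some %destructor names declared_tag */
};

struct grammar_info
{
  bool has_typed_default;   /* a %destructor lists <*> */
  bool has_untyped_default; /* a %destructor lists <> */
};

enum destructor_kind
select_destructor (const struct symbol_info *s, const struct grammar_info *g)
{
  if (s->has_per_symbol)
    return DTOR_PER_SYMBOL;
  if (s->declared_tag != NULL && s->has_per_type)
    return DTOR_PER_TYPE;
  /* The default forms are never used for Bison-defined symbols.  */
  if (!s->user_defined)
    return DTOR_NONE;
  if (s->declared_tag != NULL)
    return g->has_typed_default ? DTOR_TYPED_DEFAULT : DTOR_NONE;
  return g->has_untyped_default ? DTOR_UNTYPED_DEFAULT : DTOR_NONE;
}
```

With the declarations of the example, @code{STRING1} maps to the
per-symbol case, @code{CHR} to the per-type case, @code{STRING2} to
@code{<*>}, and @code{TAGLESS} to @code{<>}.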
	5021
	5022	A Bison-generated parser invokes the default @code{%destructor}s only for
	5023	user-defined as opposed to Bison-defined symbols.
	5024	For example, the parser will not invoke either kind of default
	5025	@code{%destructor} for the special Bison-defined symbols @code{$accept},
	5026	@code{$undefined}, or @code{$end} (@pxref{Table of Symbols, ,Bison Symbols}),
	5027	none of which you can reference in your grammar.
	5028	It also will not invoke either for the @code{error} token (@pxref{Table of
	5029	Symbols, ,error}), which is always defined by Bison regardless of whether you
	5030	reference it in your grammar.
	5031	However, it may invoke one of them for the end token (token 0) if you
	5032	redefine it from @code{$end} to, for example, @code{END}:
	5033
	5034	@example
	5035	%token END 0
	5036	@end example
	5037
	5038	@cindex actions in mid-rule
	5039	@cindex mid-rule actions
	5040	Finally, Bison will never invoke a @code{%destructor} for an unreferenced
	5041	mid-rule semantic value (@pxref{Mid-Rule Actions,,Actions in Mid-Rule}).
	5042	That is, Bison does not consider a mid-rule to have a semantic value if you
	5043	do not reference @code{$$} in the mid-rule's action or @code{$@var{n}}
	5044	(where @var{n} is the right-hand side symbol position of the mid-rule) in
	5045	any later action in that rule. However, if you do reference either, the
	5046	Bison-generated parser will invoke the @code{<>} @code{%destructor} whenever
	5047	it discards the mid-rule symbol.
	5048
	5049	@ignore
	5050	@noindent
	5051	In the future, it may be possible to redefine the @code{error} token as a
	5052	nonterminal that captures the discarded symbols.
	5053	In that case, the parser will invoke the default destructor for it as well.
	5054	@end ignore
	5055
	5056	@sp 1
	5057
	5058	@cindex discarded symbols
	5059	@dfn{Discarded symbols} are the following:
	5060
	5061	@itemize
	5062	@item
	5063	stacked symbols popped during the first phase of error recovery,
	5064	@item
	5065	incoming terminals during the second phase of error recovery,
	5066	@item
	5067	the current lookahead and the entire stack (except the current
	5068	right-hand side symbols) when the parser returns immediately,
	5069	@item
	5070	the current lookahead and the entire stack (including the current right-hand
	5071	side symbols) when the C++ parser (@file{lalr1.cc}) catches an exception in
	5072	@code{parse}, and
	5073	@item
	5074	the start symbol, when the parser succeeds.
	5075	@end itemize
	5076
	5077	The parser can @dfn{return immediately} because of an explicit call to
	5078	@code{YYABORT} or @code{YYACCEPT}, or failed error recovery, or memory
	5079	exhaustion.
	5080
	5081	Right-hand side symbols of a rule that explicitly triggers a syntax
	5082	error via @code{YYERROR} are not discarded automatically. As a rule
	5083	of thumb, destructors are invoked only when user actions cannot manage
	5084	the memory.
	5085
	5086	@node Printer Decl
	5087	@subsection Printing Semantic Values
	5088	@cindex printing semantic values
	5089	@findex %printer
	5090	@findex <*>
	5091	@findex <>
	5092	When run-time traces are enabled (@pxref{Tracing, ,Tracing Your Parser}),
	5093	the parser reports its actions, such as reductions. When a symbol involved
	5094	in an action is reported, only its kind is displayed, as the parser cannot
	5095	know how semantic values should be formatted.
	5096
	5097	The @code{%printer} directive defines code that is called when a symbol is
	5098	reported. Its syntax is the same as @code{%destructor} (@pxref{Destructor
	5099	Decl, , Freeing Discarded Symbols}).
	5100
	5101	@deffn {Directive} %printer @{ @var{code} @} @var{symbols}
	5102	@findex %printer
	5103	@vindex yyoutput
	5104	@c This is the same text as for %destructor.
	5105	Invoke the braced @var{code} whenever the parser displays one of the
	5106	@var{symbols}. Within @var{code}, @code{yyoutput} denotes the output stream
	5107	(a @code{FILE*} in C, and an @code{std::ostream&} in C++), @code{$$} (or
	5108	@code{$<@var{tag}>$}) designates the semantic value associated with the
	5109	symbol, and @code{@@$} its location. The additional parser parameters are
	5110	also available (@pxref{Parser Function, , The Parser Function
	5111	@code{yyparse}}).
	5112
	5113	The @var{symbols} are defined as for @code{%destructor} (@pxref{Destructor
	5114	Decl, , Freeing Discarded Symbols}): they can be per-type (e.g.,
	5115	@samp{<ival>}), per-symbol (e.g., @samp{exp}, @samp{NUM}, @samp{"float"}),
	5116	typed per-default (i.e., @samp{<*>}), or untyped per-default (i.e.,
	5117	@samp{<>}).
	5118	@end deffn
	5119
	5120	@noindent
	5121	For example:
	5122
	5123	@example
	5124	%union @{ char *string; @}
	5125	%token <string> STRING1 STRING2
	5126	%type <string> string1 string2
	5127	%union @{ char character; @}
	5128	%token <character> CHR
	5129	%type <character> chr
	5130	%token TAGLESS
	5131
	5132	%printer @{ fprintf (yyoutput, "'%c'", $$); @} <character>
	5133	%printer @{ fprintf (yyoutput, "&%p", $$); @} <*>
	5134	%printer @{ fprintf (yyoutput, "\"%s\"", $$); @} STRING1 string1
	5135	%printer @{ fprintf (yyoutput, "<>"); @} <>
	5136	@end example
	5137
	5138	@noindent
	5139	guarantees that, when the parser prints any symbol that has a semantic type
	5140	tag other than @code{<character>}, it displays the address of the semantic
	5141	value by default. However, when the parser displays a @code{STRING1} or a
	5142	@code{string1}, it formats it as a string in double quotes. It performs
	5143	only the per-symbol @code{%printer} in this case, so it prints only once.
	5144	Finally, the parser prints @samp{<>} for any symbol, such as @code{TAGLESS},
	5145	that has no semantic type tag.
	5146
	5147
	5148	@node Expect Decl
	5149	@subsection Suppressing Conflict Warnings
	5150	@cindex suppressing conflict warnings
	5151	@cindex preventing warnings about conflicts
	5152	@cindex warnings, preventing
	5153	@cindex conflicts, suppressing warnings of
	5154	@findex %expect
	5155	@findex %expect-rr
	5156
	5157	Bison normally warns if there are any conflicts in the grammar
	5158	(@pxref{Shift/Reduce, ,Shift/Reduce Conflicts}), but most real grammars
	5159	have harmless shift/reduce conflicts which are resolved in a predictable
	5160	way and would be difficult to eliminate. It is desirable to suppress
	5161	the warning about these conflicts unless the number of conflicts
	5162	changes. You can do this with the @code{%expect} declaration.
	5163
	5164	The declaration looks like this:
	5165
	5166	@example
	5167	%expect @var{n}
	5168	@end example
	5169
	5170	Here @var{n} is a decimal integer. The declaration says there should
	5171	be @var{n} shift/reduce conflicts and no reduce/reduce conflicts.
	5172	Bison reports an error if the number of shift/reduce conflicts differs
	5173	from @var{n}, or if there are any reduce/reduce conflicts.
	5174
	5175	For deterministic parsers, reduce/reduce conflicts are more
	5176	serious, and should be eliminated entirely. Bison will always report
	5177	reduce/reduce conflicts for these parsers. With GLR
	5178	parsers, however, both kinds of conflicts are routine; otherwise,
	5179	there would be no need to use GLR parsing. Therefore, it is
	5180	also possible to specify an expected number of reduce/reduce conflicts
	5181	in GLR parsers, using the declaration:
	5182
	5183	@example
	5184	%expect-rr @var{n}
	5185	@end example
	5186
	5187	In general, using @code{%expect} involves these steps:
	5188
	5189	@itemize @bullet
	5190	@item
	5191	Compile your grammar without @code{%expect}. Use the @samp{-v} option
	5192	to get a verbose list of where the conflicts occur. Bison will also
	5193	print the number of conflicts.
	5194
	5195	@item
	5196	Check each of the conflicts to make sure that Bison's default
	5197	resolution is what you really want. If not, rewrite the grammar and
	5198	go back to the beginning.
	5199
	5200	@item
	5201	Add an @code{%expect} declaration, copying the number @var{n} from the
	5202	number which Bison printed. With GLR parsers, add an
	5203	@code{%expect-rr} declaration as well.
	5204	@end itemize
	5205
	5206	Now Bison will report an error if you introduce an unexpected conflict,
	5207	but will keep silent otherwise.
	5208
	5209	@node Start Decl
	5210	@subsection The Start-Symbol
	5211	@cindex declaring the start symbol
	5212	@cindex start symbol, declaring
	5213	@cindex default start symbol
	5214	@findex %start
	5215
	5216	Bison assumes by default that the start symbol for the grammar is the first
	5217	nonterminal specified in the grammar specification section. The programmer
	5218	may override this default with the @code{%start} declaration as follows:
	5219
	5220	@example
	5221	%start @var{symbol}
	5222	@end example
	5223
	5224	@node Pure Decl
	5225	@subsection A Pure (Reentrant) Parser
	5226	@cindex reentrant parser
	5227	@cindex pure parser
	5228	@findex %define api.pure
	5229
	5230	A @dfn{reentrant} program is one which does not alter in the course of
	5231	execution; in other words, it consists entirely of @dfn{pure} (read-only)
	5232	code. Reentrancy is important whenever asynchronous execution is possible;
	5233	for example, a nonreentrant program may not be safe to call from a signal
	5234	handler. In systems with multiple threads of control, a nonreentrant
	5235	program must be called only within interlocks.
	5236
	5237	Normally, Bison generates a parser which is not reentrant. This is
	5238	suitable for most uses, and it permits compatibility with Yacc. (The
	5239	standard Yacc interfaces are inherently nonreentrant, because they use
	5240	statically allocated variables for communication with @code{yylex},
	5241	including @code{yylval} and @code{yylloc}.)
	5242
	5243	Alternatively, you can generate a pure, reentrant parser. The Bison
	5244	declaration @samp{%define api.pure} says that you want the parser to be
	5245	reentrant. It looks like this:
	5246
	5247	@example
	5248	%define api.pure full
	5249	@end example
	5250
	5251	The result is that the communication variables @code{yylval} and
	5252	@code{yylloc} become local variables in @code{yyparse}, and a different
	5253	calling convention is used for the lexical analyzer function
	5254	@code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure
	5255	Parsers}, for the details of this. The variable @code{yynerrs}
	5256	becomes local in @code{yyparse} in pull mode but it becomes a member
	5257	of @code{yypstate} in push mode (@pxref{Error Reporting, ,The Error
	5258	Reporting Function @code{yyerror}}). The convention for calling
	5259	@code{yyparse} itself is unchanged.
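
The difference between communicating through static variables and
communicating through parameters can be illustrated without Bison.  The
following is a minimal sketch with hypothetical names
(@code{shared_next}, @code{pure_next}, @code{scan_state}); it is not the
interface that Bison generates, only an analogy for the role played by
@code{yylval}.

```c
#include <ctype.h>

/* Nonreentrant style: results travel through a static variable,
   much as the traditional interface uses yylval.  */
int shared_value;
static const char *shared_cursor;

void
shared_begin (const char *text)
{
  shared_cursor = text;
}

/* Return 1 and set shared_value when a digit is read, 0 at end of input.  */
int
shared_next (void)
{
  while (*shared_cursor && !isdigit ((unsigned char) *shared_cursor))
    shared_cursor++;
  if (!*shared_cursor)
    return 0;
  shared_value = *shared_cursor++ - '0';
  return 1;
}

/* Reentrant style: all state and all results go through parameters,
   so several scans can be interleaved safely.  */
struct scan_state
{
  const char *cursor;
};

int
pure_next (struct scan_state *st, int *value)
{
  while (*st->cursor && !isdigit ((unsigned char) *st->cursor))
    st->cursor++;
  if (!*st->cursor)
    return 0;
  *value = *st->cursor++ - '0';
  return 1;
}
```

Two @code{scan_state} objects can be advanced in alternation without
disturbing each other, whereas a second call to @code{shared_begin}
discards the position of the first scan.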
	5260
	5261	Whether the parser is pure has nothing to do with the grammar rules.
	5262	You can generate either a pure parser or a nonreentrant parser from any
	5263	valid grammar.
	5264
	5265	@node Push Decl
	5266	@subsection A Push Parser
	5267	@cindex push parser
	5269	@findex %define api.push-pull
	5270
	5271	(The current push parsing interface is experimental and may evolve.
	5272	More user feedback will help to stabilize it.)
	5273
	5274	A pull parser is called once and it takes control until all its input
	5275	is completely parsed. A push parser, on the other hand, is called
	5276	each time a new token is made available.
	5277
	5278	A push parser is typically useful when the parser is part of a
	5279	main event loop in the client's application. This is typically
	5280	a requirement of a GUI, when the main event loop needs to be triggered
	5281	within a certain time period.
	5282
	5283	Normally, Bison generates a pull parser.
	5284	The following Bison declaration says that you want the parser to be a push
	5285	parser (@pxref{%define Summary,,api.push-pull}):
	5286
	5287	@example
	5288	%define api.push-pull push
	5289	@end example
	5290
	5291	In almost all cases, you want to ensure that your push parser is also
	5292	a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). The only
	5293	time you should create an impure push parser is to have backwards
	5294	compatibility with the impure Yacc pull mode interface. Unless you know
	5295	what you are doing, your declarations should look like this:
	5296
	5297	@example
	5298	%define api.pure full
	5299	%define api.push-pull push
	5300	@end example
	5301
	5302	There is a major functional difference between the pure push parser
	5303	and the impure push parser. A pure push parser may have
	5304	many parser instances, of the same type of parser, in memory at the same time.
	5305	An impure push parser should use only one parser instance at a time.
	5306
	5307	When a push parser is selected, Bison will generate some new symbols in
	5308	the generated parser. @code{yypstate} is a structure that the generated
	5309	parser uses to store the parser's state. @code{yypstate_new} is the
	5310	function that will create a new parser instance. @code{yypstate_delete}
	5311	will free the resources associated with the corresponding parser instance.
	5312	Finally, @code{yypush_parse} is the function that should be called whenever a
	5313	token is available to pass to the parser. A trivial example
	5314	of using a pure push parser would look like this:
	5315
	5316	@example
	5317	int status;
	5318	yypstate *ps = yypstate_new ();
	5319	do @{
	5320	  status = yypush_parse (ps, yylex (), NULL);
	5321	@} while (status == YYPUSH_MORE);
	5322	yypstate_delete (ps);
	5323	@end example
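
To experiment with this calling pattern without generating a parser, a
hand-written stand-in can play the role of the push parser.  Everything
in the following sketch is hypothetical (@code{toy_pstate},
@code{toy_push_parse}, the token codes and the status values); it only
mimics the shape of the push interface for the tiny language
@samp{NUM ('+' NUM)*} and is not produced by Bison.

```c
#include <stdlib.h>

/* Arbitrary token codes and status values for this sketch.  */
enum { TOK_END = 0, TOK_PLUS = '+', TOK_NUM = 258 };
enum { TOY_ACCEPT = 0, TOY_ERROR = 1, TOY_PUSH_MORE = 2 };

/* All parser state lives here, so several instances can coexist.  */
typedef struct toy_pstate
{
  int expect_operand; /* nonzero when the next token must be TOK_NUM */
  long sum;           /* running value of the expression */
} toy_pstate;

toy_pstate *
toy_pstate_new (void)
{
  toy_pstate *ps = malloc (sizeof *ps);
  if (ps)
    {
      ps->expect_operand = 1;
      ps->sum = 0;
    }
  return ps;
}

void
toy_pstate_delete (toy_pstate *ps)
{
  free (ps);
}

/* Consume exactly one token and report whether more are needed.  */
int
toy_push_parse (toy_pstate *ps, int token, long value)
{
  if (ps->expect_operand)
    {
      if (token != TOK_NUM)
        return TOY_ERROR;
      ps->sum += value;
      ps->expect_operand = 0;
      return TOY_PUSH_MORE;
    }
  if (token == TOK_PLUS)
    {
      ps->expect_operand = 1;
      return TOY_PUSH_MORE;
    }
  return token == TOK_END ? TOY_ACCEPT : TOY_ERROR;
}

/* Feed tokens one at a time, as an event loop would.  */
int
toy_parse_tokens (const int *tokens, const long *values, size_t n,
                  long *result)
{
  toy_pstate *ps = toy_pstate_new ();
  int status = TOY_PUSH_MORE;
  if (!ps)
    return TOY_ERROR;
  for (size_t i = 0; i < n && status == TOY_PUSH_MORE; i++)
    status = toy_push_parse (ps, tokens[i], values[i]);
  if (status == TOY_ACCEPT)
    *result = ps->sum;
  toy_pstate_delete (ps);
  return status;
}
```

Because the caller owns the loop, it can return to other work between
any two calls to @code{toy_push_parse}, which is the property that makes
a push parser convenient inside an event loop.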
	5324
	5325	If you decide to use an impure push parser, a few things about
	5326	the generated parser will change. The @code{yychar} variable becomes
	5327	a global variable instead of a variable in the @code{yypush_parse} function.
	5328	For this reason, the signature of the @code{yypush_parse} function is
	5329	changed to remove the token as a parameter. A nonreentrant push parser
	5330	example would thus look like this:
	5331
	5332	@example
	5333	extern int yychar;
	5334	int status;
	5335	yypstate *ps = yypstate_new ();
	5336	do @{
	5337	  yychar = yylex ();
	5338	  status = yypush_parse (ps);
	5339	@} while (status == YYPUSH_MORE);
	5340	yypstate_delete (ps);
	5341	@end example
	5342
	5343	That's it. Notice the next token is put into the global variable @code{yychar}
	5344	for use by the next invocation of the @code{yypush_parse} function.
	5345
	5346	Bison can also provide both the push parser interface and the pull parser
	5347	interface in the same generated parser. To get this functionality,
	5348	replace the @samp{%define api.push-pull push} declaration with the
	5349	@samp{%define api.push-pull both} declaration. Doing this creates all of
	5350	the symbols mentioned earlier along with two extra symbols, @code{yyparse}
	5351	and @code{yypull_parse}. @code{yyparse} can be used exactly as it normally
	5352	would be used. However, note that it is implemented in the
	5353	generated parser by calling @code{yypull_parse}.
	5354	This makes the @code{yyparse} function that is generated with the
	5355	@samp{%define api.push-pull both} declaration slower than the normal
	5356	@code{yyparse} function. If you call
	5357	the @code{yypull_parse} function, it parses the rest of the input
	5358	stream. It is possible to @code{yypush_parse} tokens to select a subgrammar
	5359	and then @code{yypull_parse} the rest of the input stream. If you would like
	5360	to switch back and forth between parsing styles, you would have to
	5361	write your own @code{yypull_parse} function that knows when to quit looking
	5362	for input. An example of using the @code{yypull_parse} function would look
	5363	like this:
	5364
	5365	@example
	5366	yypstate *ps = yypstate_new ();
	5367	yypull_parse (ps); /* Will call the lexer */
	5368	yypstate_delete (ps);
	5369	@end example
	5370
	5371	Adding the @samp{%define api.pure} declaration does exactly the same thing to
	5372	the generated parser with @samp{%define api.push-pull both} as it did for
	5373	@samp{%define api.push-pull push}.
	5374
	5375	@node Decl Summary
	5376	@subsection Bison Declaration Summary
	5377	@cindex Bison declaration summary
	5378	@cindex declaration summary
	5379	@cindex summary, Bison declaration
	5380
	5381	Here is a summary of the declarations used to define a grammar:
	5382
	5383	@deffn {Directive} %union
	5384	Declare the collection of data types that semantic values may have
	5385	(@pxref{Union Decl, ,The Union Declaration}).
	5386	@end deffn
	5387
	5388	@deffn {Directive} %token
	5389	Declare a terminal symbol (token type name) with no precedence
	5390	or associativity specified (@pxref{Token Decl, ,Token Type Names}).
	5391	@end deffn
	5392
	5393	@deffn {Directive} %right
	5394	Declare a terminal symbol (token type name) that is right-associative
	5395	(@pxref{Precedence Decl, ,Operator Precedence}).
	5396	@end deffn
	5397
	5398	@deffn {Directive} %left
	5399	Declare a terminal symbol (token type name) that is left-associative
	5400	(@pxref{Precedence Decl, ,Operator Precedence}).
	5401	@end deffn
	5402
	5403	@deffn {Directive} %nonassoc
	5404	Declare a terminal symbol (token type name) that is nonassociative
	5405	(@pxref{Precedence Decl, ,Operator Precedence}).
	5406	Using it in a way that would be associative is a syntax error.
	5407	@end deffn
	5408
	5409	@ifset defaultprec
	5410	@deffn {Directive} %default-prec
	5411	Assign a precedence to rules lacking an explicit @code{%prec} modifier
	5412	(@pxref{Contextual Precedence, ,Context-Dependent Precedence}).
	5413	@end deffn
	5414	@end ifset
	5415
	5416	@deffn {Directive} %type
	5417	Declare the type of semantic values for a nonterminal symbol
	5418	(@pxref{Type Decl, ,Nonterminal Symbols}).
	5419	@end deffn
	5420
	5421	@deffn {Directive} %start
	5422	Specify the grammar's start symbol (@pxref{Start Decl, ,The
	5423	Start-Symbol}).
	5424	@end deffn
	5425
	5426	@deffn {Directive} %expect
	5427	Declare the expected number of shift/reduce conflicts
	5428	(@pxref{Expect Decl, ,Suppressing Conflict Warnings}).
	5429	@end deffn
	5430
	5431
	5432	@sp 1
	5433	@noindent
	5434	In order to change the behavior of @command{bison}, use the following
	5435	directives:
	5436
	5437	@deffn {Directive} %code @{@var{code}@}
	5438	@deffnx {Directive} %code @var{qualifier} @{@var{code}@}
	5439	@findex %code
	5440	Insert @var{code} verbatim into the output parser source at the
	5441	default location or at the location specified by @var{qualifier}.
	5442	@xref{%code Summary}.
	5443	@end deffn
	5444
	5445	@deffn {Directive} %debug
	5446	Instrument the parser for traces. Obsoleted by @samp{%define
	5447	parse.trace}.
	5448	@xref{Tracing, ,Tracing Your Parser}.
	5449	@end deffn
	5450
	5451	@deffn {Directive} %define @var{variable}
	5452	@deffnx {Directive} %define @var{variable} @var{value}
	5453	@deffnx {Directive} %define @var{variable} @{@var{value}@}
	5454	@deffnx {Directive} %define @var{variable} "@var{value}"
	5455	Define a variable to adjust Bison's behavior. @xref{%define Summary}.
	5456	@end deffn
	5457
	5458	@deffn {Directive} %defines
	5459	Write a parser header file containing macro definitions for the token
	5460	type names defined in the grammar as well as a few other declarations.
	5461	If the parser implementation file is named @file{@var{name}.c} then
	5462	the parser header file is named @file{@var{name}.h}.
	5463
	5464	For C parsers, the parser header file declares @code{YYSTYPE} unless
	5465	@code{YYSTYPE} is already defined as a macro or you have used a
	5466	@code{<@var{type}>} tag without using @code{%union}. Therefore, if
	5467	you are using a @code{%union} (@pxref{Multiple Types, ,More Than One
	5468	Value Type}) with components that require other definitions, or if you
	5469	have defined a @code{YYSTYPE} macro or type definition (@pxref{Value
	5470	Type, ,Data Types of Semantic Values}), you need to arrange for these
	5471	definitions to be propagated to all modules, e.g., by putting them in
	5472	a prerequisite header that is included both by your parser and by any
	5473	other module that needs @code{YYSTYPE}.
	5474
	5475	Unless your parser is pure, the parser header file declares
	5476	@code{yylval} as an external variable. @xref{Pure Decl, ,A Pure
	5477	(Reentrant) Parser}.
	5478
	5479	If you have also used locations, the parser header file declares
	5480	@code{YYLTYPE} and @code{yylloc} using a protocol similar to that of the
	5481	@code{YYSTYPE} macro and @code{yylval}. @xref{Tracking Locations}.
	5482
	5483	This parser header file is normally essential if you wish to put the
	5484	definition of @code{yylex} in a separate source file, because
	5485	@code{yylex} typically needs to be able to refer to the
	5486	above-mentioned declarations and to the token type codes. @xref{Token
	5487	Values, ,Semantic Values of Tokens}.
	5488
	5489	@findex %code requires
	5490	@findex %code provides
	5491	If you have declared @code{%code requires} or @code{%code provides}, the output
	5492	header also contains their code.
	5493	@xref{%code Summary}.
	5494
	5495	@cindex Header guard
	5496	The generated header is protected against multiple inclusions with a C
	5497	preprocessor guard: @samp{YY_@var{PREFIX}_@var{FILE}_INCLUDED}, where
	5498	@var{PREFIX} and @var{FILE} are the prefix (@pxref{Multiple Parsers,
	5499	,Multiple Parsers in the Same Program}) and generated file name turned
	5500	uppercase, with each series of non-alphanumeric characters converted to a
	5501	single underscore.
	5502
	5503	For instance, with @samp{%define api.prefix @{calc@}} and @samp{%defines
	5504	"lib/parse.h"}, the header will be guarded as follows.
	5505	@example
	5506	#ifndef YY_CALC_LIB_PARSE_H_INCLUDED
	5507	# define YY_CALC_LIB_PARSE_H_INCLUDED
	5508	...
	5509	#endif /* ! YY_CALC_LIB_PARSE_H_INCLUDED */
	5510	@end example
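
The guard name can be derived mechanically from the rule above.  The
following C function is a minimal sketch of that derivation, not Bison's
own implementation: the name @code{header_guard} is hypothetical, and it
assumes that the pieces are joined with underscores before the whole
string is upper-cased and each run of non-alphanumeric characters is
collapsed to a single underscore.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Write the guard macro name for PREFIX and FILE into BUF (SIZE bytes).
   Return 0 on success, -1 if a buffer is too small.  */
int
header_guard (char *buf, size_t size, const char *prefix, const char *file)
{
  char raw[256];
  int n = snprintf (raw, sizeof raw, "YY_%s_%s_INCLUDED", prefix, file);
  if (n < 0 || (size_t) n >= sizeof raw || size == 0)
    return -1;

  size_t pos = 0;
  int in_run = 0; /* nonzero while inside a run of non-alphanumerics */
  for (const char *p = raw; *p; p++)
    {
      unsigned char c = (unsigned char) *p;
      char out;
      if (isalnum (c))
        {
          out = (char) toupper (c);
          in_run = 0;
        }
      else if (in_run)
        continue; /* the run already produced its underscore */
      else
        {
          out = '_';
          in_run = 1;
        }
      if (pos + 1 >= size)
        return -1;
      buf[pos++] = out;
    }
  buf[pos] = '\0';
  return 0;
}
```

Called with the prefix @samp{calc} and the file name @samp{lib/parse.h},
this sketch yields @samp{YY_CALC_LIB_PARSE_H_INCLUDED}, matching the
guard shown above.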
	5511	@end deffn
	5512
	5513	@deffn {Directive} %defines @var{defines-file}
	5514	Same as above, but save in the file @file{@var{defines-file}}.
	5515	@end deffn
	5516
	5517	@deffn {Directive} %destructor
	5518	Specify how the parser should reclaim the memory associated with
	5519	discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
	5520	@end deffn
	5521
	5522	@deffn {Directive} %file-prefix "@var{prefix}"
	5523	Specify a prefix to use for all Bison output file names. The names
	5524	are chosen as if the grammar file were named @file{@var{prefix}.y}.
	5525	@end deffn
	5526
	5527	@deffn {Directive} %language "@var{language}"
	5528	Specify the programming language for the generated parser. Currently
	5529	supported languages include C, C++, and Java.
	5530	@var{language} is case-insensitive.
	5531
	5532	@end deffn
	5533
	5534	@deffn {Directive} %locations
	5535	Generate the code processing the locations (@pxref{Action Features,
	5536	,Special Features for Use in Actions}). This mode is enabled as soon as
	5537	the grammar uses the special @samp{@@@var{n}} tokens, but even if your
	5538	grammar does not use them, using @samp{%locations} allows for more
	5539	accurate syntax error messages.
	5540	@end deffn
	5541
	5542	@deffn {Directive} %name-prefix "@var{prefix}"
	5543	Rename the external symbols used in the parser so that they start with
	5544	@var{prefix} instead of @samp{yy}. The precise list of symbols renamed
	5545	in C parsers
	5546	is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs},
	5547	@code{yylval}, @code{yychar}, @code{yydebug}, and
	5548	(if locations are used) @code{yylloc}. If you use a push parser,
	5549	@code{yypush_parse}, @code{yypull_parse}, @code{yypstate},
	5550	@code{yypstate_new} and @code{yypstate_delete} will
	5551	also be renamed. For example, if you use @samp{%name-prefix "c_"}, the
	5552	names become @code{c_parse}, @code{c_lex}, and so on.
	5553	For C++ parsers, see the @samp{%define api.namespace} documentation in this
	5554	section.
	5555	@xref{Multiple Parsers, ,Multiple Parsers in the Same Program}.
	5556	@end deffn
	5557
	5558	@ifset defaultprec
	5559	@deffn {Directive} %no-default-prec
	5560	Do not assign a precedence to rules lacking an explicit @code{%prec}
	5561	modifier (@pxref{Contextual Precedence, ,Context-Dependent
	5562	Precedence}).
	5563	@end deffn
	5564	@end ifset
	5565
	5566	@deffn {Directive} %no-lines
	5567	Don't generate any @code{#line} preprocessor commands in the parser
	5568	implementation file. Ordinarily Bison writes these commands in the
	5569	parser implementation file so that the C compiler and debuggers will
	5570	associate errors and object code with your source file (the grammar
	5571	file). This directive causes them to associate errors with the parser
	5572	implementation file, treating it as an independent source file in its
	5573	own right.
	5574	@end deffn
	5575
	5576	@deffn {Directive} %output "@var{file}"
	5577	Generate the parser implementation in @file{@var{file}}.
	5578	@end deffn
	5579
	5580	@deffn {Directive} %pure-parser
	5581	Deprecated version of @samp{%define api.pure} (@pxref{%define
	5582	Summary,,api.pure}), for which Bison is more careful to warn about
	5583	unreasonable usage.
	5584	@end deffn
	5585
	5586	@deffn {Directive} %require "@var{version}"
	5587	Require version @var{version} or higher of Bison. @xref{Require Decl, ,
	5588	Require a Version of Bison}.
	5589	@end deffn
	5590
	5591	@deffn {Directive} %skeleton "@var{file}"
	5592	Specify the skeleton to use.
	5593
	5594	@c You probably don't need this option unless you are developing Bison.
	5595	@c You should use @code{%language} if you want to specify the skeleton for a
	5596	@c different language, because it is clearer and because it will always choose the
	5597	@c correct skeleton for non-deterministic or push parsers.
	5598
	5599	If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton
	5600	file in the Bison installation directory.
	5601	If it does, @var{file} is an absolute file name or a file name relative to the
	5602	directory of the grammar file.
	5603	This is similar to how most shells resolve commands.
	5604	@end deffn
	5605
	5606	@deffn {Directive} %token-table
	5607	Generate an array of token names in the parser implementation file.
	5608	The name of the array is @code{yytname}; @code{yytname[@var{i}]} is
	5609	the name of the token whose internal Bison token code number is
	5610	@var{i}. The first three elements of @code{yytname} correspond to the
	5611	predefined tokens @code{"$end"}, @code{"error"}, and
	5612	@code{"$undefined"}; after these come the symbols defined in the
	5613	grammar file.
	5614
	5615	The name in the table includes all the characters needed to represent
	5616	the token in Bison. For single-character literals and literal
	5617	strings, this includes the surrounding quoting characters and any
	5618	escape sequences. For example, the Bison single-character literal
	5619	@code{'+'} corresponds to a three-character name, represented in C as
	5620	@code{"'+'"}; and the Bison two-character literal string @code{"\\/"}
	5621	corresponds to a five-character name, represented in C as
	5622	@code{"\"\\\\/\""}.
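
Because the quoting characters are part of each name, code that looks up
a literal string token has to account for them.  The following is a
minimal sketch of such a lookup.  The table @code{sample_tname} is a
hypothetical stand-in for a generated @code{yytname}, and
@code{find_string_token} is an illustrative helper, not a function that
Bison provides.  It assumes the token text contains no characters that
would need an escape sequence.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a generated yytname table.  */
static const char *const sample_tname[] =
{
  "$end", "error", "$undefined", "NUM", "'+'", "\"<=\""
};
enum { SAMPLE_NTOKENS = sizeof sample_tname / sizeof sample_tname[0] };

/* Return the index of the literal string token whose text is TEXT,
   or -1 if the table has no such entry.  */
int
find_string_token (const char *const *tname, int ntokens, const char *text)
{
  size_t len = strlen (text);
  for (int i = 0; i < ntokens; i++)
    {
      const char *name = tname[i];
      if (name != NULL
          && name[0] == '"'                       /* opening quote */
          && strncmp (name + 1, text, len) == 0   /* token text */
          && name[len + 1] == '"'                 /* closing quote */
          && name[len + 2] == '\0')
        return i;
    }
  return -1;
}
```

In this sample table the entry for @code{'+'} is three characters long
and is never matched by the helper, since it is a single-character
literal rather than a literal string.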
	5623
	5624	When you specify @code{%token-table}, Bison also generates macro
	5625	definitions for macros @code{YYNTOKENS}, @code{YYNNTS},
	5626	@code{YYNRULES}, and @code{YYNSTATES}:
	5627
	5628	@table @code
	5629	@item YYNTOKENS
	5630	The highest token number, plus one.
	5631	@item YYNNTS
	5632	The number of nonterminal symbols.
	5633	@item YYNRULES
	5634	The number of grammar rules.
	5635	@item YYNSTATES
	5636	The number of parser states (@pxref{Parser States}).
	5637	@end table
	5638	@end deffn
	5639
	5640	@deffn {Directive} %verbose
	5641	Write an extra output file containing verbose descriptions of the
	5642	parser states and what is done for each type of lookahead token in
	5643	that state. @xref{Understanding, , Understanding Your Parser}, for more
	5644	information.
	5645	@end deffn
	5646
	5647	@deffn {Directive} %yacc
	5648	Pretend the option @option{--yacc} was given, i.e., imitate Yacc,
	5649	including its naming conventions. @xref{Bison Options}, for more.
	5650	@end deffn
	5651
	5652
	5653	@node %define Summary
	5654	@subsection %define Summary
	5655
	5656	There are many features of Bison's behavior that can be controlled by
	5657	assigning the feature a single value. For historical reasons, some
	5658	such features are assigned values by dedicated directives, such as
	5659	@code{%start}, which assigns the start symbol. However, newer such
	5660	features are associated with variables, which are assigned by the
	5661	@code{%define} directive:
	5662
	5663	@deffn {Directive} %define @var{variable}
	5664	@deffnx {Directive} %define @var{variable} @var{value}
	5665	@deffnx {Directive} %define @var{variable} @{@var{value}@}
	5666	@deffnx {Directive} %define @var{variable} "@var{value}"
	5667	Define @var{variable} to @var{value}.
	5668
	5669	The type of the values depends on the syntax. Braces denote a value in the
	5670	target language (e.g., a namespace, a type, etc.). Keyword values (no
	5671	delimiters) denote a finite choice (e.g., a variation of a feature). String
	5672	values denote the remaining cases (e.g., a file name).
	5673
	5674	It is an error if a @var{variable} is defined by @code{%define} multiple
	5675	times, but see @ref{Bison Options,,-D @var{name}[=@var{value}]}.
	5676	@end deffn
	5677
	5678	The rest of this section summarizes variables and values that
	5679	@code{%define} accepts.
	5680
	5681	Some @var{variable}s take Boolean values. In this case, Bison will
	5682	complain if the variable definition does not meet one of the following
	5683	four conditions:
	5684
	5685	@enumerate
	5686	@item @code{@var{value}} is @code{true}.
	5687
	5688	@item @code{@var{value}} is omitted (or @code{""} is specified).
	5689	This is equivalent to @code{true}.
	5690
	5691	@item @code{@var{value}} is @code{false}.
	5692
	5693	@item @var{variable} is never defined.
	5694	In this case, Bison selects a default value.
	5695	@end enumerate
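		
		For instance, taking the Boolean variable @code{parse.trace} as an
		illustration, each of the following lines is an acceptable definition on
		its own. They are alternatives: a grammar file may contain only one of
		them, since defining a variable more than once is an error.
		
		@example
		%define parse.trace        /* Value omitted: same as true.  */
		%define parse.trace true
		%define parse.trace false
		@end example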
	5696
	5697	What @var{variable}s are accepted, as well as their meanings and default
	5698	values, depend on the selected target language and/or the parser
	5699	skeleton (@pxref{Decl Summary,,%language}, @pxref{Decl
	5700	Summary,,%skeleton}).
	5701	Unaccepted @var{variable}s produce an error.
	5702	Some of the accepted @var{variable}s are described below.
	5703
	5704	@c ================================================== api.namespace
	5705	@deffn Directive {%define api.namespace} @{@var{namespace}@}
	5706	@itemize
	5707	@item Language(s): C++
	5708
	5709	@item Purpose: Specify the namespace for the parser class.
	5710	For example, if you specify:
	5711
	5712	@example
	5713	%define api.namespace @{foo::bar@}
	5714	@end example
	5715
	5716	Bison uses @code{foo::bar} verbatim in references such as:
	5717
	5718	@example
	5719	foo::bar::parser::semantic_type
	5720	@end example
	5721
	5722	However, to open a namespace, Bison removes any leading @code{::} and then
	5723	splits on any remaining occurrences:
	5724
	5725	@example
	5726	namespace foo @{ namespace bar @{
	5727	class position;
	5728	class location;
	5729	@} @}
	5730	@end example
	5731
	5732	@item Accepted Values:
	5733	Any absolute or relative C++ namespace reference without a trailing
	5734	@code{"::"}. For example, @code{"foo"} or @code{"::foo::bar"}.
	5735
	5736	@item Default Value:
	5737	The value specified by @code{%name-prefix}, which defaults to @code{yy}.
	5738	This usage of @code{%name-prefix} is for backward compatibility and can
	5739	be confusing since @code{%name-prefix} also specifies the textual prefix
	5740	for the lexical analyzer function. Thus, if you specify
	5741	@code{%name-prefix}, it is best to also specify @samp{%define
	5742	api.namespace} so that @code{%name-prefix} @emph{only} affects the
	5743	lexical analyzer function. For example, if you specify:
	5744
	5745	@example
	5746	%define api.namespace @{foo@}
	5747	%name-prefix "bar::"
	5748	@end example
	5749
	5750	The parser namespace is @code{foo} and @code{yylex} is referenced as
	5751	@code{bar::lex}.
	5752	@end itemize
	5753	@end deffn
	5754	@c api.namespace
	5755
	5756	@c ================================================== api.location.type
	5757	@deffn {Directive} {%define api.location.type} @{@var{type}@}
	5758
	5759	@itemize @bullet
	5760	@item Language(s): C++, Java
	5761
	5762	@item Purpose: Define the location type.
	5763	@xref{User Defined Location Type}.
	5764
	5765	@item Accepted Values: String
	5766
	5767	@item Default Value: none
	5768
	5769	@item History:
	5770	Introduced in Bison 2.7 for C, C++ and Java. Introduced under the name
	5771	@code{location_type} for C++ in Bison 2.5 and for Java in Bison 2.4.
	5772	@end itemize
	5773	@end deffn
	5774
	5775	@c ================================================== api.prefix
	5776	@deffn {Directive} {%define api.prefix} @{@var{prefix}@}
	5777
	5778	@itemize @bullet
	5779	@item Language(s): All
	5780
	5781	@item Purpose: Rename exported symbols.
	5782	@xref{Multiple Parsers, ,Multiple Parsers in the Same Program}.
	5783
	5784	@item Accepted Values: String
	5785
	5786	@item Default Value: @code{yy}
	5787
	5788	@item History: introduced in Bison 2.6
	5789	@end itemize
	5790	@end deffn
	5791
	5792	@c ================================================== api.pure
	5793	@deffn Directive {%define api.pure} @var{purity}
	5794
	5795	@itemize @bullet
	5796	@item Language(s): C
	5797
	5798	@item Purpose: Request a pure (reentrant) parser program.
	5799	@xref{Pure Decl, ,A Pure (Reentrant) Parser}.
	5800
	5801	@item Accepted Values: @code{true}, @code{false}, @code{full}
	5802
	5803	The value may be omitted: this is equivalent to specifying @code{true}, as is
	5804	the case for Boolean values.
	5805
	5806	When @code{%define api.pure full} is used, the parser is made reentrant. This
	5807	changes the signature for @code{yylex} (@pxref{Pure Calling}), and also that of
	5808	@code{yyerror} when the tracking of locations has been activated, as shown
	5809	below.
	5810
	5811	The @code{true} value is very similar to the @code{full} value; the only
	5812	difference is in the signature of @code{yyerror} on Yacc parsers without
	5813	@code{%parse-param}, for historical reasons.
	5814
	5815	I.e., if @samp{%locations %define api.pure} is passed then the prototypes for
	5816	@code{yyerror} are:
	5817
	5818	@example
	5819	void yyerror (char const *msg); // Yacc parsers.
	5820	void yyerror (YYLTYPE *locp, char const *msg); // GLR parsers.
	5821	@end example
	5822
	5823	But if @samp{%locations %define api.pure %parse-param @{int *nastiness@}} is
	5824	used, then both parsers have the same signature:
	5825
	5826	@example
	5827	void yyerror (YYLTYPE *llocp, int *nastiness, char const *msg);
	5828	@end example
	5829
	5830	(@pxref{Error Reporting, ,The Error
	5831	Reporting Function @code{yyerror}})
	5832
	5833	@item Default Value: @code{false}
	5834
	5835	@item History:
	5836	the @code{full} value was introduced in Bison 2.7
	5837	@end itemize
	5838	@end deffn
	5839	@c api.pure
	5840
	5841
	5842
	5843	@c ================================================== api.push-pull
	5844	@deffn Directive {%define api.push-pull} @var{kind}
	5845
	5846	@itemize @bullet
	5847	@item Language(s): C (deterministic parsers only)
	5848
	5849	@item Purpose: Request a pull parser, a push parser, or both.
	5850	@xref{Push Decl, ,A Push Parser}.
	5851	(The current push parsing interface is experimental and may evolve.
	5852	More user feedback will help to stabilize it.)
	5853
	5854	@item Accepted Values: @code{pull}, @code{push}, @code{both}
	5855
	5856	@item Default Value: @code{pull}
	5857	@end itemize
	5858	@end deffn
	5859	@c api.push-pull
	5860
	5861
	5862
	5863	@c ================================================== api.token.constructor
	5864	@deffn Directive {%define api.token.constructor}
	5865
	5866	@itemize @bullet
	5867	@item Language(s):
	5868	C++
	5869
	5870	@item Purpose:
	5871	When variant-based semantic values are enabled (@pxref{C++ Variants}),
	5872	request that symbols be handled as a whole (type, value, and possibly
	5873	location) in the scanner. @xref{Complete Symbols}, for details.
	5874
	5875	@item Accepted Values:
	5876	Boolean.
	5877
	5878	@item Default Value:
	5879	@code{false}
	5880	@item History:
	5881	introduced in Bison 3.0
	5882	@end itemize
	5883	@end deffn
	5884	@c api.token.constructor
	5885
	5886
	5887	@c ================================================== api.token.prefix
	5888	@deffn Directive {%define api.token.prefix} @{@var{prefix}@}
	5889
	5890	@itemize
	5891	@item Language(s): all
	5892
	5893	@item Purpose:
	5894	Add a prefix to the token names when generating their definition in the
	5895	target language. For instance
	5896
	5897	@example
	5898	%token FILE for ERROR
	5899	%define api.token.prefix @{TOK_@}
	5900	%%
	5901	start: FILE for ERROR;
	5902	@end example
	5903
	5904	@noindent
	5905	generates the definition of the symbols @code{TOK_FILE}, @code{TOK_for},
	5906	and @code{TOK_ERROR} in the generated source files. In particular, the
	5907	scanner must use these prefixed token names, while the grammar itself
	5908	may still use the short names (as in the sample rule given above). The
	5909	generated informational files (@file{*.output}, @file{*.xml},
	5910	@file{*.dot}) are not modified by this prefix.
	5911
	5912	Bison also prefixes the generated member names of the semantic value union.
	5913	@xref{Type Generation,, Generating the Semantic Value Type}, for more
	5914	details.
	5915
	5916	See @ref{Calc++ Parser} and @ref{Calc++ Scanner}, for a complete example.
	5917
	5918	@item Accepted Values:
	5919	Any string. Should be a valid identifier prefix in the target language,
	5920	in other words, it should typically be an identifier itself (sequence of
	5921	letters, underscores, and ---not at the beginning--- digits).
	5922
	5923	@item Default Value:
	5924	empty
	5925	@item History:
	5926	introduced in Bison 3.0
	5927	@end itemize
	5928	@end deffn
	5929	@c api.token.prefix
	5930
	5931
	5932	@c ================================================== api.value.type
	5933	@deffn Directive {%define api.value.type} @var{type}
	5934	@itemize @bullet
	5935	@item Language(s):
	5936	all
	5937
	5938	@item Purpose:
	5939	The type for semantic values.
	5940
	5941	@item Accepted Values:
	5942	@table @asis
	5943	@item @code{""}
	5944	This grammar has no semantic value at all. This is not properly supported
	5945	yet.
	5946	@item @code{%union} (C, C++)
	5947	The type is defined thanks to the @code{%union} directive. You don't have
	5948	to define @code{api.value.type} in that case, using @code{%union} suffices.
	5949	@xref{Union Decl, ,The Union Declaration}.
	5950	For instance:
	5951	@example
	5952	%define api.value.type "%union"
	5953	%union
	5954	@{
	5955	int ival;
	5956	char *sval;
	5957	@}
	5958	%token <ival> INT "integer"
	5959	%token <sval> STR "string"
	5960	@end example
	5961
	5962	@item @code{union} (C, C++)
	5963	The symbols are defined with type names, from which Bison will generate a
	5964	@code{union}. For instance:
	5965	@example
	5966	%define api.value.type "union"
	5967	%token <int> INT "integer"
	5968	%token <char *> STR "string"
	5969	@end example
	5970	This feature needs user feedback to stabilize. Note that most C++ objects
	5971	cannot be stored in a @code{union}.
	5972
	5973	@item @code{variant} (C++)
	5974	This is similar to @code{union}, but special storage techniques are used to
	5975	allow any kind of C++ object to be used. For instance:
	5976	@example
	5977	%define api.value.type "variant"
	5978	%token <int> INT "integer"
	5979	%token <std::string> STR "string"
	5980	@end example
	5981	This feature needs user feedback to stabilize.
	5982	@xref{C++ Variants}.
	5983
	5984	@item any other identifier
	5985	Use this name as semantic value.
	5986	@example
	5987	%code requires
	5988	@{
	5989	struct my_value
	5990	@{
	5991	enum
	5992	@{
	5993	is_int, is_str
	5994	@} kind;
	5995	union
	5996	@{
	5997	int ival;
	5998	char *sval;
	5999	@} u;
	6000	@};
	6001	@}
	6002	%define api.value.type "struct my_value"
	6003	%token <u.ival> INT "integer"
	6004	%token <u.sval> STR "string"
	6005	@end example
	6006	@end table
	6007
	6008	@item Default Value:
	6009	@itemize @minus
	6010	@item
	6011	@code{%union} if @code{%union} is used, otherwise @dots{}
	6012	@item
	6013	@code{int} if type tags are used (i.e., @samp{%token <@var{type}>@dots{}} or
	6014	@samp{%type <@var{type}>@dots{}} is used), otherwise @dots{}
	6015	@item
	6016	@code{""}
	6017	@end itemize
	6018
	6019	@item History:
	6020	introduced in Bison 3.0. Was introduced for Java only in 2.3b as
	6021	@code{stype}.
	6022	@end itemize
	6023	@end deffn
	6024	@c api.value.type
	6025
	6026
	6027	@c ================================================== location_type
	6028	@deffn Directive {%define location_type}
	6029	Obsoleted by @code{api.location.type} since Bison 2.7.
	6030	@end deffn
	6031
	6032
	6033	@c ================================================== lr.default-reduction
	6034
	6035	@deffn Directive {%define lr.default-reduction} @var{when}
	6036
	6037	@itemize @bullet
	6038	@item Language(s): all
	6039
	6040	@item Purpose: Specify the kind of states that are permitted to
	6041	contain default reductions. @xref{Default Reductions}. (The ability to
	6042	specify where default reductions should be used is experimental. More user
	6043	feedback will help to stabilize it.)
	6044
	6045	@item Accepted Values: @code{most}, @code{consistent}, @code{accepting}
	6046	@item Default Value:
	6047	@itemize
	6048	@item @code{accepting} if @code{lr.type} is @code{canonical-lr}.
	6049	@item @code{most} otherwise.
	6050	@end itemize
	6051	@item History:
	6052	introduced as @code{lr.default-reductions} in 2.5, renamed as
	6053	@code{lr.default-reduction} in 3.0.
	6054	@end itemize
	6055	@end deffn
	6056
	6057	@c ============================================ lr.keep-unreachable-state
	6058
	6059	@deffn Directive {%define lr.keep-unreachable-state}
	6060
	6061	@itemize @bullet
	6062	@item Language(s): all
	6063	@item Purpose: Request that Bison allow unreachable parser states to
	6064	remain in the parser tables. @xref{Unreachable States}.
	6065	@item Accepted Values: Boolean
	6066	@item Default Value: @code{false}
	6067	@item History:
	6068	introduced as @code{lr.keep_unreachable_states} in 2.3b, renamed as
	6069	@code{lr.keep-unreachable-states} in 2.5, and as
	6070	@code{lr.keep-unreachable-state} in 3.0.
	6071	@end itemize
	6072	@end deffn
	6073	@c lr.keep-unreachable-state
	6074
	6075	@c ================================================== lr.type
	6076
	6077	@deffn Directive {%define lr.type} @var{type}
	6078
	6079	@itemize @bullet
	6080	@item Language(s): all
	6081
	6082	@item Purpose: Specify the type of parser tables within the
	6083	LR(1) family. @xref{LR Table Construction}. (This feature is experimental.
	6084	More user feedback will help to stabilize it.)
	6085
	6086	@item Accepted Values: @code{lalr}, @code{ielr}, @code{canonical-lr}
	6087
	6088	@item Default Value: @code{lalr}
	6089	@end itemize
	6090	@end deffn
	6091
	6092	@c ================================================== namespace
	6093	@deffn Directive %define namespace @{@var{namespace}@}
	6094	Obsoleted by @code{api.namespace}.
	6095	@c namespace
	6096	@end deffn
	6097
	6098	@c ================================================== parse.assert
	6099	@deffn Directive {%define parse.assert}
	6100
	6101	@itemize
	6102	@item Language(s): C++
	6103
	6104	@item Purpose: Issue runtime assertions to catch invalid uses.
	6105	In C++, when variants are used (@pxref{C++ Variants}), symbols must be
	6106	constructed and
	6107	destroyed properly. This option checks these constraints.
	6108
	6109	@item Accepted Values: Boolean
	6110
	6111	@item Default Value: @code{false}
	6112	@end itemize
	6113	@end deffn
	6114	@c parse.assert
	6115
	6116
	6117	@c ================================================== parse.error
	6118	@deffn Directive {%define parse.error}
	6119	@itemize
	6120	@item Language(s):
	6121	all
	6122	@item Purpose:
	6123	Control the kind of error messages passed to the error reporting
	6124	function. @xref{Error Reporting, ,The Error Reporting Function
	6125	@code{yyerror}}.
	6126	@item Accepted Values:
	6127	@itemize
	6128	@item @code{simple}
	6129	Error messages passed to @code{yyerror} are simply @w{@code{"syntax
	6130	error"}}.
	6131	@item @code{verbose}
	6132	Error messages report the unexpected token, and possibly the expected ones.
	6133	However, this report can often be incorrect when LAC is not enabled
	6134	(@pxref{LAC}).
	6135	@end itemize
	6136
	6137	@item Default Value:
	6138	@code{simple}
	6139	@end itemize
	6140	@end deffn
	6141	@c parse.error
	6142
	6143
	6144	@c ================================================== parse.lac
	6145	@deffn Directive {%define parse.lac}
	6146
	6147	@itemize
	6148	@item Language(s): C (deterministic parsers only)
	6149
	6150	@item Purpose: Enable LAC (lookahead correction) to improve
	6151	syntax error handling. @xref{LAC}.
	6152	@item Accepted Values: @code{none}, @code{full}
	6153	@item Default Value: @code{none}
	6154	@end itemize
	6155	@end deffn
	6156	@c parse.lac
	6157
	6158	@c ================================================== parse.trace
	6159	@deffn Directive {%define parse.trace}
	6160
	6161	@itemize
	6162	@item Language(s): C, C++, Java
	6163
	6164	@item Purpose: Require parser instrumentation for tracing.
	6165	@xref{Tracing, ,Tracing Your Parser}.
	6166
	6167	In C/C++, define the macro @code{YYDEBUG} (or @code{@var{prefix}DEBUG} with
	6168	@samp{%define api.prefix @var{prefix}}; see @ref{Multiple Parsers,
	6169	,Multiple Parsers in the Same Program}) to 1 in the parser implementation
	6170	file if it is not already defined, so that the debugging facilities are
	6171	compiled.
	6172
	6173	@item Accepted Values: Boolean
	6174
	6175	@item Default Value: @code{false}
	6176	@end itemize
	6177	@end deffn
	6178	@c parse.trace
	6179
	6180	@node %code Summary
	6181	@subsection %code Summary
	6182	@findex %code
	6183	@cindex Prologue
	6184
	6185	The @code{%code} directive inserts code verbatim into the output
	6186	parser source at any of a predefined set of locations. It thus serves
	6187	as a flexible and user-friendly alternative to the traditional Yacc
	6188	prologue, @code{%@{@var{code}%@}}. This section summarizes the
	6189	functionality of @code{%code} for the various target languages
	6190	supported by Bison. For a detailed discussion of how to use
	6191	@code{%code} in place of @code{%@{@var{code}%@}} for C/C++ and why it
	6192	is advantageous to do so, @pxref{Prologue Alternatives}.
	6193
	6194	@deffn {Directive} %code @{@var{code}@}
	6195	This is the unqualified form of the @code{%code} directive. It
	6196	inserts @var{code} verbatim at a language-dependent default location
	6197	in the parser implementation.
	6198
	6199	For C/C++, the default location is the parser implementation file
	6200	after the usual contents of the parser header file. Thus, the
	6201	unqualified form replaces @code{%@{@var{code}%@}} for most purposes.
	6202
	6203	For Java, the default location is inside the parser class.
	6204	@end deffn
	6205
	6206	@deffn {Directive} %code @var{qualifier} @{@var{code}@}
	6207	This is the qualified form of the @code{%code} directive.
	6208	@var{qualifier} identifies the purpose of @var{code} and thus the
	6209	location(s) where Bison should insert it. That is, if you need to
	6210	specify location-sensitive @var{code} that does not belong at the
	6211	default location selected by the unqualified @code{%code} form, use
	6212	this form instead.
	6213	@end deffn
	6214
	6215	For any particular qualifier or for the unqualified form, if there are
	6216	multiple occurrences of the @code{%code} directive, Bison concatenates
	6217	the specified code in the order in which it appears in the grammar
	6218	file.
	6219
	6220	Not all qualifiers are accepted for all target languages. Unaccepted
	6221	qualifiers produce an error. Some of the accepted qualifiers are:
	6222
	6223	@table @code
	6224	@item requires
	6225	@findex %code requires
	6226
	6227	@itemize @bullet
	6228	@item Language(s): C, C++
	6229
	6230	@item Purpose: This is the best place to write dependency code required for
	6231	@code{YYSTYPE} and @code{YYLTYPE}. In other words, it's the best place to
	6232	define types referenced in @code{%union} directives. If you use
	6233	@code{#define} to override Bison's default @code{YYSTYPE} and @code{YYLTYPE}
	6234	definitions, then it is also the best place. However you should rather
	6235	@code{%define} @code{api.value.type} and @code{api.location.type}.
	6236
	6237	@item Location(s): The parser header file and the parser implementation file
	6238	before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE}
	6239	definitions.
	6240	@end itemize
	6241
	6242	@item provides
	6243	@findex %code provides
	6244
	6245	@itemize @bullet
	6246	@item Language(s): C, C++
	6247
	6248	@item Purpose: This is the best place to write additional definitions and
	6249	declarations that should be provided to other modules.
	6250
	6251	@item Location(s): The parser header file and the parser implementation
	6252	file after the Bison-generated @code{YYSTYPE}, @code{YYLTYPE}, and
	6253	token definitions.
	6254	@end itemize
	6255
	6256	@item top
	6257	@findex %code top
	6258
	6259	@itemize @bullet
	6260	@item Language(s): C, C++
	6261
	6262	@item Purpose: The unqualified @code{%code} or @code{%code requires}
	6263	should usually be more appropriate than @code{%code top}. However,
	6264	occasionally it is necessary to insert code much nearer the top of the
	6265	parser implementation file. For example:
	6266
	6267	@example
	6268	%code top @{
	6269	#define _GNU_SOURCE
	6270	#include <stdio.h>
	6271	@}
	6272	@end example
	6273
	6274	@item Location(s): Near the top of the parser implementation file.
	6275	@end itemize
	6276
	6277	@item imports
	6278	@findex %code imports
	6279
	6280	@itemize @bullet
	6281	@item Language(s): Java
	6282
	6283	@item Purpose: This is the best place to write Java import directives.
	6284
	6285	@item Location(s): The parser Java file after any Java package directive and
	6286	before any class definitions.
	6287	@end itemize
	6288	@end table
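		
		As a rough sketch of how the C/C++ qualifiers above might be combined in
		one grammar file, consider the following fragment. The header
		@file{ast.h}, the type @code{struct ast_node}, and the function
		@code{print_ast} are hypothetical names chosen for this illustration only.
		
		@example
		%code requires @{
		  /* Needed by the %union below, so it must also reach the header.  */
		  #include "ast.h"
		@}
		%union @{ struct ast_node *node; @}
		%code provides @{
		  /* Declaration offered to other modules through the header.  */
		  void print_ast (struct ast_node const *node);
		@}
		%code @{
		  /* Only the parser implementation file needs this.  */
		  #include <stdio.h>
		@}
		@end example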
	6289
	6290	Though we say the insertion locations are language-dependent, they are
	6291	technically skeleton-dependent. Writers of non-standard skeletons
	6292	however should choose their locations consistently with the behavior
	6293	of the standard Bison skeletons.
	6294
	6295
	6296	@node Multiple Parsers
	6297	@section Multiple Parsers in the Same Program
	6298
	6299	Most programs that use Bison parse only one language and therefore contain
	6300	only one Bison parser. But what if you want to parse more than one language
	6301	with the same program? Then you need to avoid name conflicts between
	6302	different definitions of functions and variables such as @code{yyparse},
	6303	@code{yylval}. To use different parsers from the same compilation unit, you
	6304	also need to avoid conflicts on types and macros (e.g., @code{YYSTYPE})
	6305	exported in the generated header.
	6306
	6307	The easy way to do this is to define the @code{%define} variable
	6308	@code{api.prefix}. With different @code{api.prefix}s it is guaranteed that
	6309	headers do not conflict when included together, and that compiled objects
	6310	can be linked together too. Specifying @samp{%define api.prefix
	6311	@var{prefix}} (or passing the option @samp{-Dapi.prefix=@var{prefix}}, see
	6312	@ref{Invocation, ,Invoking Bison}) renames the interface functions and
	6313	variables of the Bison parser to start with @var{prefix} instead of
	6314	@samp{yy}, and all the macros to start with @var{PREFIX} (i.e., @var{prefix}
	6315	upper-cased) instead of @samp{YY}.
	6316
	6317	The renamed symbols include @code{yyparse}, @code{yylex}, @code{yyerror},
	6318	@code{yynerrs}, @code{yylval}, @code{yylloc}, @code{yychar} and
	6319	@code{yydebug}. If you use a push parser, @code{yypush_parse},
	6320	@code{yypull_parse}, @code{yypstate}, @code{yypstate_new} and
	6321	@code{yypstate_delete} will also be renamed. The renamed macros include
	6322	@code{YYSTYPE}, @code{YYLTYPE}, and @code{YYDEBUG}, which is treated
	6323	specifically --- more about this below.
	6324
	6325	For example, if you use @samp{%define api.prefix c}, the names become
	6326	@code{cparse}, @code{clex}, @dots{}, @code{CSTYPE}, @code{CLTYPE}, and so
	6327	on.
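		
		As an illustrative sketch, suppose a second grammar uses the prefix
		@code{j}. A single compilation unit could then call both parsers as
		follows. The header names @file{c-parser.h} and @file{j-parser.h} and the
		function @code{parse_both} are assumptions made for this example; each
		parser still needs its own lexical analyzer and error reporting function
		(@code{clex} and @code{cerror}, @code{jlex} and @code{jerror}).
		
		@example
		#include "c-parser.h"   /* Hypothetical header of the `c' parser.  */
		#include "j-parser.h"   /* Hypothetical header of the `j' parser.  */
		
		/* Run both parsers in turn; stop at the first failure.  */
		int
		parse_both (void)
		@{
		  int status = cparse ();
		  if (status == 0)
		    status = jparse ();
		  return status;
		@}
		@end example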
	6328
	6329	The @code{%define} variable @code{api.prefix} works in two different ways.
	6330	In the implementation file, it works by adding macro definitions to the
	6331	beginning of the parser implementation file, defining @code{yyparse} as
	6332	@code{@var{prefix}parse}, and so on:
	6333
	6334	@example
	6335	#define YYSTYPE CSTYPE
	6336	#define yyparse cparse
	6337	#define yylval clval
	6338	...
	6339	YYSTYPE yylval;
	6340	int yyparse (void);
	6341	@end example
	6342
	6343	This effectively substitutes one name for the other in the entire parser
	6344	implementation file, thus the ``original'' names (@code{yylex},
	6345	@code{YYSTYPE}, @dots{}) are also usable in the parser implementation file.
	6346
	6347	However, in the parser header file, the symbols are defined renamed, for
	6348	instance:
	6349
	6350	@example
	6351	extern CSTYPE clval;
	6352	int cparse (void);
	6353	@end example
	6354
	6355	The macro @code{YYDEBUG} is commonly used to enable the tracing support in
	6356	parsers. To comply with this tradition, when @code{api.prefix} is used,
	6357	@code{YYDEBUG} (not renamed) is used as a default value:
	6358
	6359	@example
	6360	/* Debug traces. */
	6361	#ifndef CDEBUG
	6362	# if defined YYDEBUG
	6363	# if YYDEBUG
	6364	# define CDEBUG 1
	6365	# else
	6366	# define CDEBUG 0
	6367	# endif
	6368	# else
	6369	# define CDEBUG 0
	6370	# endif
	6371	#endif
	6372	#if CDEBUG
	6373	extern int cdebug;
	6374	#endif
	6375	@end example
	6376
	6377	@sp 2
	6378
	6379	Prior to Bison 2.6, a feature similar to @code{api.prefix} was provided by
	6380	the obsolete directive @code{%name-prefix} (@pxref{Table of Symbols, ,Bison
	6381	Symbols}) and the option @code{--name-prefix} (@pxref{Bison Options}).
	6382
	6383	@node Interface
	6384	@chapter Parser C-Language Interface
	6385	@cindex C-language interface
	6386	@cindex interface
	6387
	6388	The Bison parser is actually a C function named @code{yyparse}. Here we
	6389	describe the interface conventions of @code{yyparse} and the other
	6390	functions that it needs to use.
	6391
	6392	Keep in mind that the parser uses many C identifiers starting with
	6393	@samp{yy} and @samp{YY} for internal purposes. If you use such an
	6394	identifier (aside from those in this manual) in an action or in the epilogue
	6395	in the grammar file, you are likely to run into trouble.
	6396
	6397	@menu
	6398	* Parser Function:: How to call @code{yyparse} and what it returns.
	6399	* Push Parser Function:: How to call @code{yypush_parse} and what it returns.
	6400	* Pull Parser Function:: How to call @code{yypull_parse} and what it returns.
	6401	* Parser Create Function:: How to call @code{yypstate_new} and what it returns.
	6402	* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns.
	6403	* Lexical:: You must supply a function @code{yylex}
	6404	which reads tokens.
	6405	* Error Reporting:: You must supply a function @code{yyerror}.
	6406	* Action Features:: Special features for use in actions.
	6407	* Internationalization:: How to let the parser speak in the user's
	6408	native language.
	6409	@end menu
	6410
	6411	@node Parser Function
	6412	@section The Parser Function @code{yyparse}
	6413	@findex yyparse
	6414
	6415	You call the function @code{yyparse} to cause parsing to occur. This
	6416	function reads tokens, executes actions, and ultimately returns when it
	6417	encounters end-of-input or an unrecoverable syntax error. You can also
	6418	write an action which directs @code{yyparse} to return immediately
	6419	without reading further.
	6420
	6421
	6422	@deftypefun int yyparse (void)
	6423	The value returned by @code{yyparse} is 0 if parsing was successful (return
	6424	is due to end-of-input).
	6425
	6426	The value is 1 if parsing failed because of invalid input, i.e., input
	6427	that contains a syntax error or that causes @code{YYABORT} to be
	6428	invoked.
	6429
	6430	The value is 2 if parsing failed due to memory exhaustion.
	6431	@end deftypefun
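		
		A caller might distinguish these three outcomes as in the following
		minimal sketch. The messages and the choice of exit statuses are
		assumptions made for illustration, and @code{yylex} and @code{yyerror}
		are assumed to be defined elsewhere.
		
		@example
		#include <stdio.h>
		
		int yyparse (void);
		
		int
		main (void)
		@{
		  switch (yyparse ())
		    @{
		    case 0:                     /* Parsing was successful.  */
		      return 0;
		    case 1:                     /* Syntax error or YYABORT.  */
		      fputs ("invalid input\n", stderr);
		      return 1;
		    default:                    /* 2: memory exhaustion.  */
		      fputs ("memory exhausted\n", stderr);
		      return 2;
		    @}
		@}
		@end example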
	6432
	6433	In an action, you can cause immediate return from @code{yyparse} by using
	6434	these macros:
	6435
	6436	@defmac YYACCEPT
	6437	@findex YYACCEPT
	6438	Return immediately with value 0 (to report success).
	6439	@end defmac
	6440
	6441	@defmac YYABORT
	6442	@findex YYABORT
	6443	Return immediately with value 1 (to report failure).
	6444	@end defmac
	6445
	6446	If you use a reentrant parser, you can optionally pass additional
	6447	parameter information to it in a reentrant way. To do so, use the
	6448	declaration @code{%parse-param}:
	6449
	6450	@deffn {Directive} %parse-param @{@var{argument-declaration}@} @dots{}
	6451	@findex %parse-param
	6452	Declare that one or more
	6453	@var{argument-declaration} are additional @code{yyparse} arguments.
	6454	The @var{argument-declaration} is used when declaring
	6455	functions or prototypes. The last identifier in
	6456	@var{argument-declaration} must be the argument name.
	6457	@end deffn
	6458
	6459	Here's an example. Write this in the parser:
	6460
	6461	@example
	6462	%parse-param @{int *nastiness@} @{int *randomness@}
	6463	@end example
	6464
	6465	@noindent
	6466	Then call the parser like this:
	6467
	6468	@example
	6469	@{
	6470	int nastiness, randomness;
	6471	@dots{} /* @r{Store proper data in @code{nastiness} and @code{randomness}.} */
	6472	value = yyparse (&nastiness, &randomness);
	6473	@dots{}
	6474	@}
	6475	@end example
	6476
	6477	@noindent
	6478	In the grammar actions, use expressions like this to refer to the data:
	6479
	6480	@example
	6481	exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @}
	6482	@end example
	6483
	6484	@noindent
	6485	Using the following:
	6486	@example
	6487	%parse-param @{int *randomness@}
	6488	@end example
	6489
	6490	Results in these signatures:
	6491	@example
	6492	void yyerror (int *randomness, const char *msg);
	6493	int yyparse (int *randomness);
	6494	@end example
	6495
	6496	@noindent
	6497	Or, if both @code{%define api.pure full} (or just @code{%define api.pure})
	6498	and @code{%locations} are used:
	6499
	6500	@example
	6501	void yyerror (YYLTYPE *llocp, int *randomness, const char *msg);
	6502	int yyparse (int *randomness);
	6503	@end example
	6504
	6505	@node Push Parser Function
	6506	@section The Push Parser Function @code{yypush_parse}
	6507	@findex yypush_parse
	6508
	6509	(The current push parsing interface is experimental and may evolve.
	6510	More user feedback will help to stabilize it.)
	6511
	6512	You call the function @code{yypush_parse} to parse a single token. This
	6513	function is available if either the @samp{%define api.push-pull push} or
	6514	@samp{%define api.push-pull both} declaration is used.
	6515	@xref{Push Decl, ,A Push Parser}.
	6516
	6517	@deftypefun int yypush_parse (yypstate *@var{yyps})
	6518	The value returned by @code{yypush_parse} is the same as for @code{yyparse} with
	6519	the following exception: it returns @code{YYPUSH_MORE} if more input is
	6520	required to finish parsing the grammar.
	6521	@end deftypefun
	6522
	6523	@node Pull Parser Function
	6524	@section The Pull Parser Function @code{yypull_parse}
	6525	@findex yypull_parse
	6526
	6527	(The current push parsing interface is experimental and may evolve.
	6528	More user feedback will help to stabilize it.)
	6529
	6530	You call the function @code{yypull_parse} to parse the rest of the input
	6531	stream. This function is available if the @samp{%define api.push-pull both}
	6532	declaration is used.
	6533	@xref{Push Decl, ,A Push Parser}.
	6534
	6535	@deftypefun int yypull_parse (yypstate *@var{yyps})
	6536	The value returned by @code{yypull_parse} is the same as for @code{yyparse}.
	6537	@end deftypefun
	6538
	6539	@node Parser Create Function
	6540	@section The Parser Create Function @code{yypstate_new}
	6541	@findex yypstate_new
	6542
	6543	(The current push parsing interface is experimental and may evolve.
	6544	More user feedback will help to stabilize it.)
	6545
	6546	You call the function @code{yypstate_new} to create a new parser instance.
	6547	This function is available if either the @samp{%define api.push-pull push} or
	6548	@samp{%define api.push-pull both} declaration is used.
	6549	@xref{Push Decl, ,A Push Parser}.
	6550
	6551	@deftypefun {yypstate*} yypstate_new (void)
	6552	The function will return a valid parser instance if there was memory available
	6553	or 0 if no memory was available.
	6554	In impure mode, it will also return 0 if a parser instance is currently
	6555	allocated.
	6556	@end deftypefun
	6557
	6558	@node Parser Delete Function
	6559	@section The Parser Delete Function @code{yypstate_delete}
	6560	@findex yypstate_delete
	6561
	6562	(The current push parsing interface is experimental and may evolve.
	6563	More user feedback will help to stabilize it.)
	6564
	6565	You call the function @code{yypstate_delete} to delete a parser instance.
	6566	This function is available if either the @samp{%define api.push-pull push} or
	6567	@samp{%define api.push-pull both} declaration is used.
	6568	@xref{Push Decl, ,A Push Parser}.
	6569
	6570	@deftypefun void yypstate_delete (yypstate *@var{yyps})
	6571	This function will reclaim the memory associated with a parser instance.
	6572	After this call, you should no longer attempt to use the parser instance.
	6573	@end deftypefun
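
The create, push and delete functions are normally used together. The
following is a minimal sketch of that life cycle, not code generated by
Bison: @code{demo_pstate}, @code{demo_pstate_new}, @code{demo_push_parse},
@code{demo_pstate_delete} and @code{DEMO_PUSH_MORE} are hand-written
stand-ins (assumptions of this sketch) for @code{yypstate},
@code{yypstate_new}, @code{yypush_parse}, @code{yypstate_delete} and
@code{YYPUSH_MORE}, so that the fragment compiles on its own. Passing the
token as an extra argument is also an assumption made here.

```c
#include <stdlib.h>

/* Stand-ins for what a Bison push parser would generate (assumptions).  */
enum { DEMO_PUSH_MORE = 4 };    /* the value is arbitrary in this sketch */
typedef struct { int tokens_seen; } demo_pstate;

static demo_pstate *
demo_pstate_new (void)
{
  return calloc (1, sizeof (demo_pstate));  /* 0 if no memory is available */
}

static void
demo_pstate_delete (demo_pstate *ps)
{
  free (ps);
}

/* Toy "parser": accepts any tokens; token 0 means end of input.  */
static int
demo_push_parse (demo_pstate *ps, int token)
{
  if (token == 0)
    return 0;                   /* success, as yyparse would report it */
  ps->tokens_seen++;
  return DEMO_PUSH_MORE;        /* more input is required */
}

/* Driver loop: create an instance, push one token at a time until the
   parser stops asking for more, then delete the instance.  */
int
parse_tokens (const int *tokens, int *shifted)
{
  int status;
  demo_pstate *ps = demo_pstate_new ();
  if (ps == NULL)
    return 2;                   /* report memory exhaustion */
  do
    {
      status = demo_push_parse (ps, *tokens++);
    }
  while (status == DEMO_PUSH_MORE);
  *shifted = ps->tokens_seen;
  demo_pstate_delete (ps);      /* ps must not be used after this call */
  return status;
}
```

With a real push parser the loop keeps the same shape; only the stand-ins
are replaced by the generated functions.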
	6574
	6575	@node Lexical
	6576	@section The Lexical Analyzer Function @code{yylex}
	6577	@findex yylex
	6578	@cindex lexical analyzer
	6579
	6580	The @dfn{lexical analyzer} function, @code{yylex}, recognizes tokens from
	6581	the input stream and returns them to the parser. Bison does not create
	6582	this function automatically; you must write it so that @code{yyparse} can
	6583	call it. The function is sometimes referred to as a lexical scanner.
	6584
	6585	In simple programs, @code{yylex} is often defined at the end of the
	6586	Bison grammar file. If @code{yylex} is defined in a separate source
	6587	file, you need to arrange for the token-type macro definitions to be
	6588	available there. To do this, use the @samp{-d} option when you run
	6589	Bison, so that it will write these macro definitions into the separate
	6590	parser header file, @file{@var{name}.tab.h}, which you can include in
	6591	the other source files that need it. @xref{Invocation, ,Invoking
	6592	Bison}.
	6593
	6594	@menu
	6595	* Calling Convention:: How @code{yyparse} calls @code{yylex}.
	6596	* Token Values:: How @code{yylex} must return the semantic value
	6597	of the token it has read.
	6598	* Token Locations:: How @code{yylex} must return the text location
	6599	(line number, etc.) of the token, if the
	6600	actions want that.
	6601	* Pure Calling:: How the calling convention differs in a pure parser
	6602	(@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).
	6603	@end menu
	6604
	6605	@node Calling Convention
	6606	@subsection Calling Convention for @code{yylex}
	6607
	6608	The value that @code{yylex} returns must be the positive numeric code
	6609	for the type of token it has just found; a zero or negative value
	6610	signifies end-of-input.
	6611
	6612	When a token is referred to in the grammar rules by a name, that name
	6613	in the parser implementation file becomes a C macro whose definition
	6614	is the proper numeric code for that token type. So @code{yylex} can
	6615	use the name to indicate that type. @xref{Symbols}.
	6616
	6617	When a token is referred to in the grammar rules by a character literal,
	6618	the numeric code for that character is also the code for the token type.
	6619	So @code{yylex} can simply return that character code, possibly converted
	6620	to @code{unsigned char} to avoid sign-extension. The null character
	6621	must not be used this way, because its code is zero and that
	6622	signifies end-of-input.
	6623
	6624	Here is an example showing these things:
	6625
	6626	@example
	6627	int
	6628	yylex (void)
	6629	@{
	6630	  @dots{}
	6631	  if (c == EOF)    /* Detect end-of-input. */
	6632	    return 0;
	6633	  @dots{}
	6634	  if (c == '+' || c == '-')
	6635	    return c;      /* Assume token type for '+' is '+'. */
	6636	  @dots{}
	6637	  return INT;      /* Return the type of the token. */
	6638	  @dots{}
	6639	@}
	6640	@end example
	6641
	6642	@noindent
	6643	This interface has been designed so that the output from the @code{lex}
	6644	utility can be used without change as the definition of @code{yylex}.
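
To make the convention concrete, here is a small self-contained scanner
written as a sketch. It assumes a grammar with a single named token
@code{INT} and single-character operator tokens. The definitions of
@code{INT} and @code{yylval}, which Bison would normally provide, are
written by hand here, and @code{set_input} is a hypothetical helper that
makes the scanner read from a string.

```c
#include <ctype.h>

/* Assumptions: Bison would define INT in the parser header and declare
   yylval; both are written by hand to keep the sketch self-contained.  */
enum { INT = 258 };
int yylval;

/* Hypothetical input cursor; a real scanner would usually read a FILE.  */
static const char *input = "";

void
set_input (const char *text)
{
  input = text;
}

int
yylex (void)
{
  /* Skip blanks between tokens.  */
  while (*input == ' ' || *input == '\t')
    input++;

  if (*input == '\0')           /* Detect end-of-input.  */
    return 0;

  if (isdigit ((unsigned char) *input))
    {
      yylval = 0;
      while (isdigit ((unsigned char) *input))
        yylval = yylval * 10 + (*input++ - '0');
      return INT;               /* Named token: return its numeric code.  */
    }

  /* Any other character is its own token type; the conversion to
     unsigned char avoids sign-extension.  */
  return (unsigned char) *input++;
}
```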
	6645
	6646	If the grammar uses literal string tokens, there are two ways that
	6647	@code{yylex} can determine the token type codes for them:
	6648
	6649	@itemize @bullet
	6650	@item
	6651	If the grammar defines symbolic token names as aliases for the
	6652	literal string tokens, @code{yylex} can use these symbolic names like
	6653	all others. In this case, the use of the literal string tokens in
	6654	the grammar file has no effect on @code{yylex}.
	6655
	6656	@item
	6657	@code{yylex} can find the multicharacter token in the @code{yytname}
	6658	table. The index of the token in the table is the token type's code.
	6659	The name of a multicharacter token is recorded in @code{yytname} with a
	6660	double-quote, the token's characters, and another double-quote. The
	6661	token's characters are escaped as necessary to be suitable as input
	6662	to Bison.
	6663
	6664	Here's code for looking up a multicharacter token in @code{yytname},
	6665	assuming that the characters of the token are stored in
	6666	@code{token_buffer}, and assuming that the token does not contain any
	6667	characters like @samp{"} that require escaping.
	6668
	6669	@example
	6670	for (i = 0; i < YYNTOKENS; i++)
	6671	  @{
	6672	    if (yytname[i] != 0
	6673	        && yytname[i][0] == '"'
	6674	        && ! strncmp (yytname[i] + 1, token_buffer,
	6675	                      strlen (token_buffer))
	6676	        && yytname[i][strlen (token_buffer) + 1] == '"'
	6677	        && yytname[i][strlen (token_buffer) + 2] == 0)
	6678	      break;
	6679	  @}
	6680	@end example
	6681
	6682	The @code{yytname} table is generated only if you use the
	6683	@code{%token-table} declaration. @xref{Decl Summary}.
	6684	@end itemize
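
The @code{yytname} lookup described in the second item can be wrapped in a
function. The sketch below is only an illustration: the table contents and
@code{YYNTOKENS} are hand-written stand-ins for what Bison emits with
@code{%token-table}, and @code{find_string_token} is a hypothetical name.

```c
#include <string.h>

/* Stand-in for the generated table; entries and order are assumptions.  */
static const char *const yytname[] = {
  "$end", "error", "$undefined", "NUM", "\"<=\"", "\">=\"", "'+'", 0
};
enum { YYNTOKENS = 7 };

/* Return the index of the literal string token whose characters are in
   token_buffer, or -1 if no such token exists.  As in the text above,
   the token is assumed not to contain characters that need escaping.  */
int
find_string_token (const char *token_buffer)
{
  size_t len = strlen (token_buffer);
  int i;
  for (i = 0; i < YYNTOKENS; i++)
    if (yytname[i] != 0
        && yytname[i][0] == '"'
        && strncmp (yytname[i] + 1, token_buffer, len) == 0
        && yytname[i][len + 1] == '"'
        && yytname[i][len + 2] == 0)
      return i;
  return -1;
}
```

Computing the length once avoids repeated @code{strlen} calls; the
comparisons are otherwise the same as in the loop shown in the item.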
	6685
	6686	@node Token Values
	6687	@subsection Semantic Values of Tokens
	6688
	6689	@vindex yylval
	6690	In an ordinary (nonreentrant) parser, the semantic value of the token must
	6691	be stored into the global variable @code{yylval}. When you are using
	6692	just one data type for semantic values, @code{yylval} has that type.
	6693	Thus, if the type is @code{int} (the default), you might write this in
	6694	@code{yylex}:
	6695
	6696	@example
	6697	@group
	6698	@dots{}
	6699	yylval = value; /* Put value onto Bison stack. */
	6700	return INT; /* Return the type of the token. */
	6701	@dots{}
	6702	@end group
	6703	@end example
	6704
	6705	When you are using multiple data types, @code{yylval}'s type is a union
	6706	made from the @code{%union} declaration (@pxref{Union Decl, ,The
	6707	Union Declaration}). So when you store a token's value, you
	6708	must use the proper member of the union. If the @code{%union}
	6709	declaration looks like this:
	6710
	6711	@example
	6712	@group
	6713	%union @{
	6714	  int intval;
	6715	  double val;
	6716	  symrec *tptr;
	6717	@}
	6718	@end group
	6719	@end example
	6720
	6721	@noindent
	6722	then the code in @code{yylex} might look like this:
	6723
	6724	@example
	6725	@group
	6726	@dots{}
	6727	yylval.intval = value; /* Put value onto Bison stack. */
	6728	return INT; /* Return the type of the token. */
	6729	@dots{}
	6730	@end group
	6731	@end example
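
As a slightly fuller sketch, the fragment below stores either an integer
or a floating-point value in the matching union member. It is
self-contained only because @code{YYSTYPE}, @code{yylval} and the token
codes are written by hand here; the token name @code{NUM} and the helper
@code{store_number} are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Assumptions: Bison would generate YYSTYPE from %union and define the
   token codes; symrec is left as an opaque type.  */
typedef struct symrec symrec;
typedef union
{
  int intval;
  double val;
  symrec *tptr;
} YYSTYPE;
enum { INT = 258, NUM = 259 };
YYSTYPE yylval;

/* Classify the text of a numeric token, store its value in the proper
   member of the union, and return the token type.  */
int
store_number (const char *text)
{
  if (strchr (text, '.') != NULL)
    {
      yylval.val = strtod (text, NULL);       /* floating-point member */
      return NUM;
    }
  yylval.intval = (int) strtol (text, NULL, 10);  /* integer member */
  return INT;
}
```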
	6732
	6733	@node Token Locations
	6734	@subsection Textual Locations of Tokens
	6735
	6736	@vindex yylloc
	6737	If you are using the @samp{@@@var{n}}-feature (@pxref{Tracking Locations})
	6738	in actions to keep track of the textual locations of tokens and groupings,
	6739	then you must provide this information in @code{yylex}. The function
	6740	@code{yyparse} expects to find the textual location of a token just parsed
	6741	in the global variable @code{yylloc}. So @code{yylex} must store the proper
	6742	data in that variable.
	6743
	6744	By default, the value of @code{yylloc} is a structure and you need only
	6745	initialize the members that are going to be used by the actions. The
	6746	four members are called @code{first_line}, @code{first_column},
	6747	@code{last_line} and @code{last_column}. Note that the use of this
	6748	feature makes the parser noticeably slower.
	6749
	6750	@tindex YYLTYPE
	6751	The data type of @code{yylloc} has the name @code{YYLTYPE}.
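
One way to keep @code{yylloc} up to date is to advance it over the text of
each token as it is matched. The sketch below assumes the default
four-member structure; the type and variable are declared by hand so the
fragment stands alone, @code{update_location} is a hypothetical helper, and
the convention that @code{last_line} and @code{last_column} point just past
the token is a choice made for this example.

```c
/* Assumption: mirrors the four members named above.  Bison defines the
   real YYLTYPE and yylloc when locations are in use.  */
typedef struct
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;
YYLTYPE yylloc = { 1, 1, 1, 1 };

/* Call from yylex with the text of the token just matched.  The token
   starts where the previous one ended.  */
void
update_location (const char *text)
{
  yylloc.first_line = yylloc.last_line;
  yylloc.first_column = yylloc.last_column;
  for (; *text != '\0'; text++)
    if (*text == '\n')
      {
        yylloc.last_line++;
        yylloc.last_column = 1;
      }
    else
      yylloc.last_column++;
}
```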
	6752
	6753	@node Pure Calling
	6754	@subsection Calling Conventions for Pure Parsers
	6755
	6756	When you use the Bison declaration @code{%define api.pure full} to request a
	6757	pure, reentrant parser, the global communication variables @code{yylval}
	6758	and @code{yylloc} cannot be used. (@xref{Pure Decl, ,A Pure (Reentrant)
	6759	Parser}.) In such parsers the two global variables are replaced by
	6760	pointers passed as arguments to @code{yylex}. You must declare them as
	6761	shown here, and pass the information back by storing it through those
	6762	pointers.
	6763
	6764	@example
	6765	int
	6766	yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
	6767	@{
	6768	  @dots{}
	6769	  *lvalp = value;  /* Put value onto Bison stack. */
	6770	  return INT;      /* Return the type of the token. */
	6771	  @dots{}
	6772	@}
	6773	@end example
	6774
	6775	If the grammar file does not use the @samp{@@} constructs to refer to
	6776	textual locations, then the type @code{YYLTYPE} will not be defined. In
	6777	this case, omit the second argument; @code{yylex} will be called with
	6778	only one argument.
	6779
	6780	If you wish to pass additional arguments to @code{yylex}, use
	6781	@code{%lex-param} just like @code{%parse-param} (@pxref{Parser
	6782	Function}). To pass additional arguments to both @code{yylex} and
	6783	@code{yyparse}, use @code{%param}.
	6784
	6785	@deffn {Directive} %lex-param @{@var{argument-declaration}@} @dots{}
	6786	@findex %lex-param
	6787	Specify that @var{argument-declaration} are additional @code{yylex} argument
	6788	declarations. You may pass one or more such declarations, which is
	6789	equivalent to repeating @code{%lex-param}.
	6790	@end deffn
	6791
	6792	@deffn {Directive} %param @{@var{argument-declaration}@} @dots{}
	6793	@findex %param
	6794	Specify that @var{argument-declaration} are additional
	6795	@code{yylex}/@code{yyparse} argument declarations. This is equivalent to
	6796	@samp{%lex-param @{@var{argument-declaration}@} @dots{} %parse-param
	6797	@{@var{argument-declaration}@} @dots{}}. You may pass one or more
	6798	declarations, which is equivalent to repeating @code{%param}.
	6799	@end deffn
	6800
	6801	@noindent
	6802	For instance:
	6803
	6804	@example
	6805	%lex-param @{scanner_mode *mode@}
	6806	%parse-param @{parser_mode *mode@}
	6807	%param @{environment_type *env@}
	6808	@end example
	6809
	6810	@noindent
	6811	results in the following signatures:
	6812
	6813	@example
	6814	int yylex   (scanner_mode *mode, environment_type *env);
	6815	int yyparse (parser_mode *mode, environment_type *env);
	6816	@end example
	6817
	6818	If @samp{%define api.pure full} is added:
	6819
	6820	@example
	6821	int yylex   (YYSTYPE *lvalp, scanner_mode *mode, environment_type *env);
	6822	int yyparse (parser_mode *mode, environment_type *env);
	6823	@end example
	6824
	6825	@noindent
	6826	and finally, if both @samp{%define api.pure full} and @code{%locations} are
	6827	used:
	6828
	6829	@example
	6830	int yylex   (YYSTYPE *lvalp, YYLTYPE *llocp,
	6831	             scanner_mode *mode, environment_type *env);
	6832	int yyparse (parser_mode *mode, environment_type *env);
	6833	@end example
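
A scanner matching the last signature might look like the following
sketch. Everything it needs arrives through its arguments, which is what
makes it reentrant. The definitions of @code{YYSTYPE}, @code{YYLTYPE} and
@code{INT} are hand-written stand-ins for what Bison generates, and the
members given to @code{scanner_mode} and @code{environment_type} are
hypothetical. Line breaks are not handled, to keep the example short.

```c
#include <ctype.h>

/* Stand-ins (assumptions): generated types and a token code.  */
typedef int YYSTYPE;
typedef struct { int first_line, first_column, last_line, last_column; } YYLTYPE;
enum { INT = 258 };

/* Hypothetical user types named in the directives above.  */
typedef struct { int skip_blanks; } scanner_mode;
typedef struct { const char *cursor; } environment_type;

int
yylex (YYSTYPE *lvalp, YYLTYPE *llocp,
       scanner_mode *mode, environment_type *env)
{
  /* Optionally skip blanks, keeping the column count in step.  */
  while (mode->skip_blanks && *env->cursor == ' ')
    {
      env->cursor++;
      llocp->last_column++;
    }
  llocp->first_line = llocp->last_line;
  llocp->first_column = llocp->last_column;

  if (*env->cursor == '\0')
    return 0;                   /* end-of-input */

  if (isdigit ((unsigned char) *env->cursor))
    {
      *lvalp = 0;               /* store the value through the pointer */
      while (isdigit ((unsigned char) *env->cursor))
        {
          *lvalp = *lvalp * 10 + (*env->cursor++ - '0');
          llocp->last_column++;
        }
      return INT;
    }

  llocp->last_column++;
  return (unsigned char) *env->cursor++;
}
```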
	6834
	6835	@node Error Reporting
	6836	@section The Error Reporting Function @code{yyerror}
	6837	@cindex error reporting function
	6838	@findex yyerror
	6839	@cindex parse error
	6840	@cindex syntax error
	6841
	6842	The Bison parser detects a @dfn{syntax error} (or @dfn{parse error})
	6843	whenever it reads a token which cannot satisfy any syntax rule. An
	6844	action in the grammar can also explicitly proclaim an error, using the
	6845	macro @code{YYERROR} (@pxref{Action Features, ,Special Features for Use
	6846	in Actions}).
	6847
	6848	The Bison parser expects to report the error by calling an error
	6849	reporting function named @code{yyerror}, which you must supply. It is
	6850	called by @code{yyparse} whenever a syntax error is found, and it
	6851	receives one argument. For a syntax error, the string is normally
	6852	@w{@code{"syntax error"}}.
	6853
	6854	@findex %define parse.error
	6855	If you invoke @samp{%define parse.error verbose} in the Bison declarations
	6856	section (@pxref{Bison Declarations, ,The Bison Declarations Section}), then
	6857	Bison provides a more verbose and specific error message string instead of
	6858	just plain @w{@code{"syntax error"}}. However, that message sometimes
	6859	contains incorrect information if LAC is not enabled (@pxref{LAC}).
	6860
	6861	The parser can detect one other kind of error: memory exhaustion. This
	6862	can happen when the input contains constructions that are very deeply
	6863	nested. It isn't likely you will encounter this, since the Bison
	6864	parser normally extends its stack automatically up to a very large limit. But
	6865	if memory is exhausted, @code{yyparse} calls @code{yyerror} in the usual
	6866	fashion, except that the argument string is @w{@code{"memory exhausted"}}.
	6867
	6868	In some cases diagnostics like @w{@code{"syntax error"}} are
	6869	translated automatically from English to some other language before
	6870	they are passed to @code{yyerror}. @xref{Internationalization}.
	6871
	6872	The following definition suffices in simple programs:
	6873
	6874	@example
	6875	@group
	6876	void
	6877	yyerror (char const *s)
	6878	@{
	6879	@end group
	6880	@group
	6881	  fprintf (stderr, "%s\n", s);
	6882	@}
	6883	@end group
	6884	@end example
	6885
	6886	After @code{yyerror} returns to @code{yyparse}, the latter will attempt
	6887	error recovery if you have written suitable error recovery grammar rules
	6888	(@pxref{Error Recovery}). If recovery is impossible, @code{yyparse} will
	6889	immediately return 1.
	6890
	6891	In location-tracking pure parsers, @code{yyerror} needs access to the
	6892	current location. With @code{%define api.pure}, this is the case for
	6893	GLR parsers but, for historical reasons, not for the Yacc parser. This
	6894	is why @code{%define api.pure full} should be preferred over
	6895	@code{%define api.pure}.
	6896
	6897	When @code{%locations %define api.pure full} is used, @code{yyerror} has the
	6898	following signature:
	6899
	6900	@example
	6901	void yyerror (YYLTYPE *locp, char const *msg);
	6902	@end example
	6903
	6904	@noindent
	6905	The prototypes are only indications of how the code produced by Bison
	6906	uses @code{yyerror}. Bison-generated code always ignores the returned
	6907	value, so @code{yyerror} can return any type, including @code{void}.
	6908	Also, @code{yyerror} can be a variadic function; that is why the
	6909	message is always passed last.
	6910
	6911	Traditionally @code{yyerror} returns an @code{int} that is always
	6912	ignored, but this is purely for historical reasons, and @code{void} is
	6913	preferable since it more accurately describes the return type for
	6914	@code{yyerror}.
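
The location-aware signature and the variadic form can be combined, as in
the sketch below. It is an illustration under stated assumptions: the
@code{YYLTYPE} definition is a hand-written copy of the default structure,
and @code{last_error} is a hypothetical buffer kept only so that the
formatted text can be inspected. Because the message is used as a
@code{printf}-style format, a literal @samp{%} in a message produced by the
parser would be misinterpreted; a production version would need to guard
against that.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumption: mirrors the default location structure.  */
typedef struct { int first_line, first_column, last_line, last_column; } YYLTYPE;

/* Hypothetical: the most recent message, kept for inspection.  */
char last_error[256];

/* The message comes last, so printf-style arguments can follow it.  */
void
yyerror (YYLTYPE *locp, char const *msg, ...)
{
  va_list args;
  /* Prefix with "line.column: "; two ints always fit in the buffer.  */
  int n = snprintf (last_error, sizeof last_error, "%d.%d: ",
                    locp->first_line, locp->first_column);
  va_start (args, msg);
  vsnprintf (last_error + n, sizeof last_error - (size_t) n, msg, args);
  va_end (args);
  fprintf (stderr, "%s\n", last_error);
}
```

The parser itself only ever passes the location and the message; the extra
arguments are available to calls made from your own actions.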
	6915
	6916	@vindex yynerrs
	6917	The variable @code{yynerrs} contains the number of syntax errors
	6918	reported so far. Normally this variable is global; but if you
	6919	request a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser})
	6920	then it is a local variable which only the actions can access.
	6921
	6922	@node Action Features
	6923	@section Special Features for Use in Actions
	6924	@cindex summary, action features
	6925	@cindex action features summary
	6926
	6927	Here is a table of Bison constructs, variables and macros that
	6928	are useful in actions.
	6929
	6930	@deffn {Variable} $$
	6931	Acts like a variable that contains the semantic value for the
	6932	grouping made by the current rule. @xref{Actions}.
	6933	@end deffn
	6934
	6935	@deffn {Variable} $@var{n}
	6936	Acts like a variable that contains the semantic value for the
	6937	@var{n}th component of the current rule. @xref{Actions}.
	6938	@end deffn
	6939
	6940	@deffn {Variable} $<@var{typealt}>$
	6941	Like @code{$$} but specifies alternative @var{typealt} in the union
	6942	specified by the @code{%union} declaration. @xref{Action Types, ,Data
	6943	Types of Values in Actions}.
	6944	@end deffn
	6945
	6946	@deffn {Variable} $<@var{typealt}>@var{n}
	6947	Like @code{$@var{n}} but specifies alternative @var{typealt} in the
	6948	union specified by the @code{%union} declaration.
	6949	@xref{Action Types, ,Data Types of Values in Actions}.
	6950	@end deffn
	6951
	6952	@deffn {Macro} YYABORT @code{;}
	6953	Return immediately from @code{yyparse}, indicating failure.
	6954	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	6955	@end deffn
	6956
	6957	@deffn {Macro} YYACCEPT @code{;}
	6958	Return immediately from @code{yyparse}, indicating success.
	6959	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	6960	@end deffn
	6961
	6962	@deffn {Macro} YYBACKUP (@var{token}, @var{value})@code{;}
	6963	@findex YYBACKUP
	6964	Unshift a token. This macro is allowed only for rules that reduce
	6965	a single value, and only when there is no lookahead token.
	6966	It is also disallowed in GLR parsers.
	6967	It installs a lookahead token with token type @var{token} and
	6968	semantic value @var{value}; then it discards the value that was
	6969	going to be reduced by this rule.
	6970
	6971	If the macro is used when it is not valid, such as when there is
	6972	a lookahead token already, then it reports a syntax error with
	6973	a message @samp{cannot back up} and performs ordinary error
	6974	recovery.
	6975
	6976	In either case, the rest of the action is not executed.
	6977	@end deffn
	6978
	6979	@deffn {Macro} YYEMPTY
	6980	Value stored in @code{yychar} when there is no lookahead token.
	6981	@end deffn
	6982
	6983	@deffn {Macro} YYEOF
	6984	Value stored in @code{yychar} when the lookahead is the end of the input
	6985	stream.
	6986	@end deffn
	6987
	6988	@deffn {Macro} YYERROR @code{;}
	6989	Cause an immediate syntax error. This statement initiates error
	6990	recovery just as if the parser itself had detected an error; however, it
	6991	does not call @code{yyerror}, and does not print any message. If you
	6992	want to print an error message, call @code{yyerror} explicitly before
	6993	the @samp{YYERROR;} statement. @xref{Error Recovery}.
	6994	@end deffn
	6995
	6996	@deffn {Macro} YYRECOVERING
	6997	@findex YYRECOVERING
	6998	The expression @code{YYRECOVERING ()} yields 1 when the parser
	6999	is recovering from a syntax error, and 0 otherwise.
	7000	@xref{Error Recovery}.
	7001	@end deffn
	7002
	7003	@deffn {Variable} yychar
	7004	Variable containing either the lookahead token, or @code{YYEOF} when the
	7005	lookahead is the end of the input stream, or @code{YYEMPTY} when no lookahead
	7006	has been performed so the next token is not yet known.
	7007	Do not modify @code{yychar} in a deferred semantic action (@pxref{GLR Semantic
	7008	Actions}).
	7009	@xref{Lookahead, ,Lookahead Tokens}.
	7010	@end deffn
	7011
	7012	@deffn {Macro} yyclearin @code{;}
	7013	Discard the current lookahead token. This is useful primarily in
	7014	error rules.
	7015	Do not invoke @code{yyclearin} in a deferred semantic action (@pxref{GLR
	7016	Semantic Actions}).
	7017	@xref{Error Recovery}.
	7018	@end deffn
	7019
	7020	@deffn {Macro} yyerrok @code{;}
	7021	Resume generating error messages immediately for subsequent syntax
	7022	errors. This is useful primarily in error rules.
	7023	@xref{Error Recovery}.
	7024	@end deffn
	7025
	7026	@deffn {Variable} yylloc
	7027	Variable containing the lookahead token location when @code{yychar} is not set
	7028	to @code{YYEMPTY} or @code{YYEOF}.
	7029	Do not modify @code{yylloc} in a deferred semantic action (@pxref{GLR Semantic
	7030	Actions}).
	7031	@xref{Actions and Locations, ,Actions and Locations}.
	7032	@end deffn
	7033
	7034	@deffn {Variable} yylval
	7035	Variable containing the lookahead token semantic value when @code{yychar} is
	7036	not set to @code{YYEMPTY} or @code{YYEOF}.
	7037	Do not modify @code{yylval} in a deferred semantic action (@pxref{GLR Semantic
	7038	Actions}).
	7039	@xref{Actions, ,Actions}.
	7040	@end deffn
	7041
	7042	@deffn {Value} @@$
	7043	Acts like a structure variable containing information on the textual
	7044	location of the grouping made by the current rule. @xref{Tracking
	7045	Locations}.
	7046
	7047	@c Check if those paragraphs are still useful or not.
	7048
	7049	@c @example
	7050	@c struct @{
	7051	@c int first_line, last_line;
	7052	@c int first_column, last_column;
	7053	@c @};
	7054	@c @end example
	7055
	7056	@c Thus, to get the starting line number of the third component, you would
	7057	@c use @samp{@@3.first_line}.
	7058
	7059	@c In order for the members of this structure to contain valid information,
	7060	@c you must make @code{yylex} supply this information about each token.
	7061	@c If you need only certain members, then @code{yylex} need only fill in
	7062	@c those members.
	7063
	7064	@c The use of this feature makes the parser noticeably slower.
	7065	@end deffn
	7066
	7067	@deffn {Value} @@@var{n}
	7068	@findex @@@var{n}
	7069	Acts like a structure variable containing information on the textual
	7070	location of the @var{n}th component of the current rule. @xref{Tracking
	7071	Locations}.
	7072	@end deffn
	7073
	7074	@node Internationalization
	7075	@section Parser Internationalization
	7076	@cindex internationalization
	7077	@cindex i18n
	7078	@cindex NLS
	7079	@cindex gettext
	7080	@cindex bison-po
	7081
	7082	A Bison-generated parser can print diagnostics, including error and
	7083	tracing messages. By default, they appear in English. However, Bison
	7084	also supports outputting diagnostics in the user's native language. To
	7085	make this work, the user should set the usual environment variables.
	7086	@xref{Users, , The User's View, gettext, GNU @code{gettext} utilities}.
	7087	For example, the shell command @samp{export LC_ALL=fr_CA.UTF-8} might
	7088	set the user's locale to French Canadian using the UTF-8
	7089	encoding. The exact set of available locales depends on the user's
	7090	installation.
	7091
	7092	The maintainer of a package that uses a Bison-generated parser enables
	7093	the internationalization of the parser's output through the following
	7094	steps. Here we assume a package that uses GNU Autoconf and
	7095	GNU Automake.
	7096
	7097	@enumerate
	7098	@item
	7099	@cindex bison-i18n.m4
	7100	Into the directory containing the GNU Autoconf macros used
	7101	by the package ---often called @file{m4}--- copy the
	7102	@file{bison-i18n.m4} file installed by Bison under
	7103	@samp{share/aclocal/bison-i18n.m4} in Bison's installation directory.
	7104	For example:
	7105
	7106	@example
	7107	cp /usr/local/share/aclocal/bison-i18n.m4 m4/bison-i18n.m4
	7108	@end example
	7109
	7110	@item
	7111	@findex BISON_I18N
	7112	@vindex BISON_LOCALEDIR
	7113	@vindex YYENABLE_NLS
	7114	In the top-level @file{configure.ac}, after the @code{AM_GNU_GETTEXT}
	7115	invocation, add an invocation of @code{BISON_I18N}. This macro is
	7116	defined in the file @file{bison-i18n.m4} that you copied earlier. It
	7117	causes @samp{configure} to find the value of the
	7118	@code{BISON_LOCALEDIR} variable, and it defines the source-language
	7119	symbol @code{YYENABLE_NLS} to enable translations in the
	7120	Bison-generated parser.
	7121
	7122	@item
	7123	In the @code{main} function of your program, designate the directory
	7124	containing Bison's runtime message catalog, through a call to
	7125	@samp{bindtextdomain} with domain name @samp{bison-runtime}.
	7126	For example:
	7127
	7128	@example
	7129	bindtextdomain ("bison-runtime", BISON_LOCALEDIR);
	7130	@end example
	7131
	7132	Typically this appears after any other call @code{bindtextdomain
	7133	(PACKAGE, LOCALEDIR)} that your package already has. Here we rely on
	7134	@samp{BISON_LOCALEDIR} to be defined as a string through the
	7135	@file{Makefile}.
	7136
	7137	@item
	7138	In the @file{Makefile.am} that controls the compilation of the @code{main}
	7139	function, make @samp{BISON_LOCALEDIR} available as a C preprocessor macro,
	7140	either in @samp{DEFS} or in @samp{AM_CPPFLAGS}. For example:
	7141
	7142	@example
	7143	DEFS = @@DEFS@@ -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'
	7144	@end example
	7145
	7146	or:
	7147
	7148	@example
	7149	AM_CPPFLAGS = -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"'
	7150	@end example
	7151
	7152	@item
	7153	Finally, invoke the command @command{autoreconf} to generate the build
	7154	infrastructure.
	7155	@end enumerate
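
Put together, the calls made at program start-up might look like this
sketch. The fallback definitions of @code{PACKAGE}, @code{LOCALEDIR} and
@code{BISON_LOCALEDIR} are assumptions that only keep the fragment
self-contained; in a real package they come from the build system as
described in the steps above. The function name @code{init_i18n} is
hypothetical, and the availability of @file{libintl.h} is assumed.

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Placeholder values (assumptions); the build system normally supplies
   these definitions.  */
#ifndef PACKAGE
# define PACKAGE "demo"
#endif
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/local/share/locale"
#endif
#ifndef BISON_LOCALEDIR
# define BISON_LOCALEDIR "/usr/local/share/locale"
#endif

/* Call once at the start of main, before parsing.  Returns 0 on success
   and -1 if a message catalog directory could not be recorded.  */
int
init_i18n (void)
{
  setlocale (LC_ALL, "");       /* honor the user's environment variables */
  if (bindtextdomain (PACKAGE, LOCALEDIR) == NULL)
    return -1;
  /* Tell gettext where Bison's runtime message catalog lives.  */
  if (bindtextdomain ("bison-runtime", BISON_LOCALEDIR) == NULL)
    return -1;
  textdomain (PACKAGE);
  return 0;
}
```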
	7156
	7157
	7158	@node Algorithm
	7159	@chapter The Bison Parser Algorithm
	7160	@cindex Bison parser algorithm
	7161	@cindex algorithm of parser
	7162	@cindex shifting
	7163	@cindex reduction
	7164	@cindex parser stack
	7165	@cindex stack, parser
	7166
	7167	As Bison reads tokens, it pushes them onto a stack along with their
	7168	semantic values. The stack is called the @dfn{parser stack}. Pushing a
	7169	token is traditionally called @dfn{shifting}.
	7170
	7171	For example, suppose the infix calculator has read @samp{1 + 5 *}, with a
	7172	@samp{3} to come. The stack will have four elements, one for each token
	7173	that was shifted.
	7174
	7175	But the stack does not always have an element for each token read. When
	7176	the last @var{n} tokens and groupings shifted match the components of a
	7177	grammar rule, they can be combined according to that rule. This is called
	7178	@dfn{reduction}. Those tokens and groupings are replaced on the stack by a
	7179	single grouping whose symbol is the result (left hand side) of that rule.
	7180	Running the rule's action is part of the process of reduction, because this
	7181	is what computes the semantic value of the resulting grouping.
	7182
	7183	For example, if the infix calculator's parser stack contains this:
	7184
	7185	@example
	7186	1 + 5 * 3
	7187	@end example
	7188
	7189	@noindent
	7190	and the next input token is a newline character, then the last three
	7191	elements can be reduced to 15 via the rule:
	7192
	7193	@example
	7194	expr: expr '*' expr;
	7195	@end example
	7196
	7197	@noindent
	7198	Then the stack contains just these three elements:
	7199
	7200	@example
	7201	1 + 15
	7202	@end example
	7203
	7204	@noindent
	7205	At this point, another reduction can be made, resulting in the single value
	7206	16. Then the newline token can be shifted.
	7207
	7208	The parser tries, by shifts and reductions, to reduce the entire input down
	7209	to a single grouping whose symbol is the grammar's start-symbol
	7210	(@pxref{Language and Grammar, ,Languages and Context-Free Grammars}).
	7211
	7212	This kind of parser is known in the literature as a bottom-up parser.
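
The interplay of shifting and reduction in the calculator example can be
imitated with a small hand-written loop. This is only a sketch: it decides
between shift and reduce by comparing operator precedences, whereas a
Bison parser consults generated tables. It handles single-digit operands
with @samp{+} and @samp{*}, assumes well-formed input, and
@code{shift_reduce_eval} is a hypothetical name.

```c
#include <ctype.h>

/* One stack element: an operator character or the value of an expr.  */
typedef struct { int is_op; int value; } item;

static int
prec (int op)
{
  return op == '*' ? 2 : op == '+' ? 1 : 0;   /* 0: end of input */
}

static int
apply (int op, int a, int b)
{
  return op == '*' ? a * b : a + b;
}

/* Return the value of INPUT and store the number of reductions performed
   in *REDUCTIONS.  */
int
shift_reduce_eval (const char *input, int *reductions)
{
  item stack[64] = { { 0, 0 } };
  int top = 0;                  /* number of elements on the stack */
  *reductions = 0;

  for (;;)
    {
      int look = (unsigned char) *input;   /* lookahead, not yet shifted */

      if (top >= 3 && stack[top - 2].is_op
          && prec (look) <= prec (stack[top - 2].value))
        {
          /* Reduce: replace "expr op expr" by a single expr.  */
          int v = apply (stack[top - 2].value,
                         stack[top - 3].value, stack[top - 1].value);
          top -= 3;
          stack[top].is_op = 0;
          stack[top].value = v;
          top++;
          ++*reductions;
        }
      else if (look == '\0')
        break;                  /* nothing left to shift or reduce */
      else
        {
          /* Shift: push the lookahead token and advance.  */
          stack[top].is_op = !isdigit (look);
          stack[top].value = stack[top].is_op ? look : look - '0';
          top++;
          input++;
        }
    }
  return stack[0].value;
}
```

For the input @samp{1+5*3} the loop shifts all five tokens, reduces
@samp{5 * 3} to 15 when it sees the end of input, and then reduces
@samp{1 + 15} to 16, which matches the sequence described above.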
	7213
	7214	@menu
	7215	* Lookahead:: Parser looks one token ahead when deciding what to do.
	7216	* Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
	7217	* Precedence:: Operator precedence works by resolving conflicts.
	7218	* Contextual Precedence:: When an operator's precedence depends on context.
	7219	* Parser States:: The parser is a finite-state-machine with stack.
	7220	* Reduce/Reduce:: When two rules are applicable in the same situation.
	7221	* Mysterious Conflicts:: Conflicts that look unjustified.
	7222	* Tuning LR:: How to tune fundamental aspects of LR-based parsing.
	7223	* Generalized LR Parsing:: Parsing arbitrary context-free grammars.
	7224	* Memory Management:: What happens when memory is exhausted. How to avoid it.
	7225	@end menu
	7226
	7227	@node Lookahead
	7228	@section Lookahead Tokens
	7229	@cindex lookahead token
	7230
	7231	The Bison parser does @emph{not} always reduce immediately as soon as the
	7232	last @var{n} tokens and groupings match a rule. This is because such a
	7233	simple strategy is inadequate to handle most languages. Instead, when a
	7234	reduction is possible, the parser sometimes ``looks ahead'' at the next
	7235	token in order to decide what to do.
	7236
	7237	When a token is read, it is not immediately shifted; first it becomes the
	7238	@dfn{lookahead token}, which is not on the stack. Now the parser can
	7239	perform one or more reductions of tokens and groupings on the stack, while
	7240	the lookahead token remains off to the side. When no more reductions
	7241	should take place, the lookahead token is shifted onto the stack. This
	7242	does not mean that all possible reductions have been done; depending on the
	7243	token type of the lookahead token, some rules may choose to delay their
	7244	application.
	7245
	7246	Here is a simple case where lookahead is needed. These three rules define
	7247	expressions which contain binary addition operators and postfix unary
	7248	factorial operators (@samp{!}), and allow parentheses for grouping.
	7249
	7250	@example
	7251	@group
	7252	expr:
	7253	  term '+' expr
	7254	| term
	7255	;
	7256	@end group
	7257
	7258	@group
	7259	term:
	7260	  '(' expr ')'
	7261	| term '!'
	7262	| "number"
	7263	;
	7264	@end group
	7265	@end example
	7266
	7267	Suppose that the tokens @w{@samp{1 + 2}} have been read and shifted; what
	7268	should be done? If the following token is @samp{)}, then the first three
	7269	tokens must be reduced to form an @code{expr}. This is the only valid
	7270	course, because shifting the @samp{)} would produce a sequence of symbols
	7271	@w{@code{term ')'}}, and no rule allows this.
	7272
	7273	If the following token is @samp{!}, then it must be shifted immediately so
	7274	that @w{@samp{2 !}} can be reduced to make a @code{term}. If instead the
	7275	parser were to reduce before shifting, @w{@samp{1 + 2}} would become an
	7276	@code{expr}. It would then be impossible to shift the @samp{!} because
	7277	doing so would produce on the stack the sequence of symbols @code{expr
	7278	'!'}. No rule allows that sequence.
	7279
	7280	@vindex yychar
	7281	@vindex yylval
	7282	@vindex yylloc
	7283	The lookahead token is stored in the variable @code{yychar}.
	7284	Its semantic value and location, if any, are stored in the variables
	7285	@code{yylval} and @code{yylloc}.
	7286	@xref{Action Features, ,Special Features for Use in Actions}.
	7287
	7288	@node Shift/Reduce
	7289	@section Shift/Reduce Conflicts
	7290	@cindex conflicts
	7291	@cindex shift/reduce conflicts
	7292	@cindex dangling @code{else}
	7293	@cindex @code{else}, dangling
	7294
	7295	Suppose we are parsing a language which has if-then and if-then-else
	7296	statements, with a pair of rules like this:
	7297
	7298	@example
	7299	@group
	7300	if_stmt:
	7301	  "if" expr "then" stmt
	7302	| "if" expr "then" stmt "else" stmt
	7303	;
	7304	@end group
	7305	@end example
	7306
	7307	@noindent
	7308	Here @code{"if"}, @code{"then"} and @code{"else"} are terminal symbols for
	7309	specific keyword tokens.
	7310
	7311	When the @code{"else"} token is read and becomes the lookahead token, the
	7312	contents of the stack (assuming the input is valid) are just right for
	7313	reduction by the first rule. But it is also legitimate to shift the
	7314	@code{"else"}, because that would lead to eventual reduction by the second
	7315	rule.
	7316
	7317	This situation, where either a shift or a reduction would be valid, is
	7318	called a @dfn{shift/reduce conflict}. Bison is designed to resolve
	7319	these conflicts by choosing to shift, unless otherwise directed by
	7320	operator precedence declarations. To see the reason for this, let's
	7321	contrast it with the other alternative.
	7322
	7323	Since the parser prefers to shift the @code{"else"}, the result is to attach
	7324	the else-clause to the innermost if-statement, making these two inputs
	7325	equivalent:
	7326
	7327	@example
	7328	if x then if y then win; else lose;
	7329
	7330	if x then do; if y then win; else lose; end;
	7331	@end example
	7332
	7333	But if the parser chose to reduce when possible rather than shift, the
	7334	result would be to attach the else-clause to the outermost if-statement,
	7335	making these two inputs equivalent:
	7336
	7337	@example
	7338	if x then if y then win; else lose;
	7339
	7340	if x then do; if y then win; end; else lose;
	7341	@end example
	7342
	7343	The conflict exists because the grammar as written is ambiguous: either
	7344	parsing of the simple nested if-statement is legitimate. The established
	7345	convention is that these ambiguities are resolved by attaching the
	7346	else-clause to the innermost if-statement; this is what Bison accomplishes
	7347	by choosing to shift rather than reduce. (It would ideally be cleaner to
	7348	write an unambiguous grammar, but that is very hard to do in this case.)
	7349	This particular ambiguity was first encountered in the specifications of
	7350	Algol 60 and is called the ``dangling @code{else}'' ambiguity.
	7351
	7352	To avoid warnings from Bison about predictable, legitimate shift/reduce
	7353	conflicts, you can use the @code{%expect @var{n}} declaration.
	7354	There will be no warning as long as the number of shift/reduce conflicts
	7355	is exactly @var{n}, and Bison will report an error if there is a
	7356	different number.
	7357	@xref{Expect Decl, ,Suppressing Conflict Warnings}. However, we don't
	7358	recommend the use of @code{%expect} (except @samp{%expect 0}!), as an equal
	7359	number of conflicts does not mean that they are the @emph{same}. When
	7360	possible, you should rather use precedence directives to @emph{fix} the
	7361	conflicts explicitly (@pxref{Non Operators,, Using Precedence For Non
	7362	Operators}).
	7363
	7364	The definition of @code{if_stmt} above is solely to blame for the
	7365	conflict, but the conflict does not actually appear without additional
	7366	rules. Here is a complete Bison grammar file that actually manifests
	7367	the conflict:
	7368
	7369	@example
	7370	%%
	7371	@group
	7372	stmt:
	7373	expr
	7374	| if_stmt
	7375	;
	7376	@end group
	7377
	7378	@group
	7379	if_stmt:
	7380	"if" expr "then" stmt
	7381	| "if" expr "then" stmt "else" stmt
	7382	;
	7383	@end group
	7384
	7385	expr:
	7386	"identifier"
	7387	;
	7388	@end example
	7389
	7390	@node Precedence
	7391	@section Operator Precedence
	7392	@cindex operator precedence
	7393	@cindex precedence of operators
	7394
	7395	Another situation where shift/reduce conflicts appear is in arithmetic
	7396	expressions. Here shifting is not always the preferred resolution; the
	7397	Bison declarations for operator precedence allow you to specify when to
	7398	shift and when to reduce.
	7399
	7400	@menu
	7401	* Why Precedence:: An example showing why precedence is needed.
	7402	* Using Precedence:: How to specify precedence and associativity.
	7403	* Precedence Only:: How to specify precedence only.
	7404	* Precedence Examples:: How these features are used in the previous example.
	7405	* How Precedence:: How they work.
	7406	* Non Operators:: Using precedence for general conflicts.
	7407	@end menu
	7408
	7409	@node Why Precedence
	7410	@subsection When Precedence is Needed
	7411
	7412	Consider the following ambiguous grammar fragment (ambiguous because the
	7413	input @w{@samp{1 - 2 * 3}} can be parsed in two different ways):
	7414
	7415	@example
	7416	@group
	7417	expr:
	7418	expr '-' expr
	7419	| expr '*' expr
	7420	| expr '<' expr
	7421	| '(' expr ')'
	7422	@dots{}
	7423	;
	7424	@end group
	7425	@end example
	7426
	7427	@noindent
	7428	Suppose the parser has seen the tokens @samp{1}, @samp{-} and @samp{2};
	7429	should it reduce them via the rule for the subtraction operator? It
	7430	depends on the next token. Of course, if the next token is @samp{)}, we
	7431	must reduce; shifting is invalid because no single rule can reduce the
	7432	token sequence @w{@samp{- 2 )}} or anything starting with that. But if
	7433	the next token is @samp{*} or @samp{<}, we have a choice: either
	7434	shifting or reduction would allow the parse to complete, but with
	7435	different results.
	7436
	7437	To decide which one Bison should do, we must consider the results. If
	7438	the next operator token @var{op} is shifted, then it must be reduced
	7439	first in order to permit another opportunity to reduce the difference.
	7440	The result is (in effect) @w{@samp{1 - (2 @var{op} 3)}}. On the other
	7441	hand, if the subtraction is reduced before shifting @var{op}, the result
	7442	is @w{@samp{(1 - 2) @var{op} 3}}. Clearly, then, the choice of shift or
	7443	reduce should depend on the relative precedence of the operators
	7444	@samp{-} and @var{op}: @samp{*} should be shifted first, but not
	7445	@samp{<}.
	7446
	7447	@cindex associativity
	7448	What about input such as @w{@samp{1 - 2 - 5}}; should this be
	7449	@w{@samp{(1 - 2) - 5}} or should it be @w{@samp{1 - (2 - 5)}}? For most
	7450	operators we prefer the former, which is called @dfn{left association}.
	7451	The latter alternative, @dfn{right association}, is desirable for
	7452	assignment operators. The choice of left or right association is a
	7453	matter of whether the parser chooses to shift or reduce when the stack
	7454	contains @w{@samp{1 - 2}} and the lookahead token is @samp{-}: shifting
	7455	gives right associativity, while reducing gives left associativity.
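
The effect of these choices on the computed value can be checked with a small operator-precedence evaluator. The following C sketch is hand-written and is not what Bison generates; the names @code{eval_expr}, @code{prec_of} and the @code{right_assoc} flag are hypothetical, and the input is assumed to consist of single-digit operands joined by @samp{-} and @samp{*}.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical precedence levels: '*' binds tighter than '-'.  */
static int prec_of(char op)
{
    return op == '*' ? 2 : 1;
}

static int apply(char op, int lhs, int rhs)
{
    return op == '*' ? lhs * rhs : lhs - rhs;
}

/* Pop one operator and two operands, push the result: a "reduce".  */
static void reduce(int *vals, size_t *nvals, const char *ops, size_t *nops)
{
    assert(*nvals >= 2 && *nops >= 1);
    int rhs = vals[--*nvals];
    int lhs = vals[--*nvals];
    char op = ops[--*nops];
    vals[(*nvals)++] = apply(op, lhs, rhs);
}

/* Evaluate single-digit operands joined by '-' and '*'.
   If RIGHT_ASSOC is nonzero, operators of equal precedence are
   resolved by shifting (right association); otherwise by reducing
   (left association).  */
int eval_expr(const char *s, int right_assoc)
{
    int vals[32];
    char ops[32];
    size_t nvals = 0, nops = 0;

    for (; *s; s++) {
        if (*s == ' ')
            continue;
        if (*s >= '0' && *s <= '9') {
            assert(nvals < 32);
            vals[nvals++] = *s - '0';
            continue;
        }
        assert(*s == '-' || *s == '*');
        /* Reduce while the operator on the stack wins over the lookahead.  */
        while (nops > 0
               && (prec_of(ops[nops - 1]) > prec_of(*s)
                   || (prec_of(ops[nops - 1]) == prec_of(*s) && !right_assoc)))
            reduce(vals, &nvals, ops, &nops);
        assert(nops < 32);
        ops[nops++] = *s;               /* shift the lookahead operator */
    }
    while (nops > 0)                    /* end of input: reduce what is left */
        reduce(vals, &nvals, ops, &nops);
    assert(nvals == 1);
    return vals[0];
}
```

Under these assumptions, @code{eval_expr ("1 - 2 * 3", 0)} groups the multiplication first, while @code{eval_expr ("1 - 2 - 5", 0)} and @code{eval_expr ("1 - 2 - 5", 1)} differ only in whether the second @samp{-} is shifted or the first subtraction is reduced.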
	7456
	7457	@node Using Precedence
	7458	@subsection Specifying Operator Precedence
	7459	@findex %left
	7460	@findex %nonassoc
	7461	@findex %precedence
	7462	@findex %right
	7463
	7464	Bison allows you to specify these choices with the operator precedence
	7465	declarations @code{%left} and @code{%right}. Each such declaration
	7466	contains a list of tokens, which are operators whose precedence and
	7467	associativity are being declared. The @code{%left} declaration makes all
	7468	those operators left-associative and the @code{%right} declaration makes
	7469	them right-associative. A third alternative is @code{%nonassoc}, which
	7470	declares that it is a syntax error to find the same operator twice ``in a
	7471	row''.
	7472	The last alternative, @code{%precedence}, allows you to define only
	7473	precedence and no associativity at all. As a result, any
	7474	associativity-related conflict that remains will be reported as a
	7475	compile-time error. The directive @code{%nonassoc} creates run-time
	7476	errors: using the operator in an associative way is a syntax error. The
	7477	directive @code{%precedence} creates compile-time errors: an operator
	7478	@emph{can} be involved in an associativity-related conflict, contrary to
	7479	what the grammar author expected.
	7480
	7481	The relative precedence of different operators is controlled by the
	7482	order in which they are declared. The first precedence/associativity
	7483	declaration in the file declares the operators whose
	7484	precedence is lowest, the next such declaration declares the operators
	7485	whose precedence is a little higher, and so on.
	7486
	7487	@node Precedence Only
	7488	@subsection Specifying Precedence Only
	7489	@findex %precedence
	7490
	7491	Since POSIX Yacc defines only @code{%left}, @code{%right}, and
	7492	@code{%nonassoc}, which all define both precedence and associativity, little
	7493	attention is paid to the fact that precedence cannot be defined without
	7494	defining associativity. Yet, sometimes, when trying to solve a
	7495	conflict, precedence suffices. In such a case, using @code{%left},
	7496	@code{%right}, or @code{%nonassoc} might silently resolve future
	7497	(associativity-related) conflicts, which would then remain hidden.
	7498
	7499	The dangling @code{else} ambiguity (@pxref{Shift/Reduce, , Shift/Reduce
	7500	Conflicts}) can be solved explicitly. This shift/reduce conflict occurs
	7501	in the following situation, where the period denotes the current parsing
	7502	state:
	7503
	7504	@example
	7505	if @var{e1} then if @var{e2} then @var{s1} . else @var{s2}
	7506	@end example
	7507
	7508	The conflict involves the reduction of the rule @samp{IF expr THEN
	7509	stmt}, whose precedence is by default that of its last token
	7510	(@code{THEN}), and the shifting of the token @code{ELSE}. For the usual
	7511	disambiguation (attach the @code{else} to the closest @code{if}),
	7512	shifting must be preferred, i.e., the precedence of @code{ELSE} must be
	7513	higher than that of @code{THEN}. But neither is expected to be involved
	7514	in an associativity-related conflict, which can be specified as follows.
	7515
	7516	@example
	7517	%precedence THEN
	7518	%precedence ELSE
	7519	@end example
	7520
	7521	The unary-minus is another typical example where associativity is
	7522	usually over-specified, see @ref{Infix Calc, , Infix Notation
	7523	Calculator: @code{calc}}. The @code{%left} directive is traditionally
	7524	used to declare the precedence of @code{NEG}, which is more than needed
	7525	since it also defines its associativity. While this is harmless in the
	7526	traditional example, who knows how @code{NEG} might be used in future
	7527	evolutions of the grammar@dots{}
	7528
	7529	@node Precedence Examples
	7530	@subsection Precedence Examples
	7531
	7532	In our example, we would want the following declarations:
	7533
	7534	@example
	7535	%left '<'
	7536	%left '-'
	7537	%left '*'
	7538	@end example
	7539
	7540	In a more complete example, which supports other operators as well, we
	7541	would declare them in groups of equal precedence. For example, @code{'+'} is
	7542	declared with @code{'-'}:
	7543
	7544	@example
	7545	%left '<' '>' '=' "!=" "<=" ">="
	7546	%left '+' '-'
	7547	%left '*' '/'
	7548	@end example
	7549
	7550	@node How Precedence
	7551	@subsection How Precedence Works
	7552
	7553	The first effect of the precedence declarations is to assign precedence
	7554	levels to the terminal symbols declared. The second effect is to assign
	7555	precedence levels to certain rules: each rule gets its precedence from
	7556	the last terminal symbol mentioned in the components. (You can also
	7557	specify explicitly the precedence of a rule. @xref{Contextual
	7558	Precedence, ,Context-Dependent Precedence}.)
	7559
	7560	Finally, the resolution of conflicts works by comparing the precedence
	7561	of the rule being considered with that of the lookahead token. If the
	7562	token's precedence is higher, the choice is to shift. If the rule's
	7563	precedence is higher, the choice is to reduce. If they have equal
	7564	precedence, the choice is made based on the associativity of that
	7565	precedence level. The verbose output file made by @samp{-v}
	7566	(@pxref{Invocation, ,Invoking Bison}) says how each conflict was
	7567	resolved.
	7568
	7569	Not all rules and not all tokens have precedence. If either the rule or
	7570	the lookahead token has no precedence, then the default is to shift.
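
The comparison described above can be summarized as a small decision function. The following C sketch only restates the rules of this section; it is not Bison's implementation. The enumerations, the name @code{resolve}, and the convention that level 0 means ``no precedence declared'' are assumptions of the sketch. For the @code{%precedence} case it returns a distinct value to signal that the conflict is left unresolved and reported.

```c
#include <assert.h>

enum assoc { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONASSOC, ASSOC_PRECEDENCE };
enum action { ACT_SHIFT, ACT_REDUCE, ACT_ERROR, ACT_CONFLICT };

/* Precedence level 0 means "no precedence declared" (a convention of
   this sketch); larger levels were declared later and bind tighter.
   LEVEL_ASSOC is the associativity of the level, used only when the
   rule and the lookahead token have equal precedence.  */
enum action resolve(int rule_prec, int token_prec, enum assoc level_assoc)
{
    assert(rule_prec >= 0 && token_prec >= 0);
    if (rule_prec == 0 || token_prec == 0)
        return ACT_SHIFT;               /* default when precedence is missing */
    if (token_prec > rule_prec)
        return ACT_SHIFT;
    if (rule_prec > token_prec)
        return ACT_REDUCE;
    switch (level_assoc) {              /* equal precedence */
    case ASSOC_LEFT:     return ACT_REDUCE;
    case ASSOC_RIGHT:    return ACT_SHIFT;
    case ASSOC_NONASSOC: return ACT_ERROR;     /* syntax error at run time */
    default:             return ACT_CONFLICT;  /* %precedence: still reported */
    }
}
```

For instance, if @samp{%left '<'}, @samp{%left '-'} and @samp{%left '*'} are numbered 1, 2 and 3, then the rule @samp{expr '-' expr} has level 2: a lookahead of @samp{*} gives @code{resolve (2, 3, ASSOC_LEFT)}, a shift, while a lookahead of @samp{<} or @samp{-} gives a reduction.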
	7571
	7572	@node Non Operators
	7573	@subsection Using Precedence For Non Operators
	7574
	7575	Using precedence and associativity directives properly can help fix
	7576	shift/reduce conflicts that do not involve arithmetic-like operators. For
	7577	instance, the ``dangling @code{else}'' problem (@pxref{Shift/Reduce, ,
	7578	Shift/Reduce Conflicts}) can be solved elegantly in two different ways.
	7579
	7580	In the present case, the conflict is between the token @code{"else"} willing
	7581	to be shifted, and the rule @samp{if_stmt: "if" expr "then" stmt}, asking
	7582	for reduction. By default, the precedence of a rule is that of its last
	7583	token, here @code{"then"}, so the conflict will be solved appropriately
	7584	by giving @code{"else"} a precedence higher than that of @code{"then"}, for
	7585	instance as follows:
	7586
	7587	@example
	7588	@group
	7589	%precedence "then"
	7590	%precedence "else"
	7591	@end group
	7592	@end example
	7593
	7594	Alternatively, you may give both tokens the same precedence, in which case
	7595	associativity is used to solve the conflict. To preserve the shift action,
	7596	use right associativity:
	7597
	7598	@example
	7599	%right "then" "else"
	7600	@end example
	7601
	7602	Neither solution is perfect, however. Since Bison does not provide, so far,
	7603	``scoped'' precedence, both force you to declare the precedence
	7604	of these keywords with respect to the other operators of your grammar.
	7605	Therefore, instead of being warned about a new conflict you were unaware
	7606	of (e.g., a shift/reduce conflict due to @samp{if test then 1 else 2 + 3}
	7607	being ambiguous: @samp{if test then 1 else (2 + 3)} or @samp{(if test then 1
	7608	else 2) + 3}?), the conflict will already be silently ``fixed''.
	7609
	7610	@node Contextual Precedence
	7611	@section Context-Dependent Precedence
	7612	@cindex context-dependent precedence
	7613	@cindex unary operator precedence
	7614	@cindex precedence, context-dependent
	7615	@cindex precedence, unary operator
	7616	@findex %prec
	7617
	7618	Often the precedence of an operator depends on the context. This sounds
	7619	outlandish at first, but it is really very common. For example, a minus
	7620	sign typically has a very high precedence as a unary operator, and a
	7621	somewhat lower precedence (lower than multiplication) as a binary operator.
	7622
	7623	The Bison precedence declarations
	7624	can only be used once for a given token; so a token has
	7625	only one precedence declared in this way. For context-dependent
	7626	precedence, you need to use an additional mechanism: the @code{%prec}
	7627	modifier for rules.
	7628
	7629	The @code{%prec} modifier declares the precedence of a particular rule by
	7630	specifying a terminal symbol whose precedence should be used for that rule.
	7631	It's not necessary for that symbol to appear otherwise in the rule. The
	7632	modifier's syntax is:
	7633
	7634	@example
	7635	%prec @var{terminal-symbol}
	7636	@end example
	7637
	7638	@noindent
	7639	and it is written after the components of the rule. Its effect is to
	7640	assign the rule the precedence of @var{terminal-symbol}, overriding
	7641	the precedence that would be deduced for it in the ordinary way. The
	7642	altered rule precedence then affects how conflicts involving that rule
	7643	are resolved (@pxref{Precedence, ,Operator Precedence}).
	7644
	7645	Here is how @code{%prec} solves the problem of unary minus. First, declare
	7646	a precedence for a fictitious terminal symbol named @code{UMINUS}. There
	7647	are no tokens of this type, but the symbol serves to stand for its
	7648	precedence:
	7649
	7650	@example
	7651	@dots{}
	7652	%left '+' '-'
	7653	%left '*'
	7654	%left UMINUS
	7655	@end example
	7656
	7657	Now the precedence of @code{UMINUS} can be used in specific rules:
	7658
	7659	@example
	7660	@group
	7661	exp:
	7662	@dots{}
	7663	| exp '-' exp
	7664	@dots{}
	7665	| '-' exp %prec UMINUS
	7666	@end group
	7667	@end example
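
To see what the modifier changes, it helps to look at the shape of the parse rather than at the value, since @samp{-2 * 3} evaluates to the same number either way. The following C sketch is a hand-written operator-precedence parser, not Bison output. The name @code{root_operator}, the use of @code{'u'} for unary minus, and the numeric levels (1 for @samp{+} and @samp{-}, 2 for @samp{*}) are assumptions of the sketch; passing 3 as @code{uminus_level} mimics @samp{%prec UMINUS}, and passing 1 mimics the rule inheriting the precedence of @samp{-}.

```c
#include <assert.h>
#include <stddef.h>

/* 'u' stands for unary minus on the operator stack.  */
static int level(char op, int uminus_level)
{
    switch (op) {
    case 'u': return uminus_level;
    case '*': return 2;
    default:  return 1;                 /* binary '+' and '-' */
    }
}

/* Parse single-digit operands with '+', '-', '*' and prefix '-'.
   All levels are treated as left-associative.  Return the operator of
   the last reduction, i.e. the root of the parse (0 if there was no
   reduction).  If VALUE is not NULL, store the computed value there.  */
char root_operator(const char *s, int uminus_level, int *value)
{
    int vals[32];
    char ops[32];
    size_t nvals = 0, nops = 0;
    int expect_operand = 1;
    char root = 0;

    for (;; s++) {
        if (*s == ' ')
            continue;
        if (*s >= '0' && *s <= '9') {
            assert(expect_operand && nvals < 32);
            vals[nvals++] = *s - '0';
            expect_operand = 0;
            continue;
        }
        if (*s == '-' && expect_operand) {
            assert(nops < 32);
            ops[nops++] = 'u';          /* prefix operator: always shifted */
            continue;
        }
        assert(!expect_operand);
        assert(*s == '\0' || *s == '+' || *s == '-' || *s == '*');
        /* Reduce while the rule on the stack has at least the precedence
           of the lookahead; end of input forces every reduction.  */
        while (nops > 0
               && (*s == '\0'
                   || level(ops[nops - 1], uminus_level)
                      >= level(*s, uminus_level))) {
            root = ops[--nops];
            if (root == 'u') {
                vals[nvals - 1] = -vals[nvals - 1];
            } else {
                int rhs = vals[--nvals];
                int lhs = vals[nvals - 1];
                vals[nvals - 1] = root == '*' ? lhs * rhs
                                : root == '+' ? lhs + rhs : lhs - rhs;
            }
        }
        if (*s == '\0')
            break;
        assert(nops < 32);
        ops[nops++] = *s;               /* shift the binary operator */
        expect_operand = 1;
    }
    if (value != NULL)
        *value = vals[0];
    return root;
}
```

With the high level, the negation is reduced before @samp{*} is shifted, so the multiplication is the root; with the low level, @samp{*} is shifted first and the negation applies to the whole product.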
	7668
	7669	@ifset defaultprec
	7670	If you forget to append @code{%prec UMINUS} to the rule for unary
	7671	minus, Bison silently assumes that minus has its usual precedence.
	7672	This kind of problem can be tricky to debug, since one typically
	7673	discovers the mistake only by testing the code.
	7674
	7675	The @code{%no-default-prec;} declaration makes it easier to discover
	7676	this kind of problem systematically. It causes rules that lack a
	7677	@code{%prec} modifier to have no precedence, even if the last terminal
	7678	symbol mentioned in their components has a declared precedence.
	7679
	7680	If @code{%no-default-prec;} is in effect, you must specify @code{%prec}
	7681	for all rules that participate in precedence conflict resolution.
	7682	Then you will see any shift/reduce conflict until you tell Bison how
	7683	to resolve it, either by changing your grammar or by adding an
	7684	explicit precedence. This will probably add declarations to the
	7685	grammar, but it helps to protect against incorrect rule precedences.
	7686
	7687	The effect of @code{%no-default-prec;} can be reversed by giving
	7688	@code{%default-prec;}, which is the default.
	7689	@end ifset
	7690
	7691	@node Parser States
	7692	@section Parser States
	7693	@cindex finite-state machine
	7694	@cindex parser state
	7695	@cindex state (of parser)
	7696
	7697	The function @code{yyparse} is implemented using a finite-state machine.
	7698	The values pushed on the parser stack are not simply token type codes; they
	7699	represent the entire sequence of terminal and nonterminal symbols at or
	7700	near the top of the stack. The current state collects all the information
	7701	about previous input which is relevant to deciding what to do next.
	7702
	7703	Each time a lookahead token is read, the current parser state together
	7704	with the type of lookahead token are looked up in a table. This table
	7705	entry can say, ``Shift the lookahead token.'' In this case, it also
	7706	specifies the new parser state, which is pushed onto the top of the
	7707	parser stack. Or it can say, ``Reduce using rule number @var{n}.''
	7708	This means that a certain number of tokens or groupings are taken off
	7709	the top of the stack, and replaced by one grouping. In other words,
	7710	that number of states are popped from the stack, and one new state is
	7711	pushed.
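
The table lookup described above can be sketched for a toy grammar. The C fragment below uses hand-built tables for the two rules @samp{s: '(' s ')'} and @samp{s: 'x'}; the tables, the action encoding and the name @code{toy_parse} are assumptions made for this illustration and do not reflect the layout of the tables Bison generates.

```c
#include <assert.h>
#include <stddef.h>

/* Toy grammar (illustrative only):
       rule 1:  s: '(' s ')'
       rule 2:  s: 'x'                                             */

enum { T_LPAREN, T_RPAREN, T_X, T_END, N_TOKENS };
enum { N_STATES = 6, STACK_MAX = 64 };

/* Action encoding used by this sketch: n > 0 means shift and go to
   state n, n < 0 means reduce using rule -n, ACCEPT means accept,
   and 0 means the lookahead token is erroneous.  */
enum { ACCEPT = 100 };

static const int action[N_STATES][N_TOKENS] = {
    /*          '('  ')'  'x'  end    */
    /* 0 */ {    2,   0,   3,   0      },
    /* 1 */ {    0,   0,   0,   ACCEPT },
    /* 2 */ {    2,   0,   3,   0      },
    /* 3 */ {    0,  -2,   0,  -2      },
    /* 4 */ {    0,   5,   0,   0      },
    /* 5 */ {    0,  -1,   0,  -1      },
};

/* State entered after reducing to `s', indexed by the state that is
   on top once the rule's components have been popped.  */
static const int goto_s[N_STATES] = { 1, 0, 4, 0, 0, 0 };

/* Number of components of each rule (index 0 is unused).  */
static const int rule_len[] = { 0, 3, 1 };

static int token_of(char c)
{
    switch (c) {
    case '(':  return T_LPAREN;
    case ')':  return T_RPAREN;
    case 'x':  return T_X;
    case '\0': return T_END;
    default:   return -1;
    }
}

/* Return 1 if INPUT is a sentence of the toy grammar, 0 otherwise.  */
int toy_parse(const char *input)
{
    int stack[STACK_MAX];
    size_t top = 0;

    assert(input != NULL);
    stack[0] = 0;                           /* initial state */
    for (;;) {
        int tok = token_of(*input);         /* lookahead token */
        if (tok < 0)
            return 0;
        int act = action[stack[top]][tok];
        if (act == ACCEPT)
            return 1;
        if (act == 0)
            return 0;                       /* erroneous lookahead */
        if (act > 0) {                      /* shift: push the new state */
            if (top + 1 >= STACK_MAX)
                return 0;
            stack[++top] = act;
            input++;
        } else {                            /* reduce: pop, then push one */
            top -= (size_t) rule_len[-act];
            int next = goto_s[stack[top]];
            stack[++top] = next;
        }
    }
}
```

Note that the stack holds states only: a reduction by rule 1 pops three states and pushes one, exactly as described above, and the symbols themselves never need to be stored for recognition.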
	7712
	7713	There is one other alternative: the table can say that the lookahead token
	7714	is erroneous in the current state. This causes error processing to begin
	7715	(@pxref{Error Recovery}).
	7716
	7717	@node Reduce/Reduce
	7718	@section Reduce/Reduce Conflicts
	7719	@cindex reduce/reduce conflict
	7720	@cindex conflicts, reduce/reduce
	7721
	7722	A reduce/reduce conflict occurs if there are two or more rules that apply
	7723	to the same sequence of input. This usually indicates a serious error
	7724	in the grammar.
	7725
	7726	For example, here is an erroneous attempt to define a sequence
	7727	of zero or more @code{word} groupings.
	7728
	7729	@example
	7730	@group
	7731	sequence:
	7732	%empty @{ printf ("empty sequence\n"); @}
	7733	| maybeword
	7734	| sequence word @{ printf ("added word %s\n", $2); @}
	7735	;
	7736	@end group
	7737
	7738	@group
	7739	maybeword:
	7740	%empty @{ printf ("empty maybeword\n"); @}
	7741	| word @{ printf ("single word %s\n", $1); @}
	7742	;
	7743	@end group
	7744	@end example
	7745
	7746	@noindent
	7747	The error is an ambiguity: there is more than one way to parse a single
	7748	@code{word} into a @code{sequence}. It could be reduced to a
	7749	@code{maybeword} and then into a @code{sequence} via the second rule.
	7750	Alternatively, nothing-at-all could be reduced into a @code{sequence}
	7751	via the first rule, and this could be combined with the @code{word}
	7752	using the third rule for @code{sequence}.
	7753
	7754	There is also more than one way to reduce nothing-at-all into a
	7755	@code{sequence}. This can be done directly via the first rule,
	7756	or indirectly via @code{maybeword} and then the second rule.
	7757
	7758	You might think that this is a distinction without a difference, because it
	7759	does not change whether any particular input is valid or not. But it does
	7760	affect which actions are run. One parsing order runs the second rule's
	7761	action; the other runs the first rule's action and the third rule's action.
	7762	In this example, the output of the program changes.
	7763
	7764	Bison resolves a reduce/reduce conflict by choosing to use the rule that
	7765	appears first in the grammar, but it is very risky to rely on this. Every
	7766	reduce/reduce conflict must be studied and usually eliminated. Here is the
	7767	proper way to define @code{sequence}:
	7768
	7769	@example
	7770	@group
	7771	sequence:
	7772	%empty @{ printf ("empty sequence\n"); @}
	7773	| sequence word @{ printf ("added word %s\n", $2); @}
	7774	;
	7775	@end group
	7776	@end example
	7777
	7778	Here is another common error that yields a reduce/reduce conflict:
	7779
	7780	@example
	7781	@group
	7782	sequence:
	7783	%empty
	7784	| sequence words
	7785	| sequence redirects
	7786	;
	7787	@end group
	7788
	7789	@group
	7790	words:
	7791	%empty
	7792	| words word
	7793	;
	7794	@end group
	7795
	7796	@group
	7797	redirects:
	7798	%empty
	7799	| redirects redirect
	7800	;
	7801	@end group
	7802	@end example
	7803
	7804	@noindent
	7805	The intention here is to define a sequence which can contain either
	7806	@code{word} or @code{redirect} groupings. The individual definitions of
	7807	@code{sequence}, @code{words} and @code{redirects} are error-free, but the
	7808	three together make a subtle ambiguity: even an empty input can be parsed
	7809	in infinitely many ways!
	7810
	7811	Consider: nothing-at-all could be a @code{words}. Or it could be two
	7812	@code{words} in a row, or three, or any number. It could equally well be a
	7813	@code{redirects}, or two, or any number. Or it could be a @code{words}
	7814	followed by three @code{redirects} and another @code{words}. And so on.
	7815
	7816	Here are two ways to correct these rules. First, to make it a single level
	7817	of sequence:
	7818
	7819	@example
	7820	sequence:
	7821	%empty
	7822	| sequence word
	7823	| sequence redirect
	7824	;
	7825	@end example
	7826
	7827	Second, to prevent either a @code{words} or a @code{redirects}
	7828	from being empty:
	7829
	7830	@example
	7831	@group
	7832	sequence:
	7833	%empty
	7834	| sequence words
	7835	| sequence redirects
	7836	;
	7837	@end group
	7838
	7839	@group
	7840	words:
	7841	word
	7842	| words word
	7843	;
	7844	@end group
	7845
	7846	@group
	7847	redirects:
	7848	redirect
	7849	| redirects redirect
	7850	;
	7851	@end group
	7852	@end example
	7853
	7854	Yet this proposal introduces another kind of ambiguity! The input
	7855	@samp{word word} can be parsed as a single @code{words} composed of two
	7856	@samp{word}s, or as two one-@code{word} @code{words} (and likewise for
	7857	@code{redirect}/@code{redirects}). However this ambiguity is now a
	7858	shift/reduce conflict, and therefore it can now be addressed with precedence
	7859	directives.
	7860
	7861	To simplify the matter, we will proceed with @code{word} and @code{redirect}
	7862	being tokens: @code{"word"} and @code{"redirect"}.
	7863
	7864	To prefer the longest @code{words}, the conflict between the token
	7865	@code{"word"} and the rule @samp{sequence: sequence words} must be resolved
	7866	as a shift. To this end, we use the same techniques as described above, see
	7867	@ref{Non Operators,, Using Precedence For Non Operators}. One solution
	7868	relies on precedences: use @code{%prec} to give a lower precedence to the
	7869	rule:
	7870
	7871	@example
	7872	%precedence "sequence"
	7873	%precedence "word"
	7874	%%
	7875	@group
	7876	sequence:
	7877	%empty
	7878	| sequence words %prec "sequence"
	7879	| sequence redirects %prec "sequence"
	7880	;
	7881	@end group
	7882
	7883	@group
	7884	words:
	7885	"word"
	7886	| words "word"
	7887	;
	7888	@end group
	7889	@end example
	7890
	7891	Another solution relies on associativity: provide both the token and the
	7892	rule with the same precedence, but make them right-associative:
	7893
	7894	@example
	7895	%right "word" "redirect"
	7896	%%
	7897	@group
	7898	sequence:
	7899	%empty
	7900	| sequence words %prec "word"
	7901	| sequence redirects %prec "redirect"
	7902	;
	7903	@end group
	7904	@end example
	7905
	7906	@node Mysterious Conflicts
	7907	@section Mysterious Conflicts
	7908	@cindex Mysterious Conflicts
	7909
	7910	Sometimes reduce/reduce conflicts can occur that don't look warranted.
	7911	Here is an example:
	7912
	7913	@example
	7914	@group
	7915	%%
	7916	def: param_spec return_spec ',';
	7917	param_spec:
	7918	type
	7919	| name_list ':' type
	7920	;
	7921	@end group
	7922
	7923	@group
	7924	return_spec:
	7925	type
	7926	| name ':' type
	7927	;
	7928	@end group
	7929
	7930	type: "id";
	7931
	7932	@group
	7933	name: "id";
	7934	name_list:
	7935	name
	7936	| name ',' name_list
	7937	;
	7938	@end group
	7939	@end example
	7940
	7941	It would seem that this grammar can be parsed with only a single token of
	7942	lookahead: when a @code{param_spec} is being read, an @code{"id"} is a
	7943	@code{name} if a comma or colon follows, or a @code{type} if another
	7944	@code{"id"} follows. In other words, this grammar is LR(1).
	7945
	7946	@cindex LR
	7947	@cindex LALR
	7948	However, for historical reasons, Bison cannot by default handle all
	7949	LR(1) grammars.
	7950	In this grammar, two contexts, that after an @code{"id"} at the beginning
	7951	of a @code{param_spec} and likewise at the beginning of a
	7952	@code{return_spec}, are similar enough that Bison assumes they are the
	7953	same.
	7954	They appear similar because the same set of rules would be
	7955	active---the rule for reducing to a @code{name} and that for reducing to
	7956	a @code{type}. Bison is unable to determine at that stage of processing
	7957	that the rules would require different lookahead tokens in the two
	7958	contexts, so it makes a single parser state for them both. Combining
	7959	the two contexts causes a conflict later. In parser terminology, this
	7960	occurrence means that the grammar is not LALR(1).
	7961
	7962	@cindex IELR
	7963	@cindex canonical LR
	7964	For many practical grammars (specifically those that fall into the non-LR(1)
	7965	class), the limitations of LALR(1) result in difficulties beyond just
	7966	mysterious reduce/reduce conflicts. The best way to fix all these problems
	7967	is to select a different parser table construction algorithm. Either
	7968	IELR(1) or canonical LR(1) would suffice, but the former is more efficient
	7969	and easier to debug during development. @xref{LR Table Construction}, for
	7970	details. (Bison's IELR(1) and canonical LR(1) implementations are
	7971	experimental. More user feedback will help to stabilize them.)
	7972
	7973	If you instead wish to work around LALR(1)'s limitations, you
	7974	can often fix a mysterious conflict by identifying the two parser states
	7975	that are being confused, and adding something to make them look
	7976	distinct. In the above example, adding one rule to
	7977	@code{return_spec} as follows makes the problem go away:
	7978
	7979	@example
	7980	@group
	7981	@dots{}
	7982	return_spec:
	7983	type
	7984	| name ':' type
	7985	| "id" "bogus" /* This rule is never used. */
	7986	;
	7987	@end group
	7988	@end example
	7989
	7990	This corrects the problem because it introduces the possibility of an
	7991	additional active rule in the context after the @code{"id"} at the beginning of
	7992	@code{return_spec}. This rule is not active in the corresponding context
	7993	in a @code{param_spec}, so the two contexts receive distinct parser states.
	7994	As long as the token @code{"bogus"} is never generated by @code{yylex},
	7995	the added rule cannot alter the way actual input is parsed.
	7996
	7997	In this particular example, there is another way to solve the problem:
	7998	rewrite the rule for @code{return_spec} to use @code{"id"} directly
	7999	instead of via @code{name}. This also causes the two confusing
	8000	contexts to have different sets of active rules, because the one for
	8001	@code{return_spec} activates the altered rule for @code{return_spec}
	8002	rather than the one for @code{name}.
	8003
	8004	@example
	8005	@group
	8006	param_spec:
	8007	type
	8008	| name_list ':' type
	8009	;
	8010	@end group
	8011
	8012	@group
	8013	return_spec:
	8014	type
	8015	| "id" ':' type
	8016	;
	8017	@end group
	8018	@end example
	8019
	8020	For a more detailed exposition of LALR(1) parsers and parser
	8021	generators, @pxref{Bibliography,,DeRemer 1982}.
	8022
	8023	@node Tuning LR
	8024	@section Tuning LR
	8025
	8026	The default behavior of Bison's LR-based parsers is chosen mostly for
	8027	historical reasons, but that behavior is often not robust. For example, in
	8028	the previous section, we discussed the mysterious conflicts that can be
	8029	produced by LALR(1), Bison's default parser table construction algorithm.
	8030	Another example is Bison's @code{%define parse.error verbose} directive,
	8031	which instructs the generated parser to produce verbose syntax error
	8032	messages, which can sometimes contain incorrect information.
	8033
	8034	In this section, we explore several modern features of Bison that allow you
	8035	to tune fundamental aspects of the generated LR-based parsers. Some of
	8036	these features easily eliminate shortcomings like those mentioned above.
	8037	Others can be helpful purely for understanding your parser.
	8038
	8039	Most of the features discussed in this section are still experimental. More
	8040	user feedback will help to stabilize them.
	8041
	8042	@menu
	8043	* LR Table Construction:: Choose a different construction algorithm.
	8044	* Default Reductions:: Disable default reductions.
	8045	* LAC:: Correct lookahead sets in the parser states.
	8046	* Unreachable States:: Keep unreachable parser states for debugging.
	8047	@end menu
	8048
	8049	@node LR Table Construction
	8050	@subsection LR Table Construction
	8051	@cindex Mysterious Conflict
	8052	@cindex LALR
	8053	@cindex IELR
	8054	@cindex canonical LR
	8055	@findex %define lr.type
	8056
	8057	For historical reasons, Bison constructs LALR(1) parser tables by default.
	8058	However, LALR does not possess the full language-recognition power of LR.
	8059	As a result, the behavior of parsers employing LALR parser tables is often
	8060	mysterious. We presented a simple example of this effect in @ref{Mysterious
	8061	Conflicts}.
	8062
	8063	As we also demonstrated in that example, the traditional approach to
	8064	eliminating such mysterious behavior is to restructure the grammar.
	8065	Unfortunately, doing so correctly is often difficult. Moreover, merely
	8066	discovering that LALR causes mysterious behavior in your parser can be
	8067	difficult as well.
	8068
	8069	Fortunately, Bison provides an easy way to eliminate the possibility of such
	8070	mysterious behavior altogether. You simply need to activate a more powerful
	8071	parser table construction algorithm by using the @code{%define lr.type}
	8072	directive.
	8073
	8074	@deffn {Directive} {%define lr.type} @var{type}
	8075	Specify the type of parser tables within the LR(1) family. The accepted
	8076	values for @var{type} are:
	8077
	8078	@itemize
	8079	@item @code{lalr} (default)
	8080	@item @code{ielr}
	8081	@item @code{canonical-lr}
	8082	@end itemize
	8083
	8084	(This feature is experimental. More user feedback will help to stabilize
	8085	it.)
	8086	@end deffn
	8087
	8088	For example, to activate IELR, you might add the following directive to your
	8089	grammar file:
	8090
	8091	@example
	8092	%define lr.type ielr
	8093	@end example
	8094
	8095	@noindent For the example in @ref{Mysterious Conflicts}, the mysterious
	8096	conflict is then eliminated, so there is no need to invest time in
	8097	comprehending the conflict or restructuring the grammar to fix it. If,
	8098	during future development, the grammar evolves such that all mysterious
	8099	behavior would have disappeared using just LALR, you need not fear that
	8100	continuing to use IELR will result in unnecessarily large parser tables.
	8101	That is, IELR generates LALR tables when LALR (using a deterministic parsing
	8102	algorithm) is sufficient to support the full language-recognition power of
	8103	LR. Thus, by enabling IELR at the start of grammar development, you can
	8104	safely and completely eliminate the need to consider LALR's shortcomings.
	8105
	8106	While IELR is almost always preferable, there are circumstances where LALR
	8107	or the canonical LR parser tables described by Knuth
	8108	(@pxref{Bibliography,,Knuth 1965}) can be useful. Here we summarize the
	8109	relative advantages of each parser table construction algorithm within
	8110	Bison:
	8111
	8112	@itemize
	8113	@item LALR
	8114
	8115	There are at least two scenarios where LALR can be worthwhile:
	8116
	8117	@itemize
	8118	@item GLR without static conflict resolution.
	8119
	8120	@cindex GLR with LALR
	8121	When employing GLR parsers (@pxref{GLR Parsers}), if you do not resolve any
	8122	conflicts statically (for example, with @code{%left} or @code{%precedence}),
	8123	then
	8124	the parser explores all potential parses of any given input. In this case,
	8125	the choice of parser table construction algorithm is guaranteed not to alter
	8126	the language accepted by the parser. LALR parser tables are the smallest
	8127	parser tables Bison can currently construct, so they may then be preferable.
	8128	Nevertheless, once you begin to resolve conflicts statically, GLR behaves
	8129	more like a deterministic parser in the syntactic contexts where those
	8130	conflicts appear, and so either IELR or canonical LR can then be helpful to
	8131	avoid LALR's mysterious behavior.
	8132
	8133	@item Malformed grammars.
	8134
	8135	Occasionally during development, an especially malformed grammar with a
	8136	major recurring flaw may severely impede the IELR or canonical LR parser
	8137	table construction algorithm. LALR can be a quick way to construct parser
	8138	tables in order to investigate such problems while ignoring the more subtle
	8139	differences from IELR and canonical LR.
	8140	@end itemize
	8141
	8142	@item IELR
	8143
	8144	IELR (Inadequacy Elimination LR) is a minimal LR algorithm. That is, given
	8145	any grammar (LR or non-LR), parsers using IELR or canonical LR parser tables
	8146	always accept exactly the same set of sentences. However, like LALR, IELR
	8147	merges parser states during parser table construction so that the number of
	8148	parser states is often an order of magnitude less than for canonical LR.
	8149	More importantly, because canonical LR's extra parser states may contain
	8150	duplicate conflicts in the case of non-LR grammars, the number of conflicts
	8151	for IELR is often an order of magnitude less as well. This effect can
	8152	significantly reduce the complexity of developing a grammar.
	8153
	8154	@item Canonical LR
	8155
	8156	@cindex delayed syntax error detection
	8157	@cindex LAC
	8158	@findex %nonassoc
	8159	While inefficient, canonical LR parser tables can be an interesting means to
	8160	explore a grammar because they possess a property that IELR and LALR tables
	8161	do not. That is, if @code{%nonassoc} is not used and default reductions are
	8162	left disabled (@pxref{Default Reductions}), then, for every left context of
	8163	every canonical LR state, the set of tokens accepted by that state is
	8164	guaranteed to be the exact set of tokens that is syntactically acceptable in
	8165	that left context. It might then seem that an advantage of canonical LR
	8166	parsers in production is that, under the above constraints, they are
	8167	guaranteed to detect a syntax error as soon as possible without performing
	8168	any unnecessary reductions. However, IELR parsers that use LAC are also
	8169	able to achieve this behavior without sacrificing @code{%nonassoc} or
	8170	default reductions. For details and a few caveats of LAC, @pxref{LAC}.
	8171	@end itemize
	8172
	8173	For a more detailed exposition of the mysterious behavior in LALR parsers
	8174	and the benefits of IELR, @pxref{Bibliography,,Denny 2008 March}, and
	8175	@ref{Bibliography,,Denny 2010 November}.
	8176
	8177	@node Default Reductions
	8178	@subsection Default Reductions
	8179	@cindex default reductions
	8180	@findex %define lr.default-reduction
	8181	@findex %nonassoc
	8182
	8183	After parser table construction, Bison identifies the reduction with the
	8184	largest lookahead set in each parser state. To reduce the size of the
	8185	parser state, traditional Bison behavior is to remove that lookahead set and
	8186	to assign that reduction to be the default parser action. Such a reduction
	8187	is known as a @dfn{default reduction}.
	8188
	8189	Default reductions affect more than the size of the parser tables. They
	8190	also affect the behavior of the parser:
	8191
	8192	@itemize
	8193	@item Delayed @code{yylex} invocations.
	8194
	8195	@cindex delayed yylex invocations
	8196	@cindex consistent states
	8197	@cindex defaulted states
	8198	A @dfn{consistent state} is a state that has only one possible parser
	8199	action. If that action is a reduction and is encoded as a default
	8200	reduction, then that consistent state is called a @dfn{defaulted state}.
	8201	Upon reaching a defaulted state, a Bison-generated parser does not bother to
	8202	invoke @code{yylex} to fetch the next token before performing the reduction.
	8203	In other words, whether default reductions are enabled in consistent states
	8204	determines how soon a Bison-generated parser invokes @code{yylex} for a
	8205	token: immediately when it @emph{reaches} that token in the input or when it
	8206	eventually @emph{needs} that token as a lookahead to determine the next
	8207	parser action. Traditionally, default reductions are enabled, and so the
	8208	parser exhibits the latter behavior.
	8209
	8210	The presence of defaulted states is an important consideration when
	8211	designing @code{yylex} and the grammar file. That is, if the behavior of
	8212	@code{yylex} can influence or be influenced by the semantic actions
	8213	associated with the reductions in defaulted states, then the delay of the
	8214	next @code{yylex} invocation until after those reductions is significant.
	8215	For example, the semantic actions might pop a scope stack that @code{yylex}
	8216	uses to determine what token to return. Thus, the delay might be necessary
	8217	to ensure that @code{yylex} does not look up the next token in a scope that
	8218	should already be considered closed.
	8219
	8220	@item Delayed syntax error detection.
	8221
	8222	@cindex delayed syntax error detection
	8223	When the parser fetches a new token by invoking @code{yylex}, it checks
	8224	whether there is an action for that token in the current parser state. The
	8225	parser detects a syntax error if and only if either (1) there is no action
	8226	for that token or (2) the action for that token is the error action (due to
	8227	the use of @code{%nonassoc}). However, if there is a default reduction in
	8228	that state (which might or might not be a defaulted state), then it is
	8229	impossible for condition 1 to exist. That is, all tokens have an action.
	8230	Thus, the parser sometimes fails to detect the syntax error until it reaches
	8231	a later state.
	8232
	8233	@cindex LAC
	8234	@c If there's an infinite loop, default reductions can prevent an incorrect
	8235	@c sentence from being rejected.
	8236	While default reductions never cause the parser to accept syntactically
	8237	incorrect sentences, the delay of syntax error detection can have unexpected
	8238	effects on the behavior of the parser. However, the delay can be caused
	8239	anyway by parser state merging and the use of @code{%nonassoc}, and it can
	8240	be fixed by another Bison feature, LAC. We discuss the effects of delayed
	8241	syntax error detection and LAC more in the next section (@pxref{LAC}).
	8242	@end itemize
	8243
	8244	For canonical LR, the only default reduction that Bison enables by default
	8245	is the accept action, which appears only in the accepting state, which has
	8246	no other action and is thus a defaulted state. However, the default accept
	8247	action does not delay any @code{yylex} invocation or syntax error detection
	8248	because the accept action ends the parse.
	8249
	8250	For LALR and IELR, Bison enables default reductions in nearly all states by
	8251	default. There are only two exceptions. First, states that have a shift
	8252	action on the @code{error} token do not have default reductions because
	8253	delayed syntax error detection could then prevent the @code{error} token
	8254	from ever being shifted in that state. However, parser state merging can
	8255	cause the same effect anyway, and LAC fixes it in both cases, so future
	8256	versions of Bison might drop this exception when LAC is activated. Second,
	8257	GLR parsers do not record the default reduction as the action on a lookahead
	8258	token for which there is a conflict. The correct action in this case is to
	8259	split the parse instead.
	8260
	8261	To adjust which states have default reductions enabled, use the
	8262	@code{%define lr.default-reduction} directive.
	8263
	8264	@deffn {Directive} {%define lr.default-reduction} @var{where}
	8265	Specify the kind of states that are permitted to contain default reductions.
	8266	The accepted values of @var{where} are:
	8267	@itemize
	8268	@item @code{most} (default for LALR and IELR)
	8269	@item @code{consistent}
	8270	@item @code{accepting} (default for canonical LR)
	8271	@end itemize
	8272
	8273	(The ability to specify where default reductions are permitted is
	8274	experimental. More user feedback will help to stabilize it.)
	8275	@end deffn
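
		As a minimal sketch, a grammar file that wants default reductions only in
		consistent states, so that no default reduction in an inconsistent state
		can delay syntax error detection, might contain the following. The choice
		of IELR here is only an assumption made for illustration:

		@example
		%define lr.type ielr
		%define lr.default-reduction consistent
		@end example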
	8276
	8277	@node LAC
	8278	@subsection LAC
	8279	@findex %define parse.lac
	8280	@cindex LAC
	8281	@cindex lookahead correction
	8282
	8283	Canonical LR, IELR, and LALR can suffer from a couple of problems upon
	8284	encountering a syntax error. First, the parser might perform additional
	8285	parser stack reductions before discovering the syntax error. Such
	8286	reductions can perform user semantic actions that are unexpected because
	8287	they are based on an invalid token, and they cause error recovery to begin
	8288	in a different syntactic context than the one in which the invalid token was
	8289	encountered. Second, when verbose error messages are enabled (@pxref{Error
	8290	Reporting}), the expected token list in the syntax error message can both
	8291	contain invalid tokens and omit valid tokens.
	8292
	8293	The culprits for the above problems are @code{%nonassoc}, default reductions
	8294	in inconsistent states (@pxref{Default Reductions}), and parser state
	8295	merging. Because IELR and LALR merge parser states, they suffer the most.
	8296	Canonical LR can suffer only if @code{%nonassoc} is used or if default
	8297	reductions are enabled for inconsistent states.
	8298
	8299	LAC (Lookahead Correction) is a new mechanism within the parsing algorithm
	8300	that solves these problems for canonical LR, IELR, and LALR without
	8301	sacrificing @code{%nonassoc}, default reductions, or state merging. You can
	8302	enable LAC with the @code{%define parse.lac} directive.
	8303
	8304	@deffn {Directive} {%define parse.lac} @var{value}
	8305	Enable LAC to improve syntax error handling.
	8306	@itemize
	8307	@item @code{none} (default)
	8308	@item @code{full}
	8309	@end itemize
	8310	(This feature is experimental. More user feedback will help to stabilize
	8311	it. Moreover, it is currently only available for deterministic parsers in
	8312	C.)
	8313	@end deffn
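
		For instance, a deterministic parser in C could enable LAC together with
		verbose error messages roughly as follows. This sketch assumes that
		verbose messages are requested with @code{%define parse.error verbose}
		(@pxref{Error Reporting}):

		@example
		%define parse.lac full
		%define parse.error verbose
		@end example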
	8314
	8315	Conceptually, the LAC mechanism is straightforward. Whenever the parser
	8316	fetches a new token from the scanner so that it can determine the next
	8317	parser action, it immediately suspends normal parsing and performs an
	8318	exploratory parse using a temporary copy of the normal parser state stack.
	8319	During this exploratory parse, the parser does not perform user semantic
	8320	actions. If the exploratory parse reaches a shift action, normal parsing
	8321	then resumes on the normal parser stacks. If the exploratory parse reaches
	8322	an error instead, the parser reports a syntax error. If verbose syntax
	8323	error messages are enabled, the parser must then discover the list of
	8324	expected tokens, so it performs a separate exploratory parse for each token
	8325	in the grammar.
	8326
	8327	There is one subtlety about the use of LAC. That is, when in a consistent
	8328	parser state with a default reduction, the parser will not attempt to fetch
	8329	a token from the scanner because no lookahead is needed to determine the
	8330	next parser action. Thus, whether default reductions are enabled in
	8331	consistent states (@pxref{Default Reductions}) affects how soon the parser
	8332	detects a syntax error: immediately when it @emph{reaches} an erroneous
	8333	token or when it eventually @emph{needs} that token as a lookahead to
	8334	determine the next parser action. The latter behavior is probably more
	8335	intuitive, so Bison currently provides no way to achieve the former behavior
	8336	while default reductions are enabled in consistent states.
	8337
	8338	Thus, when LAC is in use, for some fixed decision of whether to enable
	8339	default reductions in consistent states, canonical LR and IELR behave almost
	8340	exactly the same for both syntactically acceptable and syntactically
	8341	unacceptable input. While LALR still does not support the full
	8342	language-recognition power of canonical LR and IELR, LAC at least enables
	8343	LALR's syntax error handling to correctly reflect LALR's
	8344	language-recognition power.
	8345
	8346	There are a few caveats to consider when using LAC:
	8347
	8348	@itemize
	8349	@item Infinite parsing loops.
	8350
	8351	IELR plus LAC does have one shortcoming relative to canonical LR. Some
	8352	parsers generated by Bison can loop infinitely. LAC does not fix infinite
	8353	parsing loops that occur between encountering a syntax error and detecting
	8354	it, but enabling canonical LR or disabling default reductions sometimes
	8355	does.
	8356
	8357	@item Verbose error message limitations.
	8358
	8359	Because of internationalization considerations, Bison-generated parsers
	8360	limit the size of the expected token list they are willing to report in a
	8361	verbose syntax error message. If the number of expected tokens exceeds that
	8362	limit, the list is simply dropped from the message. Enabling LAC can
	8363	increase the size of the list and thus cause the parser to drop it. Of
	8364	course, dropping the list is better than reporting an incorrect list.
	8365
	8366	@item Performance.
	8367
	8368	Because LAC requires many parse actions to be performed twice, it can have a
	8369	performance penalty. However, not all parse actions must be performed
	8370	twice. Specifically, during a series of default reductions in consistent
	8371	states and shift actions, the parser never has to initiate an exploratory
	8372	parse. Moreover, the most time-consuming tasks in a parse are often the
	8373	file I/O, the lexical analysis performed by the scanner, and the user's
	8374	semantic actions, but none of these are performed during the exploratory
	8375	parse. Finally, the base of the temporary stack used during an exploratory
	8376	parse is a pointer into the normal parser state stack so that the stack is
	8377	never physically copied. In our experience, the performance penalty of LAC
	8378	has proved insignificant for practical grammars.
	8379	@end itemize
	8380
	8381	While the LAC algorithm shares techniques that have been recognized in the
	8382	parser community for years, for the publication that introduces LAC,
	8383	@pxref{Bibliography,,Denny 2010 May}.
	8384
	8385	@node Unreachable States
	8386	@subsection Unreachable States
	8387	@findex %define lr.keep-unreachable-state
	8388	@cindex unreachable states
	8389
	8390	If there exists no sequence of transitions from the parser's start state to
	8391	some state @var{s}, then Bison considers @var{s} to be an @dfn{unreachable
	8392	state}. A state can become unreachable during conflict resolution if Bison
	8393	disables a shift action leading to it from a predecessor state.
	8394
	8395	By default, Bison removes unreachable states from the parser after conflict
	8396	resolution because they are useless in the generated parser. However,
	8397	keeping unreachable states is sometimes useful when trying to understand the
	8398	relationship between the parser and the grammar.
	8399
	8400	@deffn {Directive} {%define lr.keep-unreachable-state} @var{value}
	8401	Request that Bison allow unreachable states to remain in the parser tables.
	8402	@var{value} must be a Boolean. The default is @code{false}.
	8403	@end deffn
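
		For example, to keep unreachable states while you study the parser
		tables, you might write the following (a sketch; it changes only which
		states Bison retains):

		@example
		%define lr.keep-unreachable-state true
		@end example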
	8404
	8405	There are a few caveats to consider:
	8406
	8407	@itemize @bullet
	8408	@item Missing or extraneous warnings.
	8409
	8410	Unreachable states may contain conflicts and may use rules not used in any
	8411	other state. Thus, keeping unreachable states may induce warnings that are
	8412	irrelevant to your parser's behavior, and it may eliminate warnings that are
	8413	relevant. Of course, the change in warnings may actually be relevant to a
	8414	parser table analysis that wants to keep unreachable states, so this
	8415	behavior will likely remain in future Bison releases.
	8416
	8417	@item Other useless states.
	8418
	8419	While Bison is able to remove unreachable states, it is not guaranteed to
	8420	remove other kinds of useless states. Specifically, when Bison disables
	8421	reduce actions during conflict resolution, some goto actions may become
	8422	useless, and thus some additional states may become useless. If Bison were
	8423	to compute which goto actions were useless and then disable those actions,
	8424	it could identify such states as unreachable and then remove those states.
	8425	However, Bison does not compute which goto actions are useless.
	8426	@end itemize
	8427
	8428	@node Generalized LR Parsing
	8429	@section Generalized LR (GLR) Parsing
	8430	@cindex GLR parsing
	8431	@cindex generalized LR (GLR) parsing
	8432	@cindex ambiguous grammars
	8433	@cindex nondeterministic parsing
	8434
	8435	Bison produces @emph{deterministic} parsers that choose uniquely
	8436	when to reduce and which reduction to apply
	8437	based on a summary of the preceding input and on one extra token of lookahead.
	8438	As a result, normal Bison handles a proper subset of the family of
	8439	context-free languages.
	8440	Ambiguous grammars, since they have strings with more than one possible
	8441	sequence of reductions, cannot have deterministic parsers in this sense.
	8442	The same is true of languages that require more than one symbol of
	8443	lookahead, since the parser lacks the information necessary to make a
	8444	decision at the point it must be made in a shift-reduce parser.
	8445	Finally, as previously mentioned (@pxref{Mysterious Conflicts}),
	8446	there are languages where Bison's default choice of how to
	8447	summarize the input seen so far loses necessary information.
	8448
	8449	When you use the @samp{%glr-parser} declaration in your grammar file,
	8450	Bison generates a parser that uses a different algorithm, called
	8451	Generalized LR (or GLR). A Bison GLR
	8452	parser uses the same basic
	8453	algorithm for parsing as an ordinary Bison parser, but behaves
	8454	differently in cases where there is a shift-reduce conflict that has not
	8455	been resolved by precedence rules (@pxref{Precedence}) or a
	8456	reduce-reduce conflict. When a GLR parser encounters such a
	8457	situation, it
	8458	effectively @emph{splits} into several parsers, one for each possible
	8459	shift or reduction. These parsers then proceed as usual, consuming
	8460	tokens in lock-step. Some of the stacks may encounter other conflicts
	8461	and split further, with the result that instead of a sequence of states,
	8462	a Bison GLR parsing stack is in effect a tree of states.
	8463
	8464	In effect, each stack represents a guess as to what the proper parse
	8465	is. Additional input may indicate that a guess was wrong, in which case
	8466	the appropriate stack silently disappears. Otherwise, the semantic
	8467	actions generated in each stack are saved, rather than being executed
	8468	immediately. When a stack disappears, its saved semantic actions never
	8469	get executed. When a reduction causes two stacks to become equivalent,
	8470	their sets of semantic actions are both saved with the state that
	8471	results from the reduction. We say that two stacks are equivalent
	8472	when they both represent the same sequence of states,
	8473	and each pair of corresponding states represents a
	8474	grammar symbol that produces the same segment of the input token
	8475	stream.
	8476
	8477	Whenever the parser makes a transition from having multiple
	8478	states to having one, it reverts to the normal deterministic parsing
	8479	algorithm, after resolving and executing the saved-up actions.
	8480	At this transition, some of the states on the stack will have semantic
	8481	values that are sets (actually multisets) of possible actions. The
	8482	parser tries to pick one of the actions by first finding one whose rule
	8483	has the highest dynamic precedence, as set by the @samp{%dprec}
	8484	declaration. Otherwise, if the alternative actions are not ordered by
	8485	precedence, but the same merging function is declared for both
	8486	rules by the @samp{%merge} declaration,
	8487	Bison resolves and evaluates both and then calls the merge function on
	8488	the result. Otherwise, it reports an ambiguity.
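
		As a rough sketch of the first mechanism, the following fragment uses
		@code{%dprec} to prefer one reading when both alternatives survive. The
		nonterminals @code{expr} and @code{decl} are placeholders assumed for this
		illustration; the rule with the larger @code{%dprec} value is chosen:

		@example
		%glr-parser

		%%

		stmt:
		  expr ';'  %dprec 1
		| decl      %dprec 2
		;
		@end example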
	8489
	8490	It is possible to use a data structure for the GLR parsing tree that
	8491	permits the processing of any LR(1) grammar in linear time (in the
	8492	size of the input), any unambiguous (not necessarily
	8493	LR(1)) grammar in
	8494	quadratic worst-case time, and any general (possibly ambiguous)
	8495	context-free grammar in cubic worst-case time. However, Bison currently
	8496	uses a simpler data structure that requires time proportional to the
	8497	length of the input times the maximum number of stacks required for any
	8498	prefix of the input. Thus, really ambiguous or nondeterministic
	8499	grammars can require exponential time and space to process. Such badly
	8500	behaving examples, however, are not generally of practical interest.
	8501	Usually, nondeterminism in a grammar is local---the parser is ``in
	8502	doubt'' only for a few tokens at a time. Therefore, the current data
	8503	structure should generally be adequate. On LR(1) portions of a
	8504	grammar, in particular, it is only slightly slower than with the
	8505	deterministic LR(1) Bison parser.
	8506
	8507	For a more detailed exposition of GLR parsers, @pxref{Bibliography,,Scott
	8508	2000}.
	8509
	8510	@node Memory Management
	8511	@section Memory Management, and How to Avoid Memory Exhaustion
	8512	@cindex memory exhaustion
	8513	@cindex memory management
	8514	@cindex stack overflow
	8515	@cindex parser stack overflow
	8516	@cindex overflow of parser stack
	8517
	8518	The Bison parser stack can run out of memory if too many tokens are shifted and
	8519	not reduced. When this happens, the parser function @code{yyparse}
	8520	calls @code{yyerror} and then returns 2.
	8521
	8522	Because Bison parsers have growing stacks, hitting the upper limit
	8523	usually results from using a right recursion instead of a left
	8524	recursion, see @ref{Recursion, ,Recursive Rules}.
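
		As an illustration, using the hypothetical symbols @code{seq_right},
		@code{seq_left}, and @code{item}, a right-recursive list rule must shift
		every element before it can reduce any of them, whereas the left-recursive
		form reduces after each element and so needs only bounded stack space:

		@example
		/* Right recursion: stack depth grows with the length of the list.  */
		seq_right:
		  item
		| item ',' seq_right
		;

		/* Left recursion: stack depth stays bounded.  */
		seq_left:
		  item
		| seq_left ',' item
		;
		@end example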
	8525
	8526	@vindex YYMAXDEPTH
	8527	By defining the macro @code{YYMAXDEPTH}, you can control how deep the
	8528	parser stack can become before memory is exhausted. Define the
	8529	macro with a value that is an integer. This value is the maximum number
	8530	of tokens that can be shifted (and not reduced) before overflow.
	8531
	8532	The stack space allowed is not necessarily allocated. If you specify a
	8533	large value for @code{YYMAXDEPTH}, the parser normally allocates a small
	8534	stack at first, and then makes it bigger by stages as needed. This
	8535	increasing allocation happens automatically and silently. Therefore,
	8536	you do not need to make @code{YYMAXDEPTH} painfully small merely to save
	8537	space for ordinary inputs that do not need much stack.
	8538
	8539	However, do not allow @code{YYMAXDEPTH} to be a value so large that
	8540	arithmetic overflow could occur when calculating the size of the stack
	8541	space. Also, do not allow @code{YYMAXDEPTH} to be less than
	8542	@code{YYINITDEPTH}.
	8543
	8544	@cindex default stack limit
	8545	The default value of @code{YYMAXDEPTH}, if you do not define it, is
	8546	10000.
	8547
	8548	@vindex YYINITDEPTH
	8549	You can control how much stack is allocated initially by defining the
	8550	macro @code{YYINITDEPTH} to a positive integer. For the deterministic
	8551	parser in C, this value must be a compile-time constant
	8552	unless you are assuming C99 or some other target language or compiler
	8553	that allows variable-length arrays. The default is 200.
	8554
	8555	Do not allow @code{YYINITDEPTH} to be greater than @code{YYMAXDEPTH}.
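
		A minimal sketch, assuming a parser in C and arbitrary illustrative
		values, defines both macros in the prologue of the grammar file:

		@example
		%@{
		  /* Illustrative values only; choose limits that suit your inputs.  */
		  #define YYINITDEPTH 500
		  #define YYMAXDEPTH  50000
		%@}
		@end example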
	8556
	8557	You can generate a deterministic parser containing C++ user code from
	8558	the default (C) skeleton, as well as from the C++ skeleton
	8559	(@pxref{C++ Parsers}). However, if you do use the default skeleton
	8560	and want to allow the parsing stack to grow,
	8561	be careful not to use semantic types or location types that require
	8562	non-trivial copy constructors.
	8563	The C skeleton bypasses these constructors when copying data to
	8564	new, larger stacks.
	8565
	8566	@node Error Recovery
	8567	@chapter Error Recovery
	8568	@cindex error recovery
	8569	@cindex recovery from errors
	8570
	8571	It is not usually acceptable to have a program terminate on a syntax
	8572	error. For example, a compiler should recover sufficiently to parse the
	8573	rest of the input file and check it for errors; a calculator should accept
	8574	another expression.
	8575
	8576	In a simple interactive command parser where each input is one line, it may
	8577	be sufficient to allow @code{yyparse} to return 1 on error and have the
	8578	caller ignore the rest of the input line when that happens (and then call
	8579	@code{yyparse} again). But this is inadequate for a compiler, because it
	8580	forgets all the syntactic context leading up to the error. A syntax error
	8581	deep within a function in the compiler input should not cause the compiler
	8582	to treat the following line like the beginning of a source file.
	8583
	8584	@findex error
	8585	You can define how to recover from a syntax error by writing rules to
	8586	recognize the special token @code{error}. This is a terminal symbol that
	8587	is always defined (you need not declare it) and reserved for error
	8588	handling. The Bison parser generates an @code{error} token whenever a
	8589	syntax error happens; if you have provided a rule to recognize this token
	8590	in the current context, the parse can continue.
	8591
	8592	For example:
	8593
	8594	@example
	8595	stmts:
	8596	%empty
	8597	| stmts '\n'
	8598	| stmts exp '\n'
	8599	| stmts error '\n'
	8600	@end example
	8601
	8602	The fourth rule in this example says that an error followed by a newline
	8603	makes a valid addition to any @code{stmts}.
	8604
	8605	What happens if a syntax error occurs in the middle of an @code{exp}? The
	8606	error recovery rule, interpreted strictly, applies to the precise sequence
	8607	of a @code{stmts}, an @code{error} and a newline. If an error occurs in
	8608	the middle of an @code{exp}, there will probably be some additional tokens
	8609	and subexpressions on the stack after the last @code{stmts}, and there
	8610	will be tokens to read before the next newline. So the rule is not
	8611	applicable in the ordinary way.
	8612
	8613	But Bison can force the situation to fit the rule, by discarding part of
	8614	the semantic context and part of the input. First it discards states
	8615	and objects from the stack until it gets back to a state in which the
	8616	@code{error} token is acceptable. (This means that the subexpressions
	8617	already parsed are discarded, back to the last complete @code{stmts}.)
	8618	At this point the @code{error} token can be shifted. Then, if the old
	8619	lookahead token is not acceptable to be shifted next, the parser reads
	8620	tokens and discards them until it finds a token which is acceptable. In
	8621	this example, Bison reads and discards input until the next newline so
	8622	that the fourth rule can apply. Note that discarded symbols are
	8623	possible sources of memory leaks, see @ref{Destructor Decl, , Freeing
	8624	Discarded Symbols}, for a means to reclaim this memory.
	8625
	8626	The choice of error rules in the grammar is a choice of strategies for
	8627	error recovery. A simple and useful strategy is simply to skip the rest of
	8628	the current input line or current statement if an error is detected:
	8629
	8630	@example
	8631	stmt: error ';' /* On error, skip until ';' is read. */
	8632	@end example
	8633
	8634	It is also useful to recover to the matching close-delimiter of an
	8635	opening-delimiter that has already been parsed. Otherwise the
	8636	close-delimiter will probably appear to be unmatched, and generate another,
	8637	spurious error message:
	8638
	8639	@example
	8640	primary:
	8641	'(' expr ')'
	8642	| '(' error ')'
	8643	@dots{}
	8644	;
	8645	@end example
	8646
	8647	Error recovery strategies are necessarily guesses. When they guess wrong,
	8648	one syntax error often leads to another. In the @code{stmt} example above, the error
	8649	recovery rule guesses that an error is due to bad input within one
	8650	@code{stmt}. Suppose that instead a spurious semicolon is inserted in the
	8651	middle of a valid @code{stmt}. After the error recovery rule recovers
	8652	from the first error, another syntax error will be found straightaway,
	8653	since the text following the spurious semicolon is also an invalid
	8654	@code{stmt}.
	8655
	8656	To prevent an outpouring of error messages, the parser will output no error
	8657	message for another syntax error that happens shortly after the first; only
	8658	after three consecutive input tokens have been successfully shifted will
	8659	error messages resume.
	8660
	8661	Note that rules which accept the @code{error} token may have actions, just
	8662	as any other rules can.
	8663
	8664	@findex yyerrok
	8665	You can make error messages resume immediately by using the macro
	8666	@code{yyerrok} in an action. If you do this in the error rule's action, no
	8667	error messages will be suppressed. This macro requires no arguments;
	8668	@samp{yyerrok;} is a valid C statement.
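
		For example, a variant of the newline-terminated error rule could
		re-enable error messages as soon as the newline has been consumed. This is
		only a sketch; the rule is otherwise unchanged and only the action is new:

		@example
		stmts:
		  stmts error '\n'  @{ yyerrok; @}
		;
		@end example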
	8669
	8670	@findex yyclearin
	8671	The previous lookahead token is reanalyzed immediately after an error. If
	8672	this is unacceptable, then the macro @code{yyclearin} may be used to clear
	8673	this token. Write the statement @samp{yyclearin;} in the error rule's
	8674	action.
	8675	@xref{Action Features, ,Special Features for Use in Actions}.
	8676
	8677	For example, suppose that on a syntax error, an error handling routine is
	8678	called that advances the input stream to some point where parsing should
	8679	once again commence. The next symbol returned by the lexical scanner is
	8680	probably correct. The previous lookahead token ought to be discarded
	8681	with @samp{yyclearin;}.
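
		A sketch of such a rule follows. The function @code{resync_input} is
		hypothetical and stands for whatever routine advances the input stream:

		@example
		stmt:
		  error  @{ resync_input (); yyclearin; @}
		;
		@end example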
	8682
	8683	@vindex YYRECOVERING
	8684	The expression @code{YYRECOVERING ()} yields 1 when the parser
	8685	is recovering from a syntax error, and 0 otherwise.
	8686	Syntax error diagnostics are suppressed while recovering from a syntax
	8687	error.
	8688
	8689	@node Context Dependency
	8690	@chapter Handling Context Dependencies
	8691
	8692	The Bison paradigm is to parse tokens first, then group them into larger
	8693	syntactic units. In many languages, the meaning of a token is affected by
	8694	its context. Although this violates the Bison paradigm, certain techniques
	8695	(known as @dfn{kludges}) may enable you to write Bison parsers for such
	8696	languages.
	8697
	8698	@menu
	8699	* Semantic Tokens:: Token parsing can depend on the semantic context.
	8700	* Lexical Tie-ins:: Token parsing can depend on the syntactic context.
	8701	* Tie-in Recovery:: Lexical tie-ins have implications for how
	8702	error recovery rules must be written.
	8703	@end menu
	8704
	8705	(Actually, ``kludge'' means any technique that gets its job done but is
	8706	neither clean nor robust.)
	8707
	8708	@node Semantic Tokens
	8709	@section Semantic Info in Token Types
	8710
	8711	The C language has a context dependency: the way an identifier is used
	8712	depends on what its current meaning is. For example, consider this:
	8713
	8714	@example
	8715	foo (x);
	8716	@end example
	8717
	8718	This looks like a function call statement, but if @code{foo} is a typedef
	8719	name, then this is actually a declaration of @code{x}. How can a Bison
	8720	parser for C decide how to parse this input?
	8721
	8722	The method used in GNU C is to have two different token types,
	8723	@code{IDENTIFIER} and @code{TYPENAME}. When @code{yylex} finds an
	8724	identifier, it looks up the current declaration of the identifier in order
	8725	to decide which token type to return: @code{TYPENAME} if the identifier is
	8726	declared as a typedef, @code{IDENTIFIER} otherwise.
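
The lookup step can be sketched as follows. This is only an illustration, not the code used in GNU C: the token codes, the fixed table @code{typedef_names}, and the function @code{classify_identifier} are all hypothetical, and a real compiler would consult its scoped symbol table instead.

```c
#include <string.h>

/* Hypothetical token codes; a real parser takes them from the
   Bison-generated header.  */
enum { IDENTIFIER = 258, TYPENAME = 259 };

/* Hypothetical stand-in for the symbol table: the names currently
   declared as typedefs.  */
const char *typedef_names[] = { "foo", "size_t" };

/* Return TYPENAME if NAME is currently a typedef name, IDENTIFIER
   otherwise.  yylex would call this after scanning an identifier.  */
int
classify_identifier (const char *name)
{
  size_t count = sizeof typedef_names / sizeof typedef_names[0];
  for (size_t i = 0; i < count; i++)
    if (strcmp (name, typedef_names[i]) == 0)
      return TYPENAME;
  return IDENTIFIER;
}
```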
	8727
	8728	The grammar rules can then express the context dependency by the choice of
	8729	token type to recognize. @code{IDENTIFIER} is accepted as an expression,
	8730	but @code{TYPENAME} is not. @code{TYPENAME} can start a declaration, but
	8731	@code{IDENTIFIER} cannot. In contexts where the meaning of the identifier
	8732	is @emph{not} significant, such as in declarations that can shadow a
	8733	typedef name, either @code{TYPENAME} or @code{IDENTIFIER} is
	8734	accepted---there is one rule for each of the two token types.
	8735
	8736	This technique is simple to use if the decision of which kinds of
	8737	identifiers to allow is made at a place close to where the identifier is
	8738	parsed. But in C this is not always so: C allows a declaration to
	8739	redeclare a typedef name provided an explicit type has been specified
	8740	earlier:
	8741
	8742	@example
	8743	typedef int foo, bar;
	8744	int baz (void)
	8745	@group
	8746	@{
	8747	static bar (bar); /* @r{redeclare @code{bar} as static variable} */
	8748	extern foo foo (foo); /* @r{redeclare @code{foo} as function} */
	8749	return foo (bar);
	8750	@}
	8751	@end group
	8752	@end example
	8753
	8754	Unfortunately, the name being declared is separated from the declaration
	8755	construct itself by a complicated syntactic structure---the ``declarator''.
	8756
	8757	As a result, part of the Bison parser for C needs to be duplicated, with
	8758	all the nonterminal names changed: once for parsing a declaration in
	8759	which a typedef name can be redefined, and once for parsing a
	8760	declaration in which that can't be done. Here is a part of the
	8761	duplication, with actions omitted for brevity:
	8762
	8763	@example
	8764	@group
	8765	initdcl:
	8766	declarator maybeasm '=' init
	8767	| declarator maybeasm
	8768	;
	8769	@end group
	8770
	8771	@group
	8772	notype_initdcl:
	8773	notype_declarator maybeasm '=' init
	8774	| notype_declarator maybeasm
	8775	;
	8776	@end group
	8777	@end example
	8778
	8779	@noindent
	8780	Here @code{initdcl} can redeclare a typedef name, but @code{notype_initdcl}
	8781	cannot. The distinction between @code{declarator} and
	8782	@code{notype_declarator} is the same sort of thing.
	8783
	8784	There is some similarity between this technique and a lexical tie-in
	8785	(described next), in that information which alters the lexical analysis is
	8786	changed during parsing by other parts of the program. The difference is
	8787	that here the information is global, and is used for other purposes in the
	8788	program. A true lexical tie-in has a special-purpose flag controlled by
	8789	the syntactic context.
	8790
	8791	@node Lexical Tie-ins
	8792	@section Lexical Tie-ins
	8793	@cindex lexical tie-in
	8794
	8795	One way to handle context-dependency is the @dfn{lexical tie-in}: a flag
	8796	which is set by Bison actions, whose purpose is to alter the way tokens are
	8797	parsed.
	8798
	8799	For example, suppose we have a language vaguely like C, but with a special
	8800	construct @samp{hex (@var{hex-expr})}. After the keyword @code{hex} comes
	8801	an expression in parentheses in which all integers are hexadecimal. In
	8802	particular, the token @samp{a1b} must be treated as an integer rather than
	8803	as an identifier if it appears in that context. Here is how you can do it:
	8804
	8805	@example
	8806	@group
	8807	%@{
	8808	int hexflag;
	8809	int yylex (void);
	8810	void yyerror (char const *);
	8811	%@}
	8812	%%
	8813	@dots{}
	8814	@end group
	8815	@group
	8816	expr:
	8817	IDENTIFIER
	8818	| constant
	8819	| HEX '(' @{ hexflag = 1; @}
	8820	expr ')' @{ hexflag = 0; $$ = $4; @}
	8821	| expr '+' expr @{ $$ = make_sum ($1, $3); @}
	8822	@dots{}
	8823	;
	8824	@end group
	8825
	8826	@group
	8827	constant:
	8828	INTEGER
	8829	| STRING
	8830	;
	8831	@end group
	8832	@end example
	8833
	8834	@noindent
	8835	Here we assume that @code{yylex} looks at the value of @code{hexflag}; when
	8836	it is nonzero, all integers are parsed in hexadecimal, and tokens starting
	8837	with letters are parsed as integers if possible.
	8838
	8839	The declaration of @code{hexflag} shown in the prologue of the grammar
	8840	file is needed to make it accessible to the actions (@pxref{Prologue,
	8841	,The Prologue}). You must also write the code in @code{yylex} to obey
	8842	the flag.
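
A minimal sketch of the part of @code{yylex} that obeys the flag might look like this. The token codes and the helper @code{classify_word} are assumptions made for the example; it also assumes the scanner has already isolated a word made of letters and digits.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token codes; real ones come from the Bison-generated
   header.  */
enum { IDENTIFIER = 258, INTEGER = 259 };

/* The tie-in flag, set and cleared by the grammar actions.  */
int hexflag;

/* Classify WORD, a scanned run of letters and digits.  When the result
   is INTEGER, store the numeric value in *VALUE.  */
int
classify_word (const char *word, long *value)
{
  char *end;
  long number;

  /* Outside a hex construct, a word starting with a letter is always
     an identifier.  */
  if (!hexflag && !isdigit ((unsigned char) word[0]))
    return IDENTIFIER;

  number = strtol (word, &end, hexflag ? 16 : 10);
  if (end == word || *end != '\0')
    return IDENTIFIER;          /* not entirely a number in this base */

  *value = number;
  return INTEGER;
}
```

With @code{hexflag} set, @samp{a1b} is returned as an @code{INTEGER}; with the flag clear, the same text is an @code{IDENTIFIER}.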
	8843
	8844	@node Tie-in Recovery
	8845	@section Lexical Tie-ins and Error Recovery
	8846
	8847	Lexical tie-ins make strict demands on any error recovery rules you have.
	8848	@xref{Error Recovery}.
	8849
	8850	The reason for this is that the purpose of an error recovery rule is to
	8851	abort the parsing of one construct and resume in some larger construct.
	8852	For example, in C-like languages, a typical error recovery rule is to skip
	8853	tokens until the next semicolon, and then start a new statement, like this:
	8854
	8855	@example
	8856	stmt:
	8857	expr ';'
	8858	| IF '(' expr ')' stmt @{ @dots{} @}
	8859	@dots{}
	8860	| error ';' @{ hexflag = 0; @}
	8861	;
	8862	@end example
	8863
	8864	If there is a syntax error in the middle of a @samp{hex (@var{expr})}
	8865	construct, this error rule will apply, and then the action for the
	8866	completed @samp{hex (@var{expr})} will never run. So @code{hexflag} would
	8867	remain set for the entire rest of the input, or until the next @code{hex}
	8868	keyword, causing identifiers to be misinterpreted as integers.
	8869
	8870	To avoid this problem the error recovery rule itself clears @code{hexflag}.
	8871
	8872	There may also be an error recovery rule that works within expressions.
	8873	For example, there could be a rule which applies within parentheses
	8874	and skips to the close-parenthesis:
	8875
	8876	@example
	8877	@group
	8878	expr:
	8879	@dots{}
	8880	| '(' expr ')' @{ $$ = $2; @}
	8881	| '(' error ')'
	8882	@dots{}
	8883	@end group
	8884	@end example
	8885
	8886	If this rule acts within the @code{hex} construct, it is not going to abort
	8887	that construct (since it applies to an inner level of parentheses within
	8888	the construct). Therefore, it should not clear the flag: the rest of
	8889	the @code{hex} construct should be parsed with the flag still in effect.
	8890
	8891	What if there is an error recovery rule which might abort out of the
	8892	@code{hex} construct or might not, depending on circumstances? There is no
	8893	way you can write the action to determine whether a @code{hex} construct is
	8894	being aborted or not. So if you are using a lexical tie-in, you had better
	8895	make sure your error recovery rules are not of this kind. Each rule must
	8896	be such that you can be sure that it always will, or always won't, have to
	8897	clear the flag.
	8898
	8899	@c ================================================== Debugging Your Parser
	8900
	8901	@node Debugging
	8902	@chapter Debugging Your Parser
	8903
	8904	Developing a parser can be a challenge, especially if you don't understand
	8905	the algorithm (@pxref{Algorithm, ,The Bison Parser Algorithm}). This
	8906	chapter explains how to understand and debug a parser.
	8907
	8908	The first sections focus on the static part of the parser: its structure.
	8909	They explain how to generate and read the detailed description of the
	8910	automaton. There are several formats available:
	8911	@itemize @minus
	8912	@item
	8913	as text, see @ref{Understanding, , Understanding Your Parser};
	8914
	8915	@item
	8916	as a graph, see @ref{Graphviz,, Visualizing Your Parser};
	8917
	8918	@item
	8919	or as a markup report that can be turned, for instance, into HTML, see
	8920	@ref{Xml,, Visualizing your parser in multiple formats}.
	8921	@end itemize
	8922
	8923	The last section focuses on the dynamic part of the parser: how to enable
	8924	and understand the parser run-time traces (@pxref{Tracing, ,Tracing Your
	8925	Parser}).
	8926
	8927	@menu
	8928	* Understanding:: Understanding the structure of your parser.
	8929	* Graphviz:: Getting a visual representation of the parser.
	8930	* Xml:: Getting a markup representation of the parser.
	8931	* Tracing:: Tracing the execution of your parser.
	8932	@end menu
	8933
	8934	@node Understanding
	8935	@section Understanding Your Parser
	8936
	8937	As documented elsewhere (@pxref{Algorithm, ,The Bison Parser Algorithm})
	8938	Bison parsers are @dfn{shift/reduce automata}. In some cases (much more
	8939	frequent than one would hope), looking at this automaton is required to
	8940	tune or simply fix a parser.
	8941
	8942	The textual file is generated when the options @option{--report} or
	8943	@option{--verbose} are specified, see @ref{Invocation, , Invoking
	8944	Bison}. Its name is made by removing @samp{.tab.c} or @samp{.c} from
	8945	the parser implementation file name, and adding @samp{.output}
	8946	instead. Therefore, if the grammar file is @file{foo.y}, then the
	8947	parser implementation file is called @file{foo.tab.c} by default. As
	8948	a consequence, the verbose output file is called @file{foo.output}.
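
The naming rule can be expressed as a short C sketch. It only mirrors the rule as stated above and is not taken from Bison's sources; the function names are hypothetical, and a file name with neither suffix simply gets the extension appended, which is an assumption of this example.

```c
#include <stdio.h>
#include <string.h>

/* Return nonzero if NAME ends with SUFFIX.  */
static int
ends_with (const char *name, const char *suffix)
{
  size_t n = strlen (name);
  size_t s = strlen (suffix);
  return s <= n && strcmp (name + n - s, suffix) == 0;
}

/* Derive a report file name from IMPL, the parser implementation file
   name, by dropping a trailing ".tab.c" or ".c" and appending EXT.
   Return nonzero on success, zero if OUT (of size OUTSIZE) is too
   small.  */
int
report_file_name (const char *impl, const char *ext,
                  char *out, size_t outsize)
{
  size_t stem = strlen (impl);
  if (ends_with (impl, ".tab.c"))
    stem -= strlen (".tab.c");
  else if (ends_with (impl, ".c"))
    stem -= strlen (".c");

  int written = snprintf (out, outsize, "%.*s%s", (int) stem, impl, ext);
  return written >= 0 && (size_t) written < outsize;
}
```

For instance, @code{report_file_name ("foo.tab.c", ".output", buf, sizeof buf)} stores @file{foo.output} in @code{buf}.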
	8949
	8950	The following grammar file, @file{calc.y}, will be used in the sequel:
	8951
	8952	@example
	8953	%token NUM STR
	8954	@group
	8955	%left '+' '-'
	8956	%left '*'
	8957	@end group
	8958	%%
	8959	@group
	8960	exp:
	8961	exp '+' exp
	8962	| exp '-' exp
	8963	| exp '*' exp
	8964	| exp '/' exp
	8965	| NUM
	8966	;
	8967	@end group
	8968	useless: STR;
	8969	%%
	8970	@end example
	8971
	8972	@command{bison} reports:
	8973
	8974	@example
	8975	calc.y: warning: 1 nonterminal useless in grammar
	8976	calc.y: warning: 1 rule useless in grammar
	8977	calc.y:12.1-7: warning: nonterminal useless in grammar: useless
	8978	calc.y:12.10-12: warning: rule useless in grammar: useless: STR
	8979	calc.y: conflicts: 7 shift/reduce
	8980	@end example
	8981
	8982	When given @option{--report=state}, in addition to @file{calc.tab.c}, it
	8983	creates a file @file{calc.output} with contents detailed below. The
	8984	order of the output and the exact presentation might vary, but the
	8985	interpretation is the same.
	8986
	8987	@noindent
	8988	@cindex token, useless
	8989	@cindex useless token
	8990	@cindex nonterminal, useless
	8991	@cindex useless nonterminal
	8992	@cindex rule, useless
	8993	@cindex useless rule
	8994	The first section reports useless tokens, nonterminals and rules. Useless
	8995	nonterminals and rules are removed in order to produce a smaller parser, but
	8996	useless tokens are preserved, since they might be used by the scanner (note
	8997	the difference between ``useless'' and ``unused'' below):
	8998
	8999	@example
	9000	Nonterminals useless in grammar
	9001	useless
	9002
	9003	Terminals unused in grammar
	9004	STR
	9005
	9006	Rules useless in grammar
	9007	6 useless: STR
	9008	@end example
	9009
	9010	@noindent
	9011	The next section lists states that still have conflicts.
	9012
	9013	@example
	9014	State 8 conflicts: 1 shift/reduce
	9015	State 9 conflicts: 1 shift/reduce
	9016	State 10 conflicts: 1 shift/reduce
	9017	State 11 conflicts: 4 shift/reduce
	9018	@end example
	9019
	9020	@noindent
	9021	Then Bison reproduces the exact grammar it used:
	9022
	9023	@example
	9024	Grammar
	9025
	9026	0 $accept: exp $end
	9027
	9028	1 exp: exp '+' exp
	9029	2 | exp '-' exp
	9030	3 | exp '*' exp
	9031	4 | exp '/' exp
	9032	5 | NUM
	9033	@end example
	9034
	9035	@noindent
	9036	and reports the uses of the symbols:
	9037
	9038	@example
	9039	@group
	9040	Terminals, with rules where they appear
	9041
	9042	$end (0) 0
	9043	'*' (42) 3
	9044	'+' (43) 1
	9045	'-' (45) 2
	9046	'/' (47) 4
	9047	error (256)
	9048	NUM (258) 5
	9049	STR (259)
	9050	@end group
	9051
	9052	@group
	9053	Nonterminals, with rules where they appear
	9054
	9055	$accept (9)
	9056	on left: 0
	9057	exp (10)
	9058	on left: 1 2 3 4 5, on right: 0 1 2 3 4
	9059	@end group
	9060	@end example
	9061
	9062	@noindent
	9063	@cindex item
	9064	@cindex pointed rule
	9065	@cindex rule, pointed
	9066	Bison then proceeds onto the automaton itself, describing each state
	9067	with its set of @dfn{items}, also known as @dfn{pointed rules}. Each
	9068	item is a production rule together with a point (@samp{.}) marking
	9069	the location of the input cursor.
	9070
	9071	@example
	9072	State 0
	9073
	9074	0 $accept: . exp $end
	9075
	9076	NUM shift, and go to state 1
	9077
	9078	exp go to state 2
	9079	@end example
	9080
	9081	This reads as follows: ``state 0 corresponds to being at the very
	9082	beginning of the parsing, in the initial rule, right before the start
	9083	symbol (here, @code{exp}). When the parser returns to this state right
	9084	after having reduced a rule that produced an @code{exp}, the control
	9085	flow jumps to state 2. If there is no such transition on a nonterminal
	9086	symbol, and the lookahead is a @code{NUM}, then this token is shifted onto
	9087	the parse stack, and the control flow jumps to state 1. Any other
	9088	lookahead triggers a syntax error.''
	9089
	9090	@cindex core, item set
	9091	@cindex item set core
	9092	@cindex kernel, item set
	9093	@cindex item set kernel
	9094	Even though the only active rule in state 0 seems to be rule 0, the
	9095	report lists @code{NUM} as a lookahead token because @code{NUM} can be
	9096	at the beginning of any rule deriving an @code{exp}. By default Bison
	9097	reports the so-called @dfn{core} or @dfn{kernel} of the item set, but if
	9098	you want to see more detail you can invoke @command{bison} with
	9099	@option{--report=itemset} to list the derived items as well:
	9100
	9101	@example
	9102	State 0
	9103
	9104	0 $accept: . exp $end
	9105	1 exp: . exp '+' exp
	9106	2 | . exp '-' exp
	9107	3 | . exp '*' exp
	9108	4 | . exp '/' exp
	9109	5 | . NUM
	9110
	9111	NUM shift, and go to state 1
	9112
	9113	exp go to state 2
	9114	@end example
	9115
	9116	@noindent
	9117	In the state 1@dots{}
	9118
	9119	@example
	9120	State 1
	9121
	9122	5 exp: NUM .
	9123
	9124	$default reduce using rule 5 (exp)
	9125	@end example
	9126
	9127	@noindent
	9128	the rule 5, @samp{exp: NUM;}, is completed. Whatever the lookahead token
	9129	(@samp{$default}), the parser will reduce it. If it was coming from
	9130	State 0, then, after this reduction it will return to state 0, and will
	9131	jump to state 2 (@samp{exp: go to state 2}).
	9132
	9133	@example
	9134	State 2
	9135
	9136	0 $accept: exp . $end
	9137	1 exp: exp . '+' exp
	9138	2 | exp . '-' exp
	9139	3 | exp . '*' exp
	9140	4 | exp . '/' exp
	9141
	9142	$end shift, and go to state 3
	9143	'+' shift, and go to state 4
	9144	'-' shift, and go to state 5
	9145	'*' shift, and go to state 6
	9146	'/' shift, and go to state 7
	9147	@end example
	9148
	9149	@noindent
	9150	In state 2, the automaton can only shift a symbol. For instance,
	9151	because of the item @samp{exp: exp . '+' exp}, if the lookahead is
	9152	@samp{+} it is shifted onto the parse stack, and the automaton
	9153	jumps to state 4, corresponding to the item @samp{exp: exp '+' . exp}.
	9154	Since there is no default action, any lookahead not listed triggers a syntax
	9155	error.
	9156
	9157	@cindex accepting state
	9158	The state 3 is named the @dfn{final state}, or the @dfn{accepting
	9159	state}:
	9160
	9161	@example
	9162	State 3
	9163
	9164	0 $accept: exp $end .
	9165
	9166	$default accept
	9167	@end example
	9168
	9169	@noindent
	9170	the initial rule is completed (the start symbol and the end-of-input were
	9171	read), so the parsing exits successfully.
	9172
	9173	The interpretation of states 4 to 7 is straightforward, and is left to
	9174	the reader.
	9175
	9176	@example
	9177	State 4
	9178
	9179	1 exp: exp '+' . exp
	9180
	9181	NUM shift, and go to state 1
	9182
	9183	exp go to state 8
	9184
	9185
	9186	State 5
	9187
	9188	2 exp: exp '-' . exp
	9189
	9190	NUM shift, and go to state 1
	9191
	9192	exp go to state 9
	9193
	9194
	9195	State 6
	9196
	9197	3 exp: exp '*' . exp
	9198
	9199	NUM shift, and go to state 1
	9200
	9201	exp go to state 10
	9202
	9203
	9204	State 7
	9205
	9206	4 exp: exp '/' . exp
	9207
	9208	NUM shift, and go to state 1
	9209
	9210	exp go to state 11
	9211	@end example
	9212
	9213	As was announced in the beginning of the report, @samp{State 8 conflicts:
	9214	1 shift/reduce}:
	9215
	9216	@example
	9217	State 8
	9218
	9219	1 exp: exp . '+' exp
	9220	1 | exp '+' exp .
	9221	2 | exp . '-' exp
	9222	3 | exp . '*' exp
	9223	4 | exp . '/' exp
	9224
	9225	'*' shift, and go to state 6
	9226	'/' shift, and go to state 7
	9227
	9228	'/' [reduce using rule 1 (exp)]
	9229	$default reduce using rule 1 (exp)
	9230	@end example
	9231
	9232	Indeed, there are two actions associated with the lookahead @samp{/}:
	9233	either shifting (and going to state 7), or reducing rule 1. The
	9234	conflict means that either the grammar is ambiguous, or the parser lacks
	9235	information to make the right decision. Indeed the grammar is
	9236	ambiguous, as, since we did not specify the precedence of @samp{/}, the
	9237	sentence @samp{NUM + NUM / NUM} can be parsed as @samp{NUM + (NUM /
	9238	NUM)}, which corresponds to shifting @samp{/}, or as @samp{(NUM + NUM) /
	9239	NUM}, which corresponds to reducing rule 1.
	9240
	9241	Because in deterministic parsing only a single decision can be made, Bison
	9242	arbitrarily chose to disable the reduction, see @ref{Shift/Reduce, ,
	9243	Shift/Reduce Conflicts}. Discarded actions are reported between
	9244	square brackets.
	9245
	9246	Note that all the previous states had a single possible action: either
	9247	shifting the next token and going to the corresponding state, or
	9248	reducing a single rule. In the other cases, i.e., when shifting
	9249	@emph{and} reducing is possible or when @emph{several} reductions are
	9250	possible, the lookahead is required to select the action. State 8 is
	9251	one such state: if the lookahead is @samp{*} or @samp{/} then the action
	9252	is shifting, otherwise the action is reducing rule 1. In other words,
	9253	the first two items, corresponding to rule 1, are not eligible when the
	9254	lookahead token is @samp{*}, since we specified that @samp{*} has higher
	9255	precedence than @samp{+}. More generally, some items are eligible only
	9256	with some set of possible lookahead tokens. When run with
	9257	@option{--report=lookahead}, Bison specifies these lookahead tokens:
	9258
	9259	@example
	9260	State 8
	9261
	9262	1 exp: exp . '+' exp
	9263	1 | exp '+' exp . [$end, '+', '-', '/']
	9264	2 | exp . '-' exp
	9265	3 | exp . '*' exp
	9266	4 | exp . '/' exp
	9267
	9268	'*' shift, and go to state 6
	9269	'/' shift, and go to state 7
	9270
	9271	'/' [reduce using rule 1 (exp)]
	9272	$default reduce using rule 1 (exp)
	9273	@end example
	9274
	9275	Note however that while @samp{NUM + NUM / NUM} is ambiguous (which results in
	9276	the conflicts on @samp{/}), @samp{NUM + NUM * NUM} is not: the conflict was
	9277	solved thanks to associativity and precedence directives. If invoked with
	9278	@option{--report=solved}, Bison includes information about the solved
	9279	conflicts in the report:
	9280
	9281	@example
	9282	Conflict between rule 1 and token '+' resolved as reduce (%left '+').
	9283	Conflict between rule 1 and token '-' resolved as reduce (%left '-').
	9284	Conflict between rule 1 and token '*' resolved as shift ('+' < '*').
	9285	@end example
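
The decision procedure behind these messages can be sketched in a few lines of C. This illustrates the precedence rules only and is not Bison's actual implementation: the type and function names are invented for the example, and precedence level 0 is assumed to mean that no precedence was declared.

```c
/* Associativity of a token, as declared with %left, %right or
   %nonassoc.  ASSOC_NONE means no associativity is known.  */
enum assoc { ASSOC_NONE, ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONASSOC };

enum resolution { RESOLVE_SHIFT, RESOLVE_REDUCE, RESOLVE_ERROR, UNRESOLVED };

/* Decide a shift/reduce conflict between a rule of precedence
   RULE_PREC and a lookahead token of precedence TOKEN_PREC and
   associativity TOKEN_ASSOC.  Level 0 stands for "no precedence".  */
enum resolution
resolve (int rule_prec, int token_prec, enum assoc token_assoc)
{
  if (rule_prec == 0 || token_prec == 0)
    return UNRESOLVED;          /* reported as a conflict */
  if (token_prec > rule_prec)
    return RESOLVE_SHIFT;
  if (token_prec < rule_prec)
    return RESOLVE_REDUCE;

  /* Same level: associativity decides.  */
  switch (token_assoc)
    {
    case ASSOC_LEFT:
      return RESOLVE_REDUCE;
    case ASSOC_RIGHT:
      return RESOLVE_SHIFT;
    case ASSOC_NONASSOC:
      return RESOLVE_ERROR;
    default:
      return UNRESOLVED;
    }
}
```

With @file{calc.y}, rule 1 and the tokens @samp{+} and @samp{-} sit at level 1 and @samp{*} at level 2, so @code{resolve (1, 1, ASSOC_LEFT)} yields a reduction and @code{resolve (1, 2, ASSOC_LEFT)} a shift. Since @samp{/} has no precedence, @code{resolve (1, 0, ASSOC_NONE)} leaves the conflict unresolved, which is why it still appears in the report.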
	9286
	9287
	9288	The remaining states are similar:
	9289
	9290	@example
	9291	@group
	9292	State 9
	9293
	9294	1 exp: exp . '+' exp
	9295	2 | exp . '-' exp
	9296	2 | exp '-' exp .
	9297	3 | exp . '*' exp
	9298	4 | exp . '/' exp
	9299
	9300	'*' shift, and go to state 6
	9301	'/' shift, and go to state 7
	9302
	9303	'/' [reduce using rule 2 (exp)]
	9304	$default reduce using rule 2 (exp)
	9305	@end group
	9306
	9307	@group
	9308	State 10
	9309
	9310	1 exp: exp . '+' exp
	9311	2 | exp . '-' exp
	9312	3 | exp . '*' exp
	9313	3 | exp '*' exp .
	9314	4 | exp . '/' exp
	9315
	9316	'/' shift, and go to state 7
	9317
	9318	'/' [reduce using rule 3 (exp)]
	9319	$default reduce using rule 3 (exp)
	9320	@end group
	9321
	9322	@group
	9323	State 11
	9324
	9325	1 exp: exp . '+' exp
	9326	2 | exp . '-' exp
	9327	3 | exp . '*' exp
	9328	4 | exp . '/' exp
	9329	4 | exp '/' exp .
	9330
	9331	'+' shift, and go to state 4
	9332	'-' shift, and go to state 5
	9333	'*' shift, and go to state 6
	9334	'/' shift, and go to state 7
	9335
	9336	'+' [reduce using rule 4 (exp)]
	9337	'-' [reduce using rule 4 (exp)]
	9338	'*' [reduce using rule 4 (exp)]
	9339	'/' [reduce using rule 4 (exp)]
	9340	$default reduce using rule 4 (exp)
	9341	@end group
	9342	@end example
	9343
	9344	@noindent
	9345	Observe that state 11 contains conflicts not only due to the lack of
	9346	precedence of @samp{/} with respect to @samp{+}, @samp{-}, and @samp{*}, but
	9347	also because the associativity of @samp{/} is not specified.
	9348
	9349	Bison may also produce an HTML version of this output, via an XML file and
	9350	XSLT processing (@pxref{Xml,,Visualizing your parser in multiple formats}).
	9351
	9352	@c ================================================= Graphical Representation
	9353
	9354	@node Graphviz
	9355	@section Visualizing Your Parser
	9356	@cindex dot
	9357
	9358	As another means to gain better understanding of the shift/reduce
	9359	automaton corresponding to the Bison parser, a DOT file can be generated. Note
	9360	that debugging a real grammar with this is tedious at best, and impractical
	9361	most of the time, because the generated files are huge (the generation of
	9362	a PDF or PNG file from it will take very long, and more often than not it will
	9363	fail due to memory exhaustion). This option was rather designed for beginners,
	9364	to help them understand LR parsers.
	9365
	9366	This file is generated when the @option{--graph} option is specified
	9367	(@pxref{Invocation, , Invoking Bison}). Its name is made by removing
	9368	@samp{.tab.c} or @samp{.c} from the parser implementation file name, and
	9369	adding @samp{.dot} instead. If the grammar file is @file{foo.y}, the
	9370	Graphviz output file is called @file{foo.dot}. A DOT file may also be
	9371	produced via an XML file and XSLT processing (@pxref{Xml,,Visualizing your
	9372	parser in multiple formats}).
	9373
	9374
	9375	The following grammar file, @file{rr.y}, will be used in the sequel:
	9376
	9377	@example
	9378	%%
	9379	@group
	9380	exp: a ";" | b ".";
	9381	a: "0";
	9382	b: "0";
	9383	@end group
	9384	@end example
	9385
	9386	The graphical output
	9387	@ifnotinfo
	9388	(see @ref{fig:graph})
	9389	@end ifnotinfo
	9390	is very similar to the textual one, and as such it is more easily understood by
	9391	making direct comparisons between them. @xref{Debugging, , Debugging Your
	9392	Parser}, for a detailed analysis of the textual report.
	9393
	9394	@ifnotinfo
	9395	@float Figure,fig:graph
	9396	@image{figs/example, 430pt}
	9397	@caption{A graphical rendering of the parser.}
	9398	@end float
	9399	@end ifnotinfo
	9400
	9401	@subheading Graphical Representation of States
	9402
	9403	The items (pointed rules) for each state are grouped together in graph nodes.
	9404	Their numbering is the same as in the verbose file. See the following points,
	9405	about transitions, for examples.
	9406
	9407	When invoked with @option{--report=lookaheads}, the lookahead tokens, when
	9408	needed, are shown next to the relevant rule between square brackets as a
	9409	comma separated list. This is the case in the figure for the representation of
	9410	reductions, below.
	9411
	9412	@sp 1
	9413
	9414	The transitions are represented as directed edges between the current and
	9415	the target states.
	9416
	9417	@subheading Graphical Representation of Shifts
	9418
	9419	Shifts are shown as solid arrows, labelled with the lookahead token for that
	9420	shift. The following describes a shift in the @file{rr.output} file:
	9421
	9422	@example
	9423	@group
	9424	State 3
	9425
	9426	1 exp: a . ";"
	9427
	9428	";" shift, and go to state 6
	9429	@end group
	9430	@end example
	9431
	9432	A Graphviz rendering of this portion of the graph could be:
	9433
	9434	@center @image{figs/example-shift, 100pt}
	9435
	9436	@subheading Graphical Representation of Reductions
	9437
	9438	Reductions are shown as solid arrows, leading to a diamond-shaped node
	9439	bearing the number of the reduction rule. The arrow is labelled with the
	9440	appropriate comma separated lookahead tokens. If the reduction is the default
	9441	action for the given state, there is no such label.
	9442
	9443	This is how reductions are represented in the verbose file @file{rr.output}:
	9444	@example
	9445	State 1
	9446
	9447	3 a: "0" . [";"]
	9448	4 b: "0" . ["."]
	9449
	9450	"." reduce using rule 4 (b)
	9451	$default reduce using rule 3 (a)
	9452	@end example
	9453
	9454	A Graphviz rendering of this portion of the graph could be:
	9455
	9456	@center @image{figs/example-reduce, 120pt}
	9457
	9458	When unresolved conflicts are present, because in deterministic parsing
	9459	only a single decision can be made, Bison can arbitrarily choose to disable a
	9460	reduction, see @ref{Shift/Reduce, , Shift/Reduce Conflicts}. Discarded actions
	9461	are distinguished by a red filling color on these nodes, just like how they are
	9462	reported between square brackets in the verbose file.
	9463
	9464	The reduction corresponding to the rule number 0 is the accepting
	9465	state. It is shown as a blue diamond, labelled ``Acc''.
	9466
	9467	@subheading Graphical Representation of Go Tos
	9468
	9469	The @samp{go to} jump transitions are represented as dotted lines bearing
	9470	the name of the rule being jumped to.
	9471
	9472	@c ================================================= XML
	9473
	9474	@node Xml
	9475	@section Visualizing your parser in multiple formats
	9476	@cindex xml
	9477
	9478	Bison supports two major report formats: textual output
	9479	(@pxref{Understanding, ,Understanding Your Parser}) when invoked
	9480	with option @option{--verbose}, and DOT
	9481	(@pxref{Graphviz,, Visualizing Your Parser}) when invoked with
	9482	option @option{--graph}. However,
	9483	another alternative is to output an XML file that may then be, with
	9484	@command{xsltproc}, rendered as either a raw text format equivalent to the
	9485	verbose file, or as an HTML version of the same file, with clickable
	9486	transitions, or even as a DOT file. The @file{.output} and DOT files obtained via
	9487	XSLT are identical to those obtained by invoking
	9488	@command{bison} with options @option{--verbose} or @option{--graph}.
	9489
	9490	The XML file is generated when the options @option{-x} or
	9491	@option{--xml[=FILE]} are specified, see @ref{Invocation,,Invoking Bison}.
	9492	If not specified, its name is made by removing @samp{.tab.c} or @samp{.c}
	9493	from the parser implementation file name, and adding @samp{.xml} instead.
	9494	For instance, if the grammar file is @file{foo.y}, the default XML output
	9495	file is @file{foo.xml}.
	9496
	9497	Bison ships with a @file{data/xslt} directory, containing XSL Transformation
	9498	files to apply to the XML file. Their names are unambiguous:
	9499
	9500	@table @file
	9501	@item xml2dot.xsl
	9502	Used to output a copy of the DOT visualization of the automaton.
	9503	@item xml2text.xsl
	9504	Used to output a copy of the @samp{.output} file.
	9505	@item xml2xhtml.xsl
	9506	Used to output an XHTML enhancement of the @samp{.output} file.
	9507	@end table
	9508
	9509	Sample usage (requires @command{xsltproc}):
	9510	@example
	9511	$ bison -x gr.y
	9512	@group
	9513	$ bison --print-datadir
	9514	/usr/local/share/bison
	9515	@end group
	9516	$ xsltproc /usr/local/share/bison/xslt/xml2xhtml.xsl gr.xml >gr.html
	9517	@end example
	9518
	9519	@c ================================================= Tracing
	9520
	9521	@node Tracing
	9522	@section Tracing Your Parser
	9523	@findex yydebug
	9524	@cindex debugging
	9525	@cindex tracing the parser
	9526
	9527	When a Bison grammar compiles properly but parses ``incorrectly'', the
	9528	@code{yydebug} parser-trace feature helps figure out why.
	9529
	9530	@menu
	9531	* Enabling Traces:: Activating run-time trace support
	9532	* Mfcalc Traces:: Extending @code{mfcalc} to support traces
	9533	* The YYPRINT Macro:: Obsolete interface for semantic value reports
	9534	@end menu
	9535
	9536	@node Enabling Traces
	9537	@subsection Enabling Traces
	9538	There are several means to enable compilation of trace facilities:
	9539
	9540	@table @asis
	9541	@item the macro @code{YYDEBUG}
	9542	@findex YYDEBUG
	9543	Define the macro @code{YYDEBUG} to a nonzero value when you compile the
	9544	parser. This is compliant with POSIX Yacc. You could use
	9545	@samp{-DYYDEBUG=1} as a compiler option or you could put @samp{#define
	9546	YYDEBUG 1} in the prologue of the grammar file (@pxref{Prologue, , The
	9547	Prologue}).
	9548
	9549	If the @code{%define} variable @code{api.prefix} is used (@pxref{Multiple
	9550	Parsers, ,Multiple Parsers in the Same Program}), for instance @samp{%define
	9551	api.prefix c}, then if @code{CDEBUG} is defined, its value controls the
	9552	tracing feature (enabled if and only if nonzero); otherwise tracing is
	9553	enabled if and only if @code{YYDEBUG} is nonzero.
	9554
	9555	@item the option @option{-t} (POSIX Yacc compliant)
	9556	@itemx the option @option{--debug} (Bison extension)
	9557	Use the @samp{-t} option when you run Bison (@pxref{Invocation, ,Invoking
	9558	Bison}). With @samp{%define api.prefix c}, it defines @code{CDEBUG} to 1,
	9559	otherwise it defines @code{YYDEBUG} to 1.
	9560
	9561	@item the directive @samp{%debug}
	9562	@findex %debug
	9563	Add the @code{%debug} directive (@pxref{Decl Summary, ,Bison Declaration
	9564	Summary}). This Bison extension is maintained for backward
	9565	compatibility with previous versions of Bison.
	9566
	9567	@item the variable @samp{parse.trace}
	9568	@findex %define parse.trace
	9569	Add the @samp{%define parse.trace} directive (@pxref{%define
	9570	Summary,,parse.trace}), or pass the @option{-Dparse.trace} option
	9571	(@pxref{Bison Options}). This is a Bison extension, which is especially
	9572	useful for languages that don't use a preprocessor. Unless POSIX and Yacc
	9573	portability matter to you, this is the preferred solution.
	9574	@end table
	9575
	9576	We suggest that you always enable the trace option so that debugging is
	9577	always possible.
	9578
	9579	@findex YYFPRINTF
	9580	The trace facility outputs messages with macro calls of the form
	9581	@code{YYFPRINTF (stderr, @var{format}, @var{args})} where
	9582	@var{format} and @var{args} are the usual @code{printf} format and variadic
	9583	arguments. If you define @code{YYDEBUG} to a nonzero value but do not
	9584	define @code{YYFPRINTF}, @code{<stdio.h>} is automatically included
	9585	and @code{YYFPRINTF} is defined to @code{fprintf}.
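
As an illustration, the prologue could supply its own @code{YYFPRINTF} to send traces somewhere other than @code{stderr}. The names @code{trace_file} and @code{trace_fprintf} below are hypothetical; the only assumption about the parser is the one stated above, namely that it calls @code{YYFPRINTF} with the same arguments as @code{fprintf}.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical destination for trace messages.  When it is a null
   pointer, the stream passed by the parser is used.  */
FILE *trace_file;

/* Same calling convention as fprintf, so that it can stand in for
   YYFPRINTF.  Return the number of characters written.  */
int
trace_fprintf (FILE *stream, const char *format, ...)
{
  va_list args;
  int written;

  va_start (args, format);
  written = vfprintf (trace_file ? trace_file : stream, format, args);
  va_end (args);
  return written;
}

#define YYFPRINTF trace_fprintf
```

Setting @code{trace_file} to a stream opened with @code{fopen} before calling @code{yyparse} would then collect the traces in that file.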
	9586
	9587	Once you have compiled the program with trace facilities, the way to
	9588	request a trace is to store a nonzero value in the variable @code{yydebug}.
	9589	You can do this by making the C code do it (in @code{main}, perhaps), or
	9590	you can alter the value with a C debugger.
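
A minimal sketch of the first approach follows. The option name @samp{-p} and the function @code{enable_trace_from_args} are assumptions made for the example, and @code{yydebug} is defined here only so that the fragment compiles by itself.

```c
#include <string.h>

/* Stand-in so that this sketch compiles by itself.  In a real program
   the Bison-generated parser defines yydebug; declare it with
   "extern int yydebug;" instead.  */
int yydebug;

/* Turn parser traces on if the hypothetical option "-p" appears among
   the command-line arguments.  Return the resulting value of yydebug.  */
int
enable_trace_from_args (int argc, char **argv)
{
  for (int i = 1; i < argc; i++)
    if (strcmp (argv[i], "-p") == 0)
      yydebug = 1;
  return yydebug;
}
```

@code{main} would call this function with its own arguments before calling @code{yyparse}.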
	9591
	9592	Each step taken by the parser when @code{yydebug} is nonzero produces a
	9593	line or two of trace information, written on @code{stderr}. The trace
	9594	messages tell you these things:
	9595
	9596	@itemize @bullet
	9597	@item
	9598	Each time the parser calls @code{yylex}, what kind of token was read.
	9599
	9600	@item
	9601	Each time a token is shifted, the depth and complete contents of the
	9602	state stack (@pxref{Parser States}).
	9603
	9604	@item
	9605	Each time a rule is reduced, which rule it is, and the complete contents
	9606	of the state stack afterward.
	9607	@end itemize
	9608
	9609	To make sense of this information, it helps to refer to the automaton
	9610	description file (@pxref{Understanding, ,Understanding Your Parser}).
	9611	This file shows the meaning of each state in terms of
	9612	positions in various rules, and also what each state will do with each
	9613	possible input token. As you read the successive trace messages, you
	9614	can see that the parser is functioning according to its specification in
	9615	the listing file. Eventually you will arrive at the place where
	9616	something undesirable happens, and you will see which parts of the
	9617	grammar are to blame.
	9618
	9619	The parser implementation file is a C/C++/Java program and you can use
	9620	debuggers on it, but it's not easy to interpret what it is doing. The
	9621	parser function is a finite-state machine interpreter, and aside from
	9622	the actions it executes the same code over and over. Only the values
	9623	of variables show where in the grammar it is working.
	9624
	9625	@node Mfcalc Traces
	9626	@subsection Enabling Debug Traces for @code{mfcalc}
	9627
	9628	The debugging information normally gives the token type of each token read,
	9629	but not its semantic value. The @code{%printer} directive allows you to
	9630	specify how semantic values are reported, see @ref{Printer Decl, , Printing
	9631	Semantic Values}. For backward compatibility, Yacc-like C parsers may also
	9632	use the @code{YYPRINT} macro (@pxref{The YYPRINT Macro, , The @code{YYPRINT}
	9633	Macro}), but its use is discouraged.
	9634
	9635	As a demonstration of @code{%printer}, consider the multi-function
	9636	calculator, @code{mfcalc} (@pxref{Multi-function Calc}). To enable run-time
	9637	traces, and semantic value reports, insert the following directives in its
	9638	prologue:
	9639
	9640	@comment file: mfcalc.y: 2
	9641	@example
	9642	/* Generate the parser description file. */
	9643	%verbose
	9644	/* Enable run-time traces (yydebug). */
	9645	%define parse.trace
	9646
	9647	/* Formatting semantic values. */
	9648	%printer @{ fprintf (yyoutput, "%s", $$->name); @} VAR;
	9649	%printer @{ fprintf (yyoutput, "%s()", $$->name); @} FNCT;
	9650	%printer @{ fprintf (yyoutput, "%g", $$); @} <double>;
	9651	@end example
	9652
	9653	The @code{%define} directive instructs Bison to generate run-time trace
	9654	support. Then, activation of these traces is controlled at run-time by the
	9655	@code{yydebug} variable, which is disabled by default. Because these traces
	9656	will refer to the ``states'' of the parser, it is helpful to ask for the
	9657	creation of a description of that parser; this is the purpose of the
	9658	(admittedly ill-named) @code{%verbose} directive.
	9659
	9660	The set of @code{%printer} directives demonstrates how to format the
	9661	semantic value in the traces. Note that the specification can be done
	9662	either on the symbol type (e.g., @code{VAR} or @code{FNCT}), or on the type
	9663	tag: since @code{<double>} is the type for both @code{NUM} and @code{exp},
	9664	this printer will be used for them.
	9665
	9666	Here is a sample of the information provided by run-time traces. The traces
	9667	are sent onto standard error.
	9668
	9669	@example
	9670	$ @kbd{echo 'sin(1-1)' | ./mfcalc -p}
	9671	Starting parse
	9672	Entering state 0
	9673	Reducing stack by rule 1 (line 34):
	9674	-> $$ = nterm input ()
	9675	Stack now 0
	9676	Entering state 1
	9677	@end example
	9678
	9679	@noindent
	9680	This first batch shows a specific feature of this grammar: the first rule
	9681	(which is in line 34 of @file{mfcalc.y}) can be reduced without even having
	9682	to look for the first token. The resulting left-hand symbol (@code{$$}) is
	9683	a valueless (@samp{()}) @code{input} nonterminal (@code{nterm}).
	9684
	9685	Then the parser calls the scanner.
	9686	@example
	9687	Reading a token: Next token is token FNCT (sin())
	9688	Shifting token FNCT (sin())
	9689	Entering state 6
	9690	@end example
	9691
	9692	@noindent
	9693	That token (@code{token}) is a function (@code{FNCT}) whose value is
	9694	@samp{sin} as formatted per our @code{%printer} specification: @samp{sin()}.
	9695	The parser stores (@code{Shifting}) that token, and others, until it can do
	9696	something about it.
	9697
	9698	@example
	9699	Reading a token: Next token is token '(' ()
	9700	Shifting token '(' ()
	9701	Entering state 14
	9702	Reading a token: Next token is token NUM (1.000000)
	9703	Shifting token NUM (1.000000)
	9704	Entering state 4
	9705	Reducing stack by rule 6 (line 44):
	9706	$1 = token NUM (1.000000)
	9707	-> $$ = nterm exp (1.000000)
	9708	Stack now 0 1 6 14
	9709	Entering state 24
	9710	@end example
	9711
	9712	@noindent
	9713	The previous reduction demonstrates the @code{%printer} directive for
	9714	@code{<double>}: both the token @code{NUM} and the resulting nonterminal
	9715	@code{exp} have @samp{1} as value.
	9716
	9717	@example
	9718	Reading a token: Next token is token '-' ()
	9719	Shifting token '-' ()
	9720	Entering state 17
	9721	Reading a token: Next token is token NUM (1.000000)
	9722	Shifting token NUM (1.000000)
	9723	Entering state 4
	9724	Reducing stack by rule 6 (line 44):
	9725	$1 = token NUM (1.000000)
	9726	-> $$ = nterm exp (1.000000)
	9727	Stack now 0 1 6 14 24 17
	9728	Entering state 26
	9729	Reading a token: Next token is token ')' ()
	9730	Reducing stack by rule 11 (line 49):
	9731	$1 = nterm exp (1.000000)
	9732	$2 = token '-' ()
	9733	$3 = nterm exp (1.000000)
	9734	-> $$ = nterm exp (0.000000)
	9735	Stack now 0 1 6 14
	9736	Entering state 24
	9737	@end example
	9738
	9739	@noindent
	9740	The rule for the subtraction was just reduced. The parser is about to
	9741	discover the end of the call to @code{sin}.
	9742
	9743	@example
	9744	Next token is token ')' ()
	9745	Shifting token ')' ()
	9746	Entering state 31
	9747	Reducing stack by rule 9 (line 47):
	9748	$1 = token FNCT (sin())
	9749	$2 = token '(' ()
	9750	$3 = nterm exp (0.000000)
	9751	$4 = token ')' ()
	9752	-> $$ = nterm exp (0.000000)
	9753	Stack now 0 1
	9754	Entering state 11
	9755	@end example
	9756
	9757	@noindent
	9758	Finally, the end-of-line allows the parser to complete the computation and
	9759	display its result.
	9760
	9761	@example
	9762	Reading a token: Next token is token '\n' ()
	9763	Shifting token '\n' ()
	9764	Entering state 22
	9765	Reducing stack by rule 4 (line 40):
	9766	$1 = nterm exp (0.000000)
	9767	$2 = token '\n' ()
	9768	@result{} 0
	9769	-> $$ = nterm line ()
	9770	Stack now 0 1
	9771	Entering state 10
	9772	Reducing stack by rule 2 (line 35):
	9773	$1 = nterm input ()
	9774	$2 = nterm line ()
	9775	-> $$ = nterm input ()
	9776	Stack now 0
	9777	Entering state 1
	9778	@end example
	9779
	9780	The parser has returned to state 1, in which it is waiting for the next
	9781	expression to evaluate, or for the end-of-file token, which causes the
	9782	completion of the parsing.
	9783
	9784	@example
	9785	Reading a token: Now at end of input.
	9786	Shifting token $end ()
	9787	Entering state 2
	9788	Stack now 0 1 2
	9789	Cleanup: popping token $end ()
	9790	Cleanup: popping nterm input ()
	9791	@end example
	9792
	9793
	9794	@node The YYPRINT Macro
	9795	@subsection The @code{YYPRINT} Macro
	9796
	9797	@findex YYPRINT
	9798	Before @code{%printer} support, semantic values could be displayed using the
	9799	@code{YYPRINT} macro, which works only for terminal symbols and only with
	9800	the @file{yacc.c} skeleton.
	9801
	9802	@deffn {Macro} YYPRINT (@var{stream}, @var{token}, @var{value});
	9803	@findex YYPRINT
	9804	If you define @code{YYPRINT}, it should take three arguments. The parser
	9805	will pass a standard I/O stream, the numeric code for the token type, and
	9806	the token value (from @code{yylval}).
	9807
	9808	For @file{yacc.c} only. Obsoleted by @code{%printer}.
	9809	@end deffn
	9810
	9811	Here is an example of @code{YYPRINT} suitable for the multi-function
	9812	calculator (@pxref{Mfcalc Declarations, ,Declarations for @code{mfcalc}}):
	9813
	9814	@example
	9815	%@{
	9816	static void print_token_value (FILE *, int, YYSTYPE);
	9817	#define YYPRINT(File, Type, Value) \
	9818	print_token_value (File, Type, Value)
	9819	%@}
	9820
	9821	@dots{} %% @dots{} %% @dots{}
	9822
	9823	static void
	9824	print_token_value (FILE *file, int type, YYSTYPE value)
	9825	@{
	9826	  if (type == VAR)
	9827	    fprintf (file, "%s", value.tptr->name);
	9828	  else if (type == NUM)
	9829	    fprintf (file, "%g", value.val);
	9830	@}
	9831	@end example
	9832
	9833	@c ================================================= Invoking Bison
	9834
	9835	@node Invocation
	9836	@chapter Invoking Bison
	9837	@cindex invoking Bison
	9838	@cindex Bison invocation
	9839	@cindex options for invoking Bison
	9840
	9841	The usual way to invoke Bison is as follows:
	9842
	9843	@example
	9844	bison @var{infile}
	9845	@end example
	9846
	9847	Here @var{infile} is the grammar file name, which usually ends in
	9848	@samp{.y}. The parser implementation file's name is made by replacing
	9849	the @samp{.y} with @samp{.tab.c} and removing any leading directory.
	9850	Thus, @samp{bison foo.y} yields @file{foo.tab.c}, and
	9851	@samp{bison hack/foo.y} also yields @file{foo.tab.c}. If you are
	9852	writing C++ code instead of C in your grammar file, you can
	9853	name it @file{foo.ypp} or @file{foo.y++}. The output files
	9854	then take an extension that mirrors the one given as input
	9855	(respectively @file{foo.tab.cpp} and @file{foo.tab.c++}). This
	9856	feature takes effect with all options that manipulate file names, such as
	9857	@samp{-o} or @samp{-d}.
	9858
	9859	For example:
	9860
	9861	@example
	9862	bison -d @var{infile.yxx}
	9863	@end example
	9864	@noindent
	9865	will produce @file{infile.tab.cxx} and @file{infile.tab.hxx}, and
	9866
	9867	@example
	9868	bison -d -o @var{output.c++} @var{infile.y}
	9869	@end example
	9870	@noindent
	9871	will produce @file{output.c++} and @file{output.h++}.
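
The default naming rule described above (that is, without @option{-o}) can be
sketched in C as follows. This is only an illustration of the documented
behavior, not Bison's own code: the function name
@code{default_parser_file_name} is hypothetical, and the fallback for names
that do not end in @samp{.y@var{suffix}} is an assumption of the sketch.

```c
#include <stdio.h>
#include <string.h>

/* Drop any leading directory, then turn an extension of the form
   ".y<suffix>" into ".tab.c<suffix>".  Names without such an extension
   simply get ".tab.c" appended (an assumption of this sketch).
   Returns 0 on success, -1 if OUT is too small.  */
int
default_parser_file_name (char const *grammar, char *out, size_t size)
{
  char const *base = strrchr (grammar, '/');
  char const *dot;
  char const *suffix = "";
  size_t stem_len;
  int n;

  base = base ? base + 1 : grammar;
  dot = strrchr (base, '.');
  if (dot && dot[1] == 'y')
    {
      stem_len = (size_t) (dot - base);
      suffix = dot + 2;         /* "", "pp", "++", "xx", ...  */
    }
  else
    stem_len = strlen (base);

  n = snprintf (out, size, "%.*s.tab.c%s", (int) stem_len, base, suffix);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

The header name produced by @option{-d} follows the same pattern, with
@samp{h} in place of @samp{c}.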
	9872
	9873	For compatibility with POSIX, the standard Bison
	9874	distribution also contains a shell script called @command{yacc} that
	9875	invokes Bison with the @option{-y} option.
	9876
	9877	@menu
	9878	* Bison Options:: All the options described in detail,
	9879	in alphabetical order by short options.
	9880	* Option Cross Key:: Alphabetical list of long options.
	9881	* Yacc Library:: Yacc-compatible @code{yyerror} and @code{main}.
	9882	@end menu
	9883
	9884	@node Bison Options
	9885	@section Bison Options
	9886
	9887	Bison supports both traditional single-letter options and mnemonic long
	9888	option names. Long option names are indicated with @samp{--} instead of
	9889	@samp{-}. Abbreviations for option names are allowed as long as they
	9890	are unique. When a long option takes an argument, like
	9891	@samp{--file-prefix}, connect the option name and the argument with
	9892	@samp{=}.
	9893
	9894	Here is a list of options that can be used with Bison, alphabetized by
	9895	short option. It is followed by a cross key alphabetized by long
	9896	option.
	9897
	9898	@c Please, keep this ordered as in 'bison --help'.
	9899	@noindent
	9900	Operation modes:
	9901	@table @option
	9902	@item -h
	9903	@itemx --help
	9904	Print a summary of the command-line options to Bison and exit.
	9905
	9906	@item -V
	9907	@itemx --version
	9908	Print the version number of Bison and exit.
	9909
	9910	@item --print-localedir
	9911	Print the name of the directory containing locale-dependent data.
	9912
	9913	@item --print-datadir
	9914	Print the name of the directory containing skeletons and XSLT.
	9915
	9916	@item -y
	9917	@itemx --yacc
	9918	Act more like the traditional Yacc command. This can cause different
	9919	diagnostics to be generated, and may change behavior in other minor
	9920	ways. Most importantly, imitate Yacc's output file name conventions,
	9921	so that the parser implementation file is called @file{y.tab.c}, and
	9922	the other outputs are called @file{y.output} and @file{y.tab.h}.
	9923	Also, if generating a deterministic parser in C, generate
	9924	@code{#define} statements in addition to an @code{enum} to associate
	9925	token numbers with token names. Thus, the following shell script can
	9926	substitute for Yacc, and the Bison distribution contains such a script
	9927	for compatibility with POSIX:
	9928
	9929	@example
	9930	#! /bin/sh
	9931	bison -y "$@@"
	9932	@end example
	9933
	9934	The @option{-y}/@option{--yacc} option is intended for use with
	9935	traditional Yacc grammars. If your grammar uses a Bison extension
	9936	like @samp{%glr-parser}, Bison might not be Yacc-compatible even if
	9937	this option is specified.
	9938
	9939	@item -W [@var{category}]
	9940	@itemx --warnings[=@var{category}]
	9941	Output warnings falling in @var{category}. @var{category} can be one
	9942	of:
	9943	@table @code
	9944	@item midrule-values
	9945	Warn about mid-rule values that are set but not used within any of the actions
	9946	of the parent rule.
	9947	For example, warn about unused @code{$2} in:
	9948
	9949	@example
	9950	exp: '1' @{ $$ = 1; @} '+' exp @{ $$ = $1 + $4; @};
	9951	@end example
	9952
	9953	Also warn about mid-rule values that are used but not set.
	9954	For example, warn about unset @code{$$} in the mid-rule action in:
	9955
	9956	@example
	9957	exp: '1' @{ $1 = 1; @} '+' exp @{ $$ = $2 + $4; @};
	9958	@end example
	9959
	9960	These warnings are not enabled by default since they sometimes prove to
	9961	be false alarms in existing grammars employing the Yacc constructs
	9962	@code{$0} or @code{$-@var{n}} (where @var{n} is some positive integer).
	9963
	9964	@item yacc
	9965	Incompatibilities with POSIX Yacc.
	9966
	9967	@item conflicts-sr
	9968	@itemx conflicts-rr
	9969	S/R and R/R conflicts. These warnings are enabled by default. However, if
	9970	the @code{%expect} or @code{%expect-rr} directive is specified, an
	9971	unexpected number of conflicts is an error, and an expected number of
	9972	conflicts is not reported, so @option{-W} and @option{--warnings} then have
	9973	no effect on the conflict report.
	9974
	9975	@item deprecated
	9976	Deprecated constructs whose support will be removed in future versions of
	9977	Bison.
	9978
	9979	@item empty-rule
	9980	Empty rules without @code{%empty}. @xref{Empty Rules}. Disabled by
	9981	default, but enabled by uses of @code{%empty}, unless
	9982	@option{-Wno-empty-rule} was specified.
	9983
	9984	@item precedence
	9985	Useless precedence and associativity directives. Disabled by default.
	9986
	9987	Consider for instance the following grammar:
	9988
	9989	@example
	9990	@group
	9991	%nonassoc "="
	9992	%left "+"
	9993	%left "*"
	9994	%precedence "("
	9995	@end group
	9996	%%
	9997	@group
	9998	stmt:
	9999	  exp
	10000	| "var" "=" exp
	10001	;
	10002	@end group
	10003
	10004	@group
	10005	exp:
	10006	  exp "+" exp
	10007	| exp "*" "num"
	10008	| "(" exp ")"
	10009	| "num"
	10010	;
	10011	@end group
	10012	@end example
	10013
	10014	Bison reports:
	10015
	10016	@c cannot leave the location and the [-Wprecedence] for lack of
	10017	@c width in PDF.
	10018	@example
	10019	@group
	10020	warning: useless precedence and associativity for "="
	10021	 %nonassoc "="
	10022	           ^^^
	10023	@end group
	10024	@group
	10025	warning: useless associativity for "*", use %precedence
	10026	 %left "*"
	10027	       ^^^
	10028	@end group
	10029	@group
	10030	warning: useless precedence for "("
	10031	 %precedence "("
	10032	             ^^^
	10033	@end group
	10034	@end example
	10035
	10036	One would get the exact same parser with the following directives instead:
	10037
	10038	@example
	10039	@group
	10040	%left "+"
	10041	%precedence "*"
	10042	@end group
	10043	@end example
	10044
	10045	@item other
	10046	All warnings not categorized above. These warnings are enabled by default.
	10047
	10048	This category is provided merely for the sake of completeness. Future
	10049	releases of Bison may move warnings from this category to new, more specific
	10050	categories.
	10051
	10052	@item all
	10053	All the warnings except @code{yacc}.
	10054
	10055	@item none
	10056	Turn off all the warnings.
	10057
	10058	@item error
	10059	See @option{-Werror}, below.
	10060	@end table
	10061
	10062	A category can be turned off by prefixing its name with @samp{no-}. For
	10063	instance, @option{-Wno-yacc} will hide the warnings about
	10064	POSIX Yacc incompatibilities.
	10065
	10066	@item -Werror[=@var{category}]
	10067	@itemx -Wno-error[=@var{category}]
	10068	Enable warnings falling in @var{category}, and treat them as errors. If no
	10069	@var{category} is given, it defaults to making all enabled warnings into errors.
	10070
	10071	@var{category} is the same as for @option{--warnings}, with the exception that
	10072	it may not be prefixed with @samp{no-} (see above).
	10073
	10074	The @option{-Wno-error} form deactivates the error treatment for
	10075	@var{category}. However, the warning itself is neither disabled nor enabled by
	10076	this option.
	10077
	10078	Note that the precedence of the @samp{=} and @samp{,} operators is such that
	10079	the following commands are @emph{not} equivalent, as the first will not treat
	10080	S/R conflicts as errors.
	10081
	10082	@example
	10083	$ bison -Werror=yacc,conflicts-sr input.y
	10084	$ bison -Werror=yacc,error=conflicts-sr input.y
	10085	@end example
	10086
	10087	@item -f [@var{feature}]
	10088	@itemx --feature[=@var{feature}]
	10089	Activate miscellaneous @var{feature}. @var{feature} can be one of:
	10090	@table @code
	10091	@item caret
	10092	@itemx diagnostics-show-caret
	10093	Show caret errors, in a manner similar to GCC's
	10094	@option{-fdiagnostics-show-caret}, or Clang's @option{-fcaret-diagnostics}. The
	10095	location provided with the message is used to quote the corresponding line of
	10096	the source file, underlining the important part of it with carets (^). Here is
	10097	an example, using the following file @file{in.y}:
	10098
	10099	@example
	10100	%type <ival> exp
	10101	%%
	10102	exp: exp '+' exp @{ $exp = $1 + $2; @};
	10103	@end example
	10104
	10105	When invoked with @option{-fcaret} (or nothing), Bison will report:
	10106
	10107	@example
	10108	@group
	10109	in.y:3.20-23: error: ambiguous reference: '$exp'
	10110	 exp: exp '+' exp @{ $exp = $1 + $2; @};
	10111	                    ^^^^
	10112	@end group
	10113	@group
	10114	in.y:3.1-3: refers to: $exp at $$
	10115	 exp: exp '+' exp @{ $exp = $1 + $2; @};
	10116	 ^^^
	10117	@end group
	10118	@group
	10119	in.y:3.6-8: refers to: $exp at $1
	10120	 exp: exp '+' exp @{ $exp = $1 + $2; @};
	10121	      ^^^
	10122	@end group
	10123	@group
	10124	in.y:3.14-16: refers to: $exp at $3
	10125	 exp: exp '+' exp @{ $exp = $1 + $2; @};
	10126	              ^^^
	10127	@end group
	10128	@group
	10129	in.y:3.32-33: error: $2 of 'exp' has no declared type
	10130	 exp: exp '+' exp @{ $exp = $1 + $2; @};
	10131	                                ^^
	10132	@end group
	10133	@end example
	10134
	10135	Whereas, when invoked with @option{-fno-caret}, Bison will only report:
	10136
	10137	@example
	10138	@group
	10139	in.y:3.20-23: error: ambiguous reference: ‘$exp’
	10140	in.y:3.1-3: refers to: $exp at $$
	10141	in.y:3.6-8: refers to: $exp at $1
	10142	in.y:3.14-16: refers to: $exp at $3
	10143	in.y:3.32-33: error: $2 of ‘exp’ has no declared type
	10144	@end group
	10145	@end example
	10146
	10147	This option is activated by default.
	10148
	10149	@end table
	10150	@end table
	10151
	10152	@noindent
	10153	Tuning the parser:
	10154
	10155	@table @option
	10156	@item -t
	10157	@itemx --debug
	10158	In the parser implementation file, define the macro @code{YYDEBUG} to
	10159	1 if it is not already defined, so that the debugging facilities are
	10160	compiled. @xref{Tracing, ,Tracing Your Parser}.
	10161
	10162	@item -D @var{name}[=@var{value}]
	10163	@itemx --define=@var{name}[=@var{value}]
	10164	@itemx -F @var{name}[=@var{value}]
	10165	@itemx --force-define=@var{name}[=@var{value}]
	10166	Each of these is equivalent to @samp{%define @var{name} "@var{value}"}
	10167	(@pxref{%define Summary}) except that Bison processes multiple
	10168	definitions for the same @var{name} as follows:
	10169
	10170	@itemize
	10171	@item
	10172	Bison quietly ignores all command-line definitions for @var{name} except
	10173	the last.
	10174	@item
	10175	If that command-line definition is specified by a @code{-D} or
	10176	@code{--define}, Bison reports an error for any @code{%define}
	10177	definition for @var{name}.
	10178	@item
	10179	If that command-line definition is specified by a @code{-F} or
	10180	@code{--force-define} instead, Bison quietly ignores all @code{%define}
	10181	definitions for @var{name}.
	10182	@item
	10183	Otherwise, Bison reports an error if there are multiple @code{%define}
	10184	definitions for @var{name}.
	10185	@end itemize
	10186
	10187	You should avoid using @code{-F} and @code{--force-define} in your
	10188	makefiles unless you are confident that it is safe to quietly ignore
	10189	any conflicting @code{%define} that may be added to the grammar file.
	10190
	10191	@item -L @var{language}
	10192	@itemx --language=@var{language}
	10193	Specify the programming language for the generated parser, as if
	10194	@code{%language} was specified (@pxref{Decl Summary, , Bison Declaration
	10195	Summary}). Currently supported languages include C, C++, and Java.
	10196	@var{language} is case-insensitive.
	10197
	10198	@item --locations
	10199	Pretend that @code{%locations} was specified. @xref{Decl Summary}.
	10200
	10201	@item -p @var{prefix}
	10202	@itemx --name-prefix=@var{prefix}
	10203	Pretend that @code{%name-prefix "@var{prefix}"} was specified (@pxref{Decl
	10204	Summary}). Obsoleted by @code{-Dapi.prefix=@var{prefix}}. @xref{Multiple
	10205	Parsers, ,Multiple Parsers in the Same Program}.
	10206
	10207	@item -l
	10208	@itemx --no-lines
	10209	Don't put any @code{#line} preprocessor commands in the parser
	10210	implementation file. Ordinarily Bison puts them in the parser
	10211	implementation file so that the C compiler and debuggers will
	10212	associate errors with your source file, the grammar file. This option
	10213	causes them to associate errors with the parser implementation file,
	10214	treating it as an independent source file in its own right.
	10215
	10216	@item -S @var{file}
	10217	@itemx --skeleton=@var{file}
	10218	Specify the skeleton to use, similar to @code{%skeleton}
	10219	(@pxref{Decl Summary, , Bison Declaration Summary}).
	10220
	10221	@c You probably don't need this option unless you are developing Bison.
	10222	@c You should use @option{--language} if you want to specify the skeleton for a
	10223	@c different language, because it is clearer and because it will always
	10224	@c choose the correct skeleton for non-deterministic or push parsers.
	10225
	10226	If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton
	10227	file in the Bison installation directory.
	10228	If it does, @var{file} is an absolute file name or a file name relative to the
	10229	current working directory.
	10230	This is similar to how most shells resolve commands.
	10231
	10232	@item -k
	10233	@itemx --token-table
	10234	Pretend that @code{%token-table} was specified. @xref{Decl Summary}.
	10235	@end table
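
The rules given above for multiple @option{-D}/@option{-F} and @code{%define}
definitions of the same variable can be modeled by the following sketch. It
illustrates the documented behavior and is not Bison's implementation; all
type and function names (@code{cli_def}, @code{resolve_define}, and so on) are
hypothetical.

```c
#include <stddef.h>

/* -D/--define versus -F/--force-define.  */
enum def_kind { DEF_D, DEF_F };

struct cli_def { enum def_kind kind; char const *value; };

enum outcome { USE_VALUE, REPORT_ERROR, UNDEFINED };

/* Decide what happens for one variable, given its command-line
   definitions (in command-line order) and its %define definitions in
   the grammar file.  On USE_VALUE, *VALUE is the definition retained.  */
enum outcome
resolve_define (struct cli_def const *cli, size_t n_cli,
                char const *const *grammar, size_t n_grammar,
                char const **value)
{
  if (n_cli > 0)
    {
      /* Only the last command-line definition counts.  */
      struct cli_def const *last = &cli[n_cli - 1];
      /* -D conflicts with any %define; -F silently overrides them.  */
      if (last->kind == DEF_D && n_grammar > 0)
        return REPORT_ERROR;
      *value = last->value;
      return USE_VALUE;
    }
  /* No command-line definition: at most one %define is allowed.  */
  if (n_grammar > 1)
    return REPORT_ERROR;
  if (n_grammar == 0)
    return UNDEFINED;
  *value = grammar[0];
  return USE_VALUE;
}
```

Under this model, a trailing @option{-F} always wins, which is why the text
above advises against using it casually in makefiles.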
	10236
	10237	@noindent
	10238	Adjust the output:
	10239
	10240	@table @option
	10241	@item --defines[=@var{file}]
	10242	Pretend that @code{%defines} was specified, i.e., write an extra output
	10243	file containing macro definitions for the token type names defined in
	10244	the grammar, as well as a few other declarations. @xref{Decl Summary}.
	10245
	10246	@item -d
	10247	This is the same as @code{--defines} except @code{-d} does not accept a
	10248	@var{file} argument since POSIX Yacc requires that @code{-d} can be bundled
	10249	with other short options.
	10250
	10251	@item -b @var{file-prefix}
	10252	@itemx --file-prefix=@var{prefix}
	10253	Pretend that @code{%file-prefix} was specified, i.e., specify prefix to use
	10254	for all Bison output file names. @xref{Decl Summary}.
	10255
	10256	@item -r @var{things}
	10257	@itemx --report=@var{things}
	10258	Write an extra output file containing a verbose description of the
	10259	comma-separated list of @var{things}, chosen among:
	10260
	10261	@table @code
	10262	@item state
	10263	Description of the grammar, conflicts (resolved and unresolved), and
	10264	parser's automaton.
	10265
	10266	@item itemset
	10267	Implies @code{state} and augments the description of the automaton with
	10268	the full set of items for each state, instead of its core only.
	10269
	10270	@item lookahead
	10271	Implies @code{state} and augments the description of the automaton with
	10272	each rule's lookahead set.
	10273
	10274	@item solved
	10275	Implies @code{state}. Explain how conflicts were solved thanks to
	10276	precedence and associativity directives.
	10277
	10278	@item all
	10279	Enable all the items.
	10280
	10281	@item none
	10282	Do not generate the report.
	10283	@end table
	10284
	10285	@item --report-file=@var{file}
	10286	Specify the @var{file} for the verbose description.
	10287
	10288	@item -v
	10289	@itemx --verbose
	10290	Pretend that @code{%verbose} was specified, i.e., write an extra output
	10291	file containing verbose descriptions of the grammar and
	10292	parser. @xref{Decl Summary}.
	10293
	10294	@item -o @var{file}
	10295	@itemx --output=@var{file}
	10296	Specify the @var{file} for the parser implementation file.
	10297
	10298	The other output files' names are constructed from @var{file} as
	10299	described under the @samp{-v} and @samp{-d} options.
	10300
	10301	@item -g [@var{file}]
	10302	@itemx --graph[=@var{file}]
	10303	Output a graphical representation of the parser's
	10304	automaton computed by Bison, in @uref{http://www.graphviz.org/, Graphviz}
	10305	@uref{http://www.graphviz.org/doc/info/lang.html, DOT} format.
	10306	@code{@var{file}} is optional.
	10307	If omitted and the grammar file is @file{foo.y}, the output file will be
	10308	@file{foo.dot}.
	10309
	10310	@item -x [@var{file}]
	10311	@itemx --xml[=@var{file}]
	10312	Output an XML report of the parser's automaton computed by Bison.
	10313	@code{@var{file}} is optional.
	10314	If omitted and the grammar file is @file{foo.y}, the output file will be
	10315	@file{foo.xml}.
	10316	(The current XML schema is experimental and may evolve.
	10317	More user feedback will help to stabilize it.)
	10318	@end table
	10319
	10320	@node Option Cross Key
	10321	@section Option Cross Key
	10322
	10323	Here is a list of options, alphabetized by long option, to help you find
	10324	the corresponding short option and directive.
	10325
	10326	@multitable {@option{--force-define=@var{name}[=@var{value}]}} {@option{-F @var{name}[=@var{value}]}} {@code{%nondeterministic-parser}}
	10327	@headitem Long Option @tab Short Option @tab Bison Directive
	10328	@include cross-options.texi
	10329	@end multitable
	10330
	10331	@node Yacc Library
	10332	@section Yacc Library
	10333
	10334	The Yacc library contains default implementations of the
	10335	@code{yyerror} and @code{main} functions. These default
	10336	implementations are normally not useful, but POSIX requires
	10337	them. To use the Yacc library, link your program with the
	10338	@option{-ly} option. Note that Bison's implementation of the Yacc
	10339	library is distributed under the terms of the GNU General
	10340	Public License (@pxref{Copying}).
	10341
	10342	If you use the Yacc library's @code{yyerror} function, you should
	10343	declare @code{yyerror} as follows:
	10344
	10345	@example
	10346	int yyerror (char const *);
	10347	@end example
	10348
	10349	Bison ignores the @code{int} value returned by this @code{yyerror}.
	10350	If you use the Yacc library's @code{main} function, your
	10351	@code{yyparse} function should have the following type signature:
	10352
	10353	@example
	10354	int yyparse (void);
	10355	@end example
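
For illustration, a @code{yyerror} compatible with the declaration above could
look like the following sketch. It is an assumption about what such a default
implementation does (print the message and return), not a copy of the
library's source.

```c
#include <stdio.h>

/* Yacc-library-style yyerror: report MSG on standard error and return
   an int, which the parser ignores.  */
int
yyerror (char const *msg)
{
  fprintf (stderr, "%s\n", msg);
  return 0;
}
```

A matching @code{main} would presumably do little more than call
@code{yyparse ()} and return its status.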
	10356
	10357	@c ================================================= C++ Bison
	10358
	10359	@node Other Languages
	10360	@chapter Parsers Written In Other Languages
	10361
	10362	@menu
	10363	* C++ Parsers:: The interface to generate C++ parser classes
	10364	* Java Parsers:: The interface to generate Java parser classes
	10365	@end menu
	10366
	10367	@node C++ Parsers
	10368	@section C++ Parsers
	10369
	10370	@menu
	10371	* C++ Bison Interface:: Asking for C++ parser generation
	10372	* C++ Semantic Values:: %union vs. C++
	10373	* C++ Location Values:: The position and location classes
	10374	* C++ Parser Interface:: Instantiating and running the parser
	10375	* C++ Scanner Interface:: Exchanges between yylex and parse
	10376	* A Complete C++ Example:: Demonstrating their use
	10377	@end menu
	10378
	10379	@node C++ Bison Interface
	10380	@subsection C++ Bison Interface
	10381	@c - %skeleton "lalr1.cc"
	10382	@c - Always pure
	10383	@c - initial action
	10384
	10385	The C++ deterministic parser is selected using the skeleton directive,
	10386	@samp{%skeleton "lalr1.cc"}, or the synonymous command-line option
	10387	@option{--skeleton=lalr1.cc}.
	10388	@xref{Decl Summary}.
	10389
	10390	When run, @command{bison} will create several entities in the @samp{yy}
	10391	namespace.
	10392	@findex %define api.namespace
	10393	Use the @samp{%define api.namespace} directive to change the namespace name,
	10394	see @ref{%define Summary,,api.namespace}. The various classes are generated
	10395	in the following files:
	10396
	10397	@table @file
	10398	@item position.hh
	10399	@itemx location.hh
	10400	The definition of the classes @code{position} and @code{location}, used for
	10401	location tracking when enabled. These files are not generated if the
	10402	@code{%define} variable @code{api.location.type} is defined. @xref{C++
	10403	Location Values}.
	10404
	10405	@item stack.hh
	10406	An auxiliary class @code{stack} used by the parser.
	10407
	10408	@item @var{file}.hh
	10409	@itemx @var{file}.cc
	10410	(Assuming the extension of the grammar file was @samp{.yy}.) The
	10411	declaration and implementation of the C++ parser class. The basename
	10412	and extension of these two files follow the same rules as with regular C
	10413	parsers (@pxref{Invocation}).
	10414
	10415	The header is @emph{mandatory}; you must either pass
	10416	@option{-d}/@option{--defines} to @command{bison}, or use the
	10417	@samp{%defines} directive.
	10418	@end table
	10419
	10420	All these files are documented using Doxygen; run @command{doxygen}
	10421	for complete and accurate documentation.
	10422
	10423	@node C++ Semantic Values
	10424	@subsection C++ Semantic Values
	10425	@c - No objects in unions
	10426	@c - YYSTYPE
	10427	@c - Printer and destructor
	10428
	10429	Bison supports two different means to handle semantic values in C++. One is
	10430	similar to the C interface, and relies on unions (@pxref{C++ Unions}). As C++
	10431	practitioners know, unions are inconvenient in C++; therefore another
	10432	approach is provided, based on variants (@pxref{C++ Variants}).
	10433
	10434	@menu
	10435	* C++ Unions:: Semantic values cannot be objects
	10436	* C++ Variants:: Using objects as semantic values
	10437	@end menu
	10438
	10439	@node C++ Unions
	10440	@subsubsection C++ Unions
	10441
	10442	The @code{%union} directive works as for C, see @ref{Union Decl, ,The
	10443	Union Declaration}. In particular it produces a genuine
	10444	@code{union}, which has a few specific features in C++.
	10445	@itemize @minus
	10446	@item
	10447	The type @code{YYSTYPE} is defined but its use is discouraged: rather
	10448	you should refer to the parser's encapsulated type
	10449	@code{yy::parser::semantic_type}.
	10450	@item
	10451	Non-POD (Plain Old Data) types cannot be used. C++ forbids any
	10452	instance of classes with constructors in unions: only @emph{pointers}
	10453	to such objects are allowed.
	10454	@end itemize
	10455
	10456	Because objects have to be stored via pointers, memory is not
	10457	reclaimed automatically: using the @code{%destructor} directive is the
	10458	only means to avoid leaks. @xref{Destructor Decl, , Freeing Discarded
	10459	Symbols}.
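
The following C++ fragment is a minimal sketch of that constraint, not code
generated by Bison: @code{toy_value}, @code{make_string},
@code{discard_string} and the counter @code{live_strings} are hypothetical
names introduced for illustration. It shows a union that can hold a
@code{std::string} only through a pointer, and the explicit @code{delete}
that a @samp{%destructor @{ delete $$; @} <sval>} declaration would have to
perform.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a %union-generated type: only POD members,
// so a std::string has to be held through a pointer.
union toy_value
{
  int ival;
  std::string* sval;
};

// Counts strings that were allocated and not yet released (illustration only).
static int live_strings = 0;

// What a scanner action does when it produces a string token.
inline toy_value make_string (const char* s)
{
  assert (s);
  toy_value v;
  v.sval = new std::string (s);
  ++live_strings;
  return v;
}

// What a `%destructor { delete $$; } <sval>' action amounts to.
inline void discard_string (toy_value& v)
{
  delete v.sval;
  v.sval = 0;
  --live_strings;
}
```

Without the call to @code{discard_string}, nothing would ever release the
string: the union itself has no destructor to do it.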
	10460
	10461	@node C++ Variants
	10462	@subsubsection C++ Variants
	10463
	10464	Bison provides a @emph{variant} based implementation of semantic values for
	10465	C++. This alleviates all the limitations reported in the previous section,
	10466	and in particular, object types can be used without pointers.
	10467
	10468	To enable variant-based semantic values, set @code{%define} variable
	10469	@code{variant} (@pxref{%define Summary,, variant}). Once this is defined,
	10470	@code{%union} is ignored, and instead of using the name of the fields of the
	10471	@code{%union} to ``type'' the symbols, use genuine types.
	10472
	10473	For instance, instead of
	10474
	10475	@example
	10476	%union
	10477	@{
	10478	int ival;
	10479	std::string* sval;
	10480	@}
	10481	%token <ival> NUMBER;
	10482	%token <sval> STRING;
	10483	@end example
	10484
	10485	@noindent
	10486	write
	10487
	10488	@example
	10489	%token <int> NUMBER;
	10490	%token <std::string> STRING;
	10491	@end example
	10492
	10493	@code{STRING} is no longer a pointer, which should greatly simplify the user
	10494	actions in the grammar and in the scanner (in particular the memory
	10495	management).
	10496
	10497	Since C++ features destructors, and since it is customary to specialize
	10498	@code{operator<<} to support uniform printing of values, variants also
	10499	typically simplify Bison printers and destructors.
	10500
	10501	Variants are stricter than unions. When based on unions, you may play any
	10502	dirty game with @code{yylval}, say storing an @code{int}, reading a
	10503	@code{char*}, and then storing a @code{double} in it. This is no longer
	10504	possible with variants: they must be initialized, then assigned to, and
	10505	eventually, destroyed.
	10506
	10507	@deftypemethod {semantic_type} {T&} build<T> ()
	10508	Initialize, but leave empty. Returns a reference to the storage where the
	10509	actual value may be stored. Requires that the variant was not initialized yet.
	10510	@end deftypemethod
	10511
	10512	@deftypemethod {semantic_type} {T&} build<T> (const T& @var{t})
	10513	Initialize, and copy-construct from @var{t}.
	10514	@end deftypemethod
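
To make the ``initialize, assign, destroy'' discipline concrete, here is a
minimal sketch of a variant whose type tag is external. It is not Bison's
implementation: the class @code{toy_variant}, its 64-byte buffer, and the
members @code{as} and @code{destroy} are assumptions made for this
illustration; only the two @code{build} signatures follow the descriptions
above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical stand-in for semantic_type.  The type tag is external:
// the caller must remember which T is currently stored.
class toy_variant
{
public:
  toy_variant () : built_ (false) {}

  // Default-construct a T in the buffer and return a reference to it.
  template <typename T>
  T& build ()
  {
    static_assert (sizeof (T) <= sizeof (buffer_), "buffer too small");
    assert (!built_);
    built_ = true;
    return *new (buffer_) T ();
  }

  // Copy-construct a T from t.
  template <typename T>
  T& build (const T& t)
  {
    static_assert (sizeof (T) <= sizeof (buffer_), "buffer too small");
    assert (!built_);
    built_ = true;
    return *new (buffer_) T (t);
  }

  // Access the stored value; the caller supplies the type.
  template <typename T>
  T& as ()
  {
    assert (built_);
    return *reinterpret_cast<T*> (buffer_);
  }

  // Run T's destructor; the variant may then be built again.
  template <typename T>
  void destroy ()
  {
    as<T> ().~T ();
    built_ = false;
  }

private:
  alignas (std::max_align_t) unsigned char buffer_[64];
  bool built_;
};
```

Storing an @code{int} and then reading a @code{char*} is still possible to
write with this sketch, but building twice without an intervening
@code{destroy} trips an assertion, which is the kind of check a stricter
interface makes possible.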
	10515
	10516
	10517	@strong{Warning}: We do not use Boost.Variant, for two reasons. First, it
	10518	appeared unacceptable to require Boost on the user's machine (i.e., the
	10519	machine on which the generated parser will be compiled, not the machine on
	10520	which @command{bison} was run). Second, for each possible semantic value,
	10521	Boost.Variant not only stores the value, but also a tag specifying its
	10522	type. But the parser already ``knows'' the type of the semantic value, so
	10523	that would be duplicating the information.
	10524
	10525	Therefore we developed light-weight variants whose type tag is external (so
	10526	they are really like @code{union}s for C++ actually). But our code is much
	10527	less mature than Boost.Variant. So there are a number of limitations in
	10528	(the current implementation of) variants:
	10529	@itemize
	10530	@item
	10531	Alignment must be enforced: values should be aligned in memory according to
	10532	the most demanding type. Computing the smallest alignment possible requires
	10533	meta-programming techniques that are not currently implemented in Bison, and
	10534	therefore, since, as far as we know, @code{double} is the most demanding
	10535	type on all platforms, alignments are enforced for @code{double} whatever
	10536	types are actually used. This may waste space in some cases.
	10537
	10538	@item
	10539	There might be portability issues we are not aware of.
	10540	@end itemize
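
As an illustration of the kind of meta-programming the alignment item refers
to, the following sketch computes the strictest alignment among a list of
types at compile time. It assumes a C++11 compiler (@code{alignof},
variadic templates), which Bison's generated code does not require, and the
name @code{max_align} is hypothetical.

```cpp
#include <cstddef>

// Strictest alignment among a non-empty list of types (C++11).
template <typename T, typename... Ts>
struct max_align
{
  static constexpr std::size_t value =
    (alignof (T) > max_align<Ts...>::value)
    ? alignof (T)
    : max_align<Ts...>::value;
};

// A single type: its own alignment.
template <typename T>
struct max_align<T>
{
  static constexpr std::size_t value = alignof (T);
};
```

A buffer declared with @samp{alignas (max_align<int, char>::value)} would
then be aligned for @code{int} only, instead of unconditionally for
@code{double}.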
	10541
	10542	As far as we know, these limitations @emph{can} be alleviated. All it takes
	10543	is some time and/or some talented C++ hacker willing to contribute to Bison.
	10544
	10545	@node C++ Location Values
	10546	@subsection C++ Location Values
	10547	@c - %locations
	10548	@c - class Position
	10549	@c - class Location
	10550	@c - %define filename_type "const symbol::Symbol"
	10551
	10552	When the directive @code{%locations} is used, the C++ parser supports
	10553	location tracking, see @ref{Tracking Locations}.
	10554
	10555	By default, two auxiliary classes define a @code{position}, a single point
	10556	in a file, and a @code{location}, a range composed of a pair of
	10557	@code{position}s (possibly spanning several files). But if the
	10558	@code{%define} variable @code{api.location.type} is defined, then these
	10559	classes will not be generated, and the user defined type will be used.
	10560
	10561	@tindex uint
	10562	In this section @code{uint} is an abbreviation for @code{unsigned int}: in
	10563	genuine code only the latter is used.
	10564
	10565	@menu
	10566	* C++ position:: One point in the source file
	10567	* C++ location:: Two points in the source file
	10568	* User Defined Location Type:: Required interface for locations
	10569	@end menu
	10570
	10571	@node C++ position
	10572	@subsubsection C++ @code{position}
	10573
	10574	@deftypeop {Constructor} {position} {} position (std::string* @var{file} = 0, uint @var{line} = 1, uint @var{col} = 1)
	10575	Create a @code{position} denoting a given point. Note that @code{file} is
	10576	not reclaimed when the @code{position} is destroyed: memory management must be
	10577	handled elsewhere.
	10578	@end deftypeop
	10579
	10580	@deftypemethod {position} {void} initialize (std::string* @var{file} = 0, uint @var{line} = 1, uint @var{col} = 1)
	10581	Reset the position to the given values.
	10582	@end deftypemethod
	10583
	10584	@deftypeivar {position} {std::string*} filename
	10585	The name of the file. It will always be handled as a pointer: the
	10586	parser will never duplicate nor deallocate it. As an experimental
	10587	feature you may change it to @samp{@var{type}*} using @samp{%define
	10588	filename_type "@var{type}"}.
	10589	@end deftypeivar
	10590
	10591	@deftypeivar {position} {uint} line
	10592	The line, starting at 1.
	10593	@end deftypeivar
	10594
	10595	@deftypemethod {position} {void} lines (int @var{height} = 1)
	10596	If @var{height} is nonzero, advance by @var{height} lines, resetting the
	10597	column number. The resulting line number cannot be less than 1.
	10598	@end deftypemethod
	10599
	10600	@deftypeivar {position} {uint} column
	10601	The column, starting at 1.
	10602	@end deftypeivar
	10603
	10604	@deftypemethod {position} {void} columns (int @var{width} = 1)
	10605	Advance by @var{width} columns, without changing the line number. The
	10606	resulting column number cannot be less than 1.
	10607	@end deftypemethod
	10608
	10609	@deftypemethod {position} {position&} operator+= (int @var{width})
	10610	@deftypemethodx {position} {position} operator+ (int @var{width})
	10611	@deftypemethodx {position} {position&} operator-= (int @var{width})
	10612	@deftypemethodx {position} {position} operator- (int @var{width})
	10613	Various forms of syntactic sugar for @code{columns}.
	10614	@end deftypemethod
	10615
	10616	@deftypemethod {position} {bool} operator== (const position& @var{that})
	10617	@deftypemethodx {position} {bool} operator!= (const position& @var{that})
	10618	Whether @code{*this} and @code{that} denote equal/different positions.
	10619	@end deftypemethod
	10620
	10621	@deftypefun {std::ostream&} operator<< (std::ostream& @var{o}, const position& @var{p})
	10622	Report @var{p} on @var{o} like this:
	10623	@samp{@var{file}:@var{line}.@var{column}}, or
	10624	@samp{@var{line}.@var{column}} if @var{file} is null.
	10625	@end deftypefun
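
The behavior described above can be summarized by the following sketch. It
is not the generated @file{position.hh}: @code{toy_position}, the private
helper @code{add} and the function @code{to_text} are hypothetical names,
and @code{unsigned} is written out where this section says @code{uint}.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in mirroring the documented interface of position.
struct toy_position
{
  std::string* filename;  // not owned: never copied nor deleted here
  unsigned line;          // starts at 1
  unsigned column;        // starts at 1

  explicit toy_position (std::string* f = 0, unsigned l = 1, unsigned c = 1)
    : filename (f), line (l), column (c)
  {
    assert (1 <= l && 1 <= c);
  }

  // Advance by HEIGHT lines and reset the column, unless HEIGHT is zero.
  void lines (int height = 1)
  {
    if (height)
      {
        column = 1;
        line = add (line, height);
      }
  }

  // Advance by WIDTH columns on the same line.
  void columns (int width = 1) { column = add (column, width); }

private:
  // LHS + RHS, but never less than 1.
  static unsigned add (unsigned lhs, int rhs)
  {
    int res = static_cast<int> (lhs) + rhs;
    return res < 1 ? 1u : static_cast<unsigned> (res);
  }
};

// file:line.column, or line.column when there is no file name.
inline std::ostream& operator<< (std::ostream& o, const toy_position& p)
{
  if (p.filename)
    o << *p.filename << ':';
  return o << p.line << '.' << p.column;
}

// Convenience wrapper around operator<<.
inline std::string to_text (const toy_position& p)
{
  std::ostringstream os;
  os << p;
  return os.str ();
}
```

Note how a negative @var{width} or @var{height} moves backward but is
clamped so that neither the line nor the column drops below 1.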
	10626
	10627	@node C++ location
	10628	@subsubsection C++ @code{location}
	10629
	10630	@deftypeop {Constructor} {location} {} location (const position& @var{begin}, const position& @var{end})
	10631	Create a @code{location} from the endpoints of the range.
	10632	@end deftypeop
	10633
	10634	@deftypeop {Constructor} {location} {} location (const position& @var{pos} = position())
	10635	@deftypeopx {Constructor} {location} {} location (std::string* @var{file}, uint @var{line}, uint @var{col})
	10636	Create a @code{location} denoting an empty range located at a given point.
	10637	@end deftypeop
	10638
	10639	@deftypemethod {location} {void} initialize (std::string* @var{file} = 0, uint @var{line} = 1, uint @var{col} = 1)
	10640	Reset the location to an empty range at the given values.
	10641	@end deftypemethod
	10642
	10643	@deftypeivar {location} {position} begin
	10644	@deftypeivarx {location} {position} end
	10645	The first, inclusive, position of the range, and the first beyond.
	10646	@end deftypeivar
	10647
	10648	@deftypemethod {location} {void} columns (int @var{width} = 1)
	10649	@deftypemethodx {location} {void} lines (int @var{height} = 1)
	10650	Forwarded to the @code{end} position.
	10651	@end deftypemethod
	10652
	10653	@deftypemethod {location} {location} operator+ (const location& @var{end})
	10654	@deftypemethodx {location} {location} operator+ (int @var{width})
	10655	@deftypemethodx {location} {location} operator+= (int @var{width})
	10656	@deftypemethodx {location} {location} operator- (int @var{width})
	10657	@deftypemethodx {location} {location} operator-= (int @var{width})
	10658	Various forms of syntactic sugar.
	10659	@end deftypemethod
	10660
	10661	@deftypemethod {location} {void} step ()
	10662	Move @code{begin} onto @code{end}.
	10663	@end deftypemethod
	10664
	10665	@deftypemethod {location} {bool} operator== (const location& @var{that})
	10666	@deftypemethodx {location} {bool} operator!= (const location& @var{that})
	10667	Whether @code{*this} and @code{that} denote equal/different ranges of
	10668	positions.
	10669	@end deftypemethod
	10670
	10671	@deftypefun {std::ostream&} operator<< (std::ostream& @var{o}, const location& @var{p})
	10672	Report @var{p} on @var{o}, taking care of special cases such as: no
	10673	@code{filename} defined, or equal filename/line or column.
	10674	@end deftypefun
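
A scanner typically calls @code{step} at the start of each token and then
@code{columns} or @code{lines} to extend the range over the text it matched.
The sketch below shows that pattern on a whitespace-separated input. All
names (@code{toy_pos}, @code{toy_loc}, @code{last_word_location}) are
hypothetical, and the two structures implement only what the pattern needs.

```cpp
#include <cassert>
#include <string>

// Minimal hypothetical stand-ins for position and location.
struct toy_pos
{
  unsigned line;
  unsigned column;
  toy_pos () : line (1), column (1) {}
};

struct toy_loc
{
  toy_pos begin;  // first position of the range (inclusive)
  toy_pos end;    // first position beyond the range

  // Both are forwarded to the end position.
  void columns (unsigned width = 1) { end.column += width; }
  void lines (unsigned height = 1)
  {
    if (height)
      {
        end.line += height;
        end.column = 1;
      }
  }

  // Move begin onto end: start a new, empty range.
  void step () { begin = end; }
};

// Walk TEXT as a scanner would and return the location of its last word.
inline toy_loc last_word_location (const std::string& text)
{
  toy_loc loc, last;
  std::string::size_type i = 0;
  while (i < text.size ())
    {
      loc.step ();
      if (text[i] == '\n')
        {
          loc.lines (1);
          ++i;
        }
      else if (text[i] == ' ')
        {
          loc.columns (1);
          ++i;
        }
      else
        {
          std::string::size_type j = i;
          while (j < text.size () && text[j] != ' ' && text[j] != '\n')
            ++j;
          loc.columns (static_cast<unsigned> (j - i));
          last = loc;
          i = j;
        }
      // A range never ends before it begins.
      assert (loc.begin.line <= loc.end.line);
    }
  return last;
}
```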
	10675
	10676	@node User Defined Location Type
	10677	@subsubsection User Defined Location Type
	10678	@findex %define api.location.type
	10679
	10680	Instead of using the built-in types you may use the @code{%define} variable
	10681	@code{api.location.type} to specify your own type:
	10682
	10683	@example
	10684	%define api.location.type @var{LocationType}
	10685	@end example
	10686
	10687	The requirements over your @var{LocationType} are:
	10688	@itemize
	10689	@item
	10690	it must be copyable;
	10691
	10692	@item
	10693	in order to compute the (default) value of @code{@@$} in a reduction, the
	10694	parser basically runs
	10695	@example
	10696	@@$.begin = @@$1.begin;
	10697	@@$.end = @@$@var{N}.end; // The location of last right-hand side symbol.
	10698	@end example
	10699	@noindent
	10700	so there must be copyable @code{begin} and @code{end} members;
	10701
	10702	@item
	10703	alternatively you may redefine the computation of the default location, in
	10704	which case these members are not required (@pxref{Location Default Action});
	10705
	10706	@item
	10707	if traces are enabled, then there must exist an @samp{std::ostream&
	10708	operator<< (std::ostream& o, const @var{LocationType}& s)} function.
	10709	@end itemize
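
A type meeting these requirements can be quite small. The sketch below
tracks byte offsets instead of lines and columns; @code{offset_range} and
@code{default_location} are hypothetical names, and
@code{default_location} merely spells out in plain C++ what the default
computation of @code{@@$} amounts to.

```cpp
#include <cassert>
#include <ostream>

// Hypothetical user-defined location type: a range of byte offsets.
struct offset_range
{
  typedef unsigned long offset;
  offset begin;  // copyable, read by the default location computation
  offset end;    // copyable, read by the default location computation
  offset_range () : begin (0), end (0) {}
};

// Needed only when traces are enabled.
inline std::ostream& operator<< (std::ostream& o, const offset_range& r)
{
  return o << r.begin << '-' << r.end;
}

// The default location of a reduction whose right-hand side symbols have
// the locations rhs[0] ... rhs[n - 1].
inline offset_range default_location (const offset_range* rhs, int n)
{
  assert (rhs && 0 < n);
  offset_range res;
  res.begin = rhs[0].begin;
  res.end = rhs[n - 1].end;
  return res;
}
```

Such a type would be selected with @samp{%define api.location.type
"offset_range"}, together with a @samp{%code requires} block that includes
the header declaring it.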
	10710
	10711	@sp 1
	10712
	10713	In programs with several C++ parsers, you may also use the @code{%define}
	10714	variable @code{api.location.type} to share a common set of built-in
	10715	definitions for @code{position} and @code{location}. For instance, one
	10716	parser @file{master/parser.yy} might use:
	10717
	10718	@example
	10719	%defines
	10720	%locations
	10721	%define namespace "master::"
	10722	@end example
	10723
	10724	@noindent
	10725	to generate the @file{master/position.hh} and @file{master/location.hh}
	10726	files, reused by other parsers as follows:
	10727
	10728	@example
	10729	%define api.location.type "master::location"
	10730	%code requires @{ #include <master/location.hh> @}
	10731	@end example
	10732
	10733	@node C++ Parser Interface
	10734	@subsection C++ Parser Interface
	10735	@c - define parser_class_name
	10736	@c - Ctor
	10737	@c - parse, error, set_debug_level, debug_level, set_debug_stream,
	10738	@c debug_stream.
	10739	@c - Reporting errors
	10740
	10741	The output files @file{@var{output}.hh} and @file{@var{output}.cc}
	10742	declare and define the parser class in the namespace @code{yy}. The
	10743	class name defaults to @code{parser}, but may be changed using
	10744	@samp{%define parser_class_name "@var{name}"}. The interface of
	10745	this class is detailed below. It can be extended using the
	10746	@code{%parse-param} feature: its semantics is slightly changed since
	10747	it describes an additional member of the parser class, and an
	10748	additional argument for its constructor.
	10749
	10750	@defcv {Type} {parser} {semantic_type}
	10751	@defcvx {Type} {parser} {location_type}
	10752	The types for semantic values and locations (if enabled).
	10753	@end defcv
	10754
	10755	@defcv {Type} {parser} {token}
	10756	A structure that contains (only) the @code{yytokentype} enumeration, which
	10757	defines the tokens. To refer to the token @code{FOO},
	10758	use @code{yy::parser::token::FOO}. The scanner can use
	10759	@samp{typedef yy::parser::token token;} to ``import'' the token enumeration
	10760	(@pxref{Calc++ Scanner}).
	10761	@end defcv
	10762
	10763	@defcv {Type} {parser} {syntax_error}
	10764	This class derives from @code{std::runtime_error}. Throw instances of it
	10765	from the scanner or from the user actions to raise parse errors. This is
	10766	equivalent to first
	10767	invoking @code{error} to report the location and message of the syntax
	10768	error, and then invoking @code{YYERROR} to enter the error-recovery mode.
	10769	But contrary to @code{YYERROR} which can only be invoked from user actions
	10770	(i.e., written in the action itself), the exception can be thrown from
	10771	functions invoked from the user action.
	10772	@end defcv
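
The sketch below illustrates why an exception is convenient here. It does
not use the generated classes: @code{toy_location},
@code{toy_syntax_error}, @code{checked_div} and @code{run_div} are
hypothetical stand-ins. The helper @code{checked_div} plays the role of a
function called from a user action, and @code{run_div} plays the role of
the parser catching the exception and reporting it.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for location_type and parser::syntax_error.
struct toy_location
{
  unsigned line;
  unsigned column;
};

struct toy_syntax_error : std::runtime_error
{
  toy_syntax_error (const toy_location& l, const std::string& m)
    : std::runtime_error (m), location (l)
  {}
  toy_location location;
};

// A helper called from a user action.  YYERROR is not available here,
// but throwing is.
inline int checked_div (int lhs, int rhs, const toy_location& l)
{
  if (rhs == 0)
    throw toy_syntax_error (l, "division by zero");
  return lhs / rhs;
}

// Stand-in for the parser side: run the "action" and, on error, record
// the message instead of a result.
inline bool run_div (int lhs, int rhs, int& res, std::string& report)
{
  const toy_location here = { 3, 7 };
  try
    {
      res = checked_div (lhs, rhs, here);
      return true;
    }
  catch (const toy_syntax_error& e)
    {
      // The location travels with the exception.
      assert (e.location.line == here.line);
      report = e.what ();
      return false;
    }
}
```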
	10773
	10774	@deftypemethod {parser} {} parser (@var{type1} @var{arg1}, ...)
	10775	Build a new parser object. There are no arguments by default, unless
	10776	@samp{%parse-param @{@var{type1} @var{arg1}@}} was used.
	10777	@end deftypemethod
	10778
	10779	@deftypemethod {syntax_error} {} syntax_error (const location_type& @var{l}, const std::string& @var{m})
	10780	@deftypemethodx {syntax_error} {} syntax_error (const std::string& @var{m})
	10781	Instantiate a syntax-error exception.
	10782	@end deftypemethod
	10783
	10784	@deftypemethod {parser} {int} parse ()
	10785	Run the syntactic analysis, and return 0 on success, 1 otherwise.
	10786
	10787	@cindex exceptions
	10788	The whole function is wrapped in a @code{try}/@code{catch} block, so that
	10789	when an exception is thrown, the @code{%destructor}s are called to release
	10790	the lookahead symbol, and the symbols pushed on the stack.
	10791	@end deftypemethod
	10792
	10793	@deftypemethod {parser} {std::ostream&} debug_stream ()
	10794	@deftypemethodx {parser} {void} set_debug_stream (std::ostream& @var{o})
	10795	Get or set the stream used for tracing the parsing. It defaults to
	10796	@code{std::cerr}.
	10797	@end deftypemethod
	10798
	10799	@deftypemethod {parser} {debug_level_type} debug_level ()
	10800	@deftypemethodx {parser} {void} set_debug_level (debug_level_type @var{l})
	10801	Get or set the tracing level. Currently its value is either 0, no trace,
	10802	or nonzero, full tracing.
	10803	@end deftypemethod
	10804
	10805	@deftypemethod {parser} {void} error (const location_type& @var{l}, const std::string& @var{m})
	10806	@deftypemethodx {parser} {void} error (const std::string& @var{m})
	10807	The definition for this member function must be supplied by the user:
	10808	the parser uses it to report a parser error occurring at @var{l},
	10809	described by @var{m}. If location tracking is not enabled, the second
	10810	signature is used.
	10811	@end deftypemethod
	10812
	10813
	10814	@node C++ Scanner Interface
	10815	@subsection C++ Scanner Interface
	10816	@c - prefix for yylex.
	10817	@c - Pure interface to yylex
	10818	@c - %lex-param
	10819
	10820	The parser invokes the scanner by calling @code{yylex}. Contrary to C
	10821	parsers, C++ parsers are always pure: there is no point in using the
	10822	@samp{%define api.pure} directive. The actual interface with @code{yylex}
	10823	depends on whether you use unions or variants.
	10824
	10825	@menu
	10826	* Split Symbols:: Passing symbols as two/three components
	10827	* Complete Symbols:: Making symbols a whole
	10828	@end menu
	10829
	10830	@node Split Symbols
	10831	@subsubsection Split Symbols
	10832
	10833	The interface is as follows.
	10834
	10835	@deftypemethod {parser} {int} yylex (semantic_type* @var{yylval}, location_type* @var{yylloc}, @var{type1} @var{arg1}, ...)
	10836	@deftypemethodx {parser} {int} yylex (semantic_type* @var{yylval}, @var{type1} @var{arg1}, ...)
	10837	Return the next token. Its type is the return value, its semantic value and
	10838	location (if enabled) being @var{yylval} and @var{yylloc}. Invocations of
	10839	@samp{%lex-param @{@var{type1} @var{arg1}@}} yield additional arguments.
	10840	@end deftypemethod
	10841
	10842	Note that when using variants, the interface for @code{yylex} is the same,
	10843	but @code{yylval} is handled differently.
	10844
	10845	Regular union-based code in a Lex scanner typically looks like:
	10846
	10847	@example
	10848	[0-9]+ @{
	10849	yylval->ival = text_to_int (yytext);
	10850	return yy::parser::token::INTEGER;
	10851	@}
	10852	[a-z]+ @{
	10853	yylval->sval = new std::string (yytext);
	10854	return yy::parser::token::IDENTIFIER;
	10855	@}
	10856	@end example
	10857
	10858	Using variants, @code{yylval} is already constructed, but it is not
	10859	initialized. So the code would look like:
	10860
	10861	@example
	10862	[0-9]+ @{
	10863	yylval->build<int> () = text_to_int (yytext);
	10864	return yy::parser::token::INTEGER;
	10865	@}
	10866	[a-z]+ @{
	10867	yylval->build<std::string> () = yytext;
	10868	return yy::parser::token::IDENTIFIER;
	10869	@}
	10870	@end example
	10871
	10872	@noindent
	10873	or
	10874
	10875	@example
	10876	[0-9]+ @{
	10877	yylval->build (text_to_int (yytext));
	10878	return yy::parser::token::INTEGER;
	10879	@}
	10880	[a-z]+ @{
	10881	yylval->build<std::string> (yytext);
	10882	return yy::parser::token::IDENTIFIER;
	10883	@}
	10884	@end example
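
The same split-symbol protocol applies to a hand-written scanner. The
following sketch is self-contained and therefore declares its own stand-ins
in a hypothetical namespace @code{toy}: the token numbers, the union and
the function @code{toy_yylex} are assumptions for illustration, not what a
generated header contains. The two extra arguments play the role of
arguments added with @code{%lex-param}.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical stand-ins for what a generated header would provide.
namespace toy
{
  struct token
  {
    enum yytokentype { END = 0, INTEGER = 258, IDENTIFIER = 259 };
  };
  union semantic_type
  {
    int ival;
    std::string* sval;
  };
}

inline bool is_space (char c)
{ return std::isspace (static_cast<unsigned char> (c)) != 0; }
inline bool is_digit (char c)
{ return std::isdigit (static_cast<unsigned char> (c)) != 0; }
inline bool is_lower (char c)
{ return std::islower (static_cast<unsigned char> (c)) != 0; }

// Return the type of the next token of INPUT starting at POS, and store
// its semantic value, if any, in *YYLVAL.
inline int toy_yylex (toy::semantic_type* yylval,
                      const std::string& input, std::string::size_type& pos)
{
  assert (yylval);
  while (pos < input.size () && is_space (input[pos]))
    ++pos;
  if (pos == input.size ())
    return toy::token::END;

  const std::string::size_type start = pos;
  if (is_digit (input[pos]))
    {
      while (pos < input.size () && is_digit (input[pos]))
        ++pos;
      yylval->ival = std::atoi (input.substr (start, pos - start).c_str ());
      return toy::token::INTEGER;
    }
  if (is_lower (input[pos]))
    {
      while (pos < input.size () && is_lower (input[pos]))
        ++pos;
      // Heap-allocated: with unions, a %destructor must release it.
      yylval->sval = new std::string (input, start, pos - start);
      return toy::token::IDENTIFIER;
    }
  // Any other character is its own token.
  ++pos;
  return static_cast<unsigned char> (input[start]);
}
```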
	10885
	10886
	10887	@node Complete Symbols
	10888	@subsubsection Complete Symbols
	10889
	10890	If you specified both @code{%define api.value.type variant} and
	10891	@code{%define api.token.constructor},
	10892	the @code{parser} class also defines the class @code{parser::symbol_type}
	10893	which defines a @emph{complete} symbol, aggregating its type (i.e., the
	10894	traditional value returned by @code{yylex}), its semantic value (i.e., the
	10895	value passed in @code{yylval}), and possibly its location (@code{yylloc}).
	10896
	10897	@deftypemethod {symbol_type} {} symbol_type (token_type @var{type}, const semantic_type& @var{value}, const location_type& @var{location})
	10898	Build a complete terminal symbol whose token type is @var{type}, and whose
	10899	semantic value is @var{value}. If location tracking is enabled, also pass
	10900	the @var{location}.
	10901	@end deftypemethod
	10902
	10903	This interface is low-level and should not be used for two reasons. First,
	10904	it is inconvenient, as you still have to build the semantic value, which is
	10905	a variant, and second, because consistency is not enforced: as with unions,
	10906	it is still possible to give an integer as semantic value for a string.
	10907
	10908	So for each token type, Bison generates named constructors as follows.
	10909
	10910	@deftypemethod {symbol_type} {} make_@var{token} (const @var{value_type}& @var{value}, const location_type& @var{location})
	10911	@deftypemethodx {symbol_type} {} make_@var{token} (const location_type& @var{location})
	10912	Build a complete terminal symbol for the token type @var{token} (not
	10913	including the @code{api.token.prefix}) whose possible semantic value is
	10914	@var{value} of adequate @var{value_type}. If location tracking is enabled,
	10915	also pass the @var{location}.
	10916	@end deftypemethod
	10917
	10918	For instance, given the following declarations:
	10919
	10920	@example
	10921	%define api.token.prefix @{TOK_@}
	10922	%token <std::string> IDENTIFIER;
	10923	%token <int> INTEGER;
	10924	%token COLON;
	10925	@end example
	10926
	10927	@noindent
	10928	Bison generates the following functions:
	10929
	10930	@example
	10931	symbol_type make_IDENTIFIER(const std::string& v,
	10932	const location_type& loc);
	10933	symbol_type make_INTEGER(const int& v,
	10934	const location_type& loc);
	10935	symbol_type make_COLON(const location_type& loc);
	10936	@end example
	10937
	10938	@noindent
	10939	which should be used in a Lex-scanner as follows.
	10940
	10941	@example
	10942	[0-9]+ return yy::parser::make_INTEGER(text_to_int (yytext), loc);
	10943	[a-z]+ return yy::parser::make_IDENTIFIER(yytext, loc);
	10944	":" return yy::parser::make_COLON(loc);
	10945	@end example
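
To see why named constructors enforce consistency where the low-level
constructor does not, consider the following sketch. Everything in the
hypothetical namespace @code{toy} is a simplified stand-in: in particular
the real @code{symbol_type} stores its value in a variant, whereas this one
simply keeps one member per value type.

```cpp
#include <cassert>
#include <string>

namespace toy
{
  struct location_type
  {
    unsigned line;
    unsigned column;
  };

  // Simplified stand-in for a complete symbol.
  class symbol_type
  {
  public:
    enum kind_type { TOK_INTEGER, TOK_IDENTIFIER, TOK_COLON };

    // Low-level constructor: nothing ties K to the value actually meant.
    symbol_type (kind_type k, int i, const std::string& s,
                 const location_type& l)
      : kind_ (k), ival_ (i), sval_ (s), loc_ (l)
    {}

    kind_type kind () const { return kind_; }
    int as_int () const
    {
      assert (kind_ == TOK_INTEGER);
      return ival_;
    }
    const std::string& as_string () const
    {
      assert (kind_ == TOK_IDENTIFIER);
      return sval_;
    }
    const location_type& location () const { return loc_; }

  private:
    kind_type kind_;
    int ival_;
    std::string sval_;
    location_type loc_;
  };

  // Named constructors: the parameter type fixes the value type, so an
  // integer cannot be passed where an identifier is built.
  inline symbol_type make_INTEGER (const int& v, const location_type& loc)
  { return symbol_type (symbol_type::TOK_INTEGER, v, std::string (), loc); }

  inline symbol_type make_IDENTIFIER (const std::string& v,
                                      const location_type& loc)
  { return symbol_type (symbol_type::TOK_IDENTIFIER, 0, v, loc); }

  inline symbol_type make_COLON (const location_type& loc)
  { return symbol_type (symbol_type::TOK_COLON, 0, std::string (), loc); }
}
```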
	10946
	10947	Tokens that do not have an identifier are not accessible: you cannot simply
	10948	use characters such as @code{':'}, they must be declared with @code{%token}.
	10949
	10950	@node A Complete C++ Example
	10951	@subsection A Complete C++ Example
	10952
	10953	This section demonstrates the use of a C++ parser with a simple but
	10954	complete example. This example should be available on your system,
	10955	ready to compile, in the directory @file{.../bison/examples/calc++}. It
	10956	focuses on the use of Bison, therefore the design of the various C++
	10957	classes is very naive: no accessors, no encapsulation of members etc.
	10958	We will use a Lex scanner, and more precisely, a Flex scanner, to
	10959	demonstrate the various interactions. A hand-written scanner is
	10960	actually easier to interface with.
	10961
	10962	@menu
	10963	* Calc++ --- C++ Calculator:: The specifications
	10964	* Calc++ Parsing Driver:: An active parsing context
	10965	* Calc++ Parser:: A parser class
	10966	* Calc++ Scanner:: A pure C++ Flex scanner
	10967	* Calc++ Top Level:: Conducting the band
	10968	@end menu
	10969
	10970	@node Calc++ --- C++ Calculator
	10971	@subsubsection Calc++ --- C++ Calculator
	10972
	10973	Of course the grammar is dedicated to arithmetic: a single
	10974	expression, possibly preceded by variable assignments. An
	10975	environment containing possibly predefined variables such as
	10976	@code{one} and @code{two}, is exchanged with the parser. An example
	10977	of valid input follows.
	10978
	10979	@example
	10980	three := 3
	10981	seven := one + two * three
	10982	seven * seven
	10983	@end example
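
Assuming the environment predefines @code{one} as 1 and @code{two} as 2, as
the driver shown later in this section does, the result of this input can
be worked out by hand. The function @code{replay_sample} below is a
hypothetical helper that replays the three statements on a
@code{std::map}, with the usual precedence of @samp{*} over @samp{+}.

```cpp
#include <cassert>
#include <map>
#include <string>

// Replay the sample input by hand on an explicit environment.
inline int replay_sample ()
{
  std::map<std::string, int> variables;
  variables["one"] = 1;   // predefined
  variables["two"] = 2;   // predefined

  // three := 3
  variables["three"] = 3;
  // seven := one + two * three
  variables["seven"]
    = variables["one"] + variables["two"] * variables["three"];
  assert (variables["seven"] == 7);
  // seven * seven
  return variables["seven"] * variables["seven"];
}
```

The final expression therefore evaluates to 49.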
	10984
	10985	@node Calc++ Parsing Driver
	10986	@subsubsection Calc++ Parsing Driver
	10987	@c - An env
	10988	@c - A place to store error messages
	10989	@c - A place for the result
	10990
	10991	To support a pure interface with the parser (and the scanner) the
	10992	technique of the ``parsing context'' is convenient: a structure
	10993	containing all the data to exchange. Since, in addition to simply
	10994	launching the parsing, there are several auxiliary tasks to execute (open
	10995	the file for parsing, instantiate the parser etc.), we recommend
	10996	transforming the simple parsing context structure into a fully blown
	10997	@dfn{parsing driver} class.
	10998
	10999	The declaration of this driver class, @file{calc++-driver.hh}, is as
	11000	follows. The first part includes the CPP guard and imports the
	11001	required standard library components, and the declaration of the parser
	11002	class.
	11003
	11004	@comment file: calc++-driver.hh
	11005	@example
	11006	#ifndef CALCXX_DRIVER_HH
	11007	# define CALCXX_DRIVER_HH
	11008	# include <string>
	11009	# include <map>
	11010	# include "calc++-parser.hh"
	11011	@end example
	11012
	11013
	11014	@noindent
	11015	Then comes the declaration of the scanning function. Flex expects
	11016	the signature of @code{yylex} to be defined in the macro
	11017	@code{YY_DECL}, and the C++ parser expects it to be declared. We can
	11018	factor both as follows.
	11019
	11020	@comment file: calc++-driver.hh
	11021	@example
	11022	// Tell Flex the lexer's prototype ...
	11023	# define YY_DECL \
	11024	yy::calcxx_parser::symbol_type yylex (calcxx_driver& driver)
	11025	// ... and declare it for the parser's sake.
	11026	YY_DECL;
	11027	@end example
	11028
	11029	@noindent
	11030	The @code{calcxx_driver} class is then declared with its most obvious
	11031	members.
	11032
	11033	@comment file: calc++-driver.hh
	11034	@example
	11035	// Conducting the whole scanning and parsing of Calc++.
	11036	class calcxx_driver
	11037	@{
	11038	public:
	11039	calcxx_driver ();
	11040	virtual ~calcxx_driver ();
	11041
	11042	std::map<std::string, int> variables;
	11043
	11044	int result;
	11045	@end example
	11046
	11047	@noindent
	11048	To encapsulate the coordination with the Flex scanner, it is useful to have
	11049	member functions to open and close the scanning phase.
	11050
	11051	@comment file: calc++-driver.hh
	11052	@example
	11053	// Handling the scanner.
	11054	void scan_begin ();
	11055	void scan_end ();
	11056	bool trace_scanning;
	11057	@end example
	11058
	11059	@noindent
	11060	Similarly for the parser itself.
	11061
	11062	@comment file: calc++-driver.hh
	11063	@example
	11064	// Run the parser on file F.
	11065	// Return 0 on success.
	11066	int parse (const std::string& f);
	11067	// The name of the file being parsed.
	11068	// Used later to pass the file name to the location tracker.
	11069	std::string file;
	11070	// Whether parser traces should be generated.
	11071	bool trace_parsing;
	11072	@end example
	11073
	11074	@noindent
	11075	To demonstrate pure handling of parse errors, instead of simply
	11076	dumping them on the standard error output, we will pass them to the
	11077	compiler driver using the following two member functions. Finally, we
	11078	close the class declaration and CPP guard.
	11079
	11080	@comment file: calc++-driver.hh
	11081	@example
	11082	// Error handling.
	11083	void error (const yy::location& l, const std::string& m);
	11084	void error (const std::string& m);
	11085	@};
	11086	#endif // ! CALCXX_DRIVER_HH
	11087	@end example
	11088
	11089	The implementation of the driver is straightforward. The @code{parse}
	11090	member function deserves some attention. The @code{error} functions
	11091	are simple stubs; they should actually register the located error
	11092	messages and set error state.
	11093
	11094	@comment file: calc++-driver.cc
	11095	@example
	11096	#include "calc++-driver.hh"
	11097	#include "calc++-parser.hh"
	11098
	11099	calcxx_driver::calcxx_driver ()
	11100	: trace_scanning (false), trace_parsing (false)
	11101	@{
	11102	variables["one"] = 1;
	11103	variables["two"] = 2;
	11104	@}
	11105
	11106	calcxx_driver::~calcxx_driver ()
	11107	@{
	11108	@}
	11109
	11110	int
	11111	calcxx_driver::parse (const std::string &f)
	11112	@{
	11113	file = f;
	11114	scan_begin ();
	11115	yy::calcxx_parser parser (*this);
	11116	parser.set_debug_level (trace_parsing);
	11117	int res = parser.parse ();
	11118	scan_end ();
	11119	return res;
	11120	@}
	11121
	11122	void
	11123	calcxx_driver::error (const yy::location& l, const std::string& m)
	11124	@{
	11125	std::cerr << l << ": " << m << std::endl;
	11126	@}
	11127
	11128	void
	11129	calcxx_driver::error (const std::string& m)
	11130	@{
	11131	std::cerr << m << std::endl;
	11132	@}
	11133	@end example
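
As a sketch of what ``registering'' errors could look like, the following
class stores the messages instead of printing them. It is not part of the
Calc++ sources: @code{error_log}, @code{failed} and @code{messages} are
hypothetical names, and @code{toy_location} stands in for
@code{yy::location} so that the fragment compiles on its own.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for yy::location.
struct toy_location
{
  unsigned line;
  unsigned column;
};

// Collects located error messages and remembers that an error occurred.
class error_log
{
public:
  // Same two signatures as the driver's error functions.
  void error (const toy_location& l, const std::string& m)
  {
    std::ostringstream os;
    os << l.line << '.' << l.column << ": " << m;
    messages_.push_back (os.str ());
  }

  void error (const std::string& m)
  {
    messages_.push_back (m);
  }

  // The error state: true as soon as one message was registered.
  bool failed () const { return !messages_.empty (); }

  const std::vector<std::string>& messages () const { return messages_; }

private:
  std::vector<std::string> messages_;
};
```

A driver holding such an object could forward both of its @code{error}
overloads to it and let the caller decide when and where to print the
messages.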
	11134
	11135	@node Calc++ Parser
	11136	@subsubsection Calc++ Parser
	11137
	11138	The grammar file @file{calc++-parser.yy} starts by requesting the C++
	11139	deterministic parser skeleton and the creation of the parser header file,
	11140	and by specifying the name of the parser class. Because the C++ skeleton
	11141	changed several times, it is safer to require the version you designed
	11142	the grammar for.
	11143
	11144	@comment file: calc++-parser.yy
	11145	@example
	11146	%skeleton "lalr1.cc" /* -- C++ -- */
	11147	%require "@value{VERSION}"
	11148	%defines
	11149	%define parser_class_name "calcxx_parser"
	11150	@end example
	11151
	11152	@noindent
	11153	@findex %define api.token.constructor
	11154	@findex %define api.value.type variant
	11155	This example will use genuine C++ objects as semantic values, therefore, we
	11156	require the variant-based interface. To make sure we properly use it, we
	11157	enable assertions. To fully benefit from type-safety and more natural
	11158	definition of ``symbol'', we enable @code{api.token.constructor}.
	11159
	11160	@comment file: calc++-parser.yy
	11161	@example
	11162	%define api.token.constructor
	11163	%define api.value.type variant
	11164	%define parse.assert
	11165	@end example
	11166
	11167	@noindent
	11168	@findex %code requires
	11169	Then come the declarations/inclusions needed by the semantic values.
	11170	Because the parser uses the parsing driver and vice versa, both would like
	11171	to include the header of the other, which is, of course, insane. This
	11172	mutual dependency will be broken using forward declarations. Because the
	11173	driver's header needs detailed knowledge about the parser class (in
	11174	particular its inner types), it is the parser's header which will use a
	11175	forward declaration of the driver. @xref{%code Summary}.
	11176
	11177	@comment file: calc++-parser.yy
	11178	@example
	11179	%code requires
	11180	@{
	11181	# include <string>
	11182	class calcxx_driver;
	11183	@}
	11184	@end example
	11185
	11186	@noindent
	11187	The driver is passed by reference to the parser and to the scanner.
	11188	This provides a simple but effective pure interface, not relying on
	11189	global variables.
	11190
	11191	@comment file: calc++-parser.yy
	11192	@example
	11193	// The parsing context.
	11194	%param @{ calcxx_driver& driver @}
	11195	@end example
	11196
	11197	@noindent
	11198	Then we request location tracking, and initialize the
	11199	first location's file name. Afterward new locations are computed
	11200	relative to the previous locations: the file name will be
	11201	propagated.
	11202
	11203	@comment file: calc++-parser.yy
	11204	@example
	11205	%locations
	11206	%initial-action
	11207	@{
	11208	// Initialize the initial location.
	11209	@@$.begin.filename = @@$.end.filename = &driver.file;
	11210	@};
	11211	@end example
	11212
	11213	@noindent
	11214	Use the following two directives to enable parser tracing and verbose error
	11215	messages. However, verbose error messages can contain incorrect information
	11216	(@pxref{LAC}).
	11217
	11218	@comment file: calc++-parser.yy
	11219	@example
	11220	%define parse.trace
	11221	%define parse.error verbose
	11222	@end example
	11223
	11224	@noindent
	11225	@findex %code
	11226	The code between @samp{%code @{} and @samp{@}} is output in the
	11227	@file{*.cc} file; it needs detailed knowledge about the driver.
	11228
	11229	@comment file: calc++-parser.yy
	11230	@example
	11231	%code
	11232	@{
	11233	# include "calc++-driver.hh"
	11234	@}
	11235	@end example
	11236
	11237
	11238	@noindent
	11239	The token numbered as 0 corresponds to end of file; the following line
	11240	allows for nicer error messages referring to ``end of file'' instead of
	11241	``$end''. Similarly, user-friendly names are provided for each symbol. To
	11242	avoid name clashes in the generated files (@pxref{Calc++ Scanner}), prefix
	11243	tokens with @code{TOK_} (@pxref{%define Summary,,api.token.prefix}).
	11244
	11245	@comment file: calc++-parser.yy
	11246	@example
	11247	%define api.token.prefix @{TOK_@}
	11248	%token
	11249	END 0 "end of file"
	11250	ASSIGN ":="
	11251	MINUS "-"
	11252	PLUS "+"
	11253	STAR "*"
	11254	SLASH "/"
	11255	LPAREN "("
	11256	RPAREN ")"
	11257	;
	11258	@end example
	11259
	11260	@noindent
	11261	Since we use variant-based semantic values, @code{%union} is not used, and
	11262	both @code{%type} and @code{%token} expect genuine types, as opposed to type
	11263	tags.
	11264
	11265	@comment file: calc++-parser.yy
	11266	@example
	11267	%token <std::string> IDENTIFIER "identifier"
	11268	%token <int> NUMBER "number"
	11269	%type <int> exp
	11270	@end example
	11271
	11272	@noindent
	11273	No @code{%destructor} is needed to enable memory deallocation during error
	11274	recovery; the memory, for strings for instance, will be reclaimed by the
	11275	regular destructors. All the values are printed using their
	11276	@code{operator<<} (@pxref{Printer Decl, , Printing Semantic Values}).
	11277
	11278	@comment file: calc++-parser.yy
	11279	@example
	11280	%printer @{ yyoutput << $$; @} <*>;
	11281	@end example
	11282
	11283	@noindent
	11284	The grammar itself is straightforward (@pxref{Location Tracking Calc, ,
	11285	Location Tracking Calculator: @code{ltcalc}}).
	11286
	11287	@comment file: calc++-parser.yy
	11288	@example
	11289	%%
	11290	%start unit;
	11291	unit: assignments exp @{ driver.result = $2; @};
	11292
	11293	assignments:
	11294	  %empty                 @{@}
	11295	| assignments assignment @{@};
	11296
	11297	assignment:
	11298	"identifier" ":=" exp @{ driver.variables[$1] = $3; @};
	11299
	11300	%left "+" "-";
	11301	%left "*" "/";
	11302	exp:
	11303	  exp "+" exp   @{ $$ = $1 + $3; @}
	11304	| exp "-" exp   @{ $$ = $1 - $3; @}
	11305	| exp "*" exp   @{ $$ = $1 * $3; @}
	11306	| exp "/" exp   @{ $$ = $1 / $3; @}
	11307	| "(" exp ")"   @{ std::swap ($$, $2); @}
	11308	| "identifier"  @{ $$ = driver.variables[$1]; @}
	11309	| "number"      @{ std::swap ($$, $1); @};
	11310	%%
	11311	@end example
	11312
	11313	@noindent
	11314	Finally the @code{error} member function registers the errors to the
	11315	driver.
	11316
	11317	@comment file: calc++-parser.yy
	11318	@example
	11319	void
	11320	yy::calcxx_parser::error (const location_type& l,
	11321	const std::string& m)
	11322	@{
	11323	driver.error (l, m);
	11324	@}
	11325	@end example
	11326
	11327	@node Calc++ Scanner
	11328	@subsubsection Calc++ Scanner
	11329
	11330	The Flex scanner first includes the driver declaration, then the
	11331	parser's to get the set of defined tokens.
	11332
	11333	@comment file: calc++-scanner.ll
	11334	@example
	11335	%@{ /* -*- C++ -*- */
	11336	# include <cerrno>
	11337	# include <climits>
	11338	# include <cstdlib>
		# include <cstring> // strerror
	11339	# include <string>
	11340	# include "calc++-driver.hh"
	11341	# include "calc++-parser.hh"
	11342
	11343	// Work around an incompatibility in flex (at least versions
	11344	// 2.5.31 through 2.5.33): it generates code that does
	11345	// not conform to C89. See Debian bug 333231
	11346	// <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=333231>.
	11347	# undef yywrap
	11348	# define yywrap() 1
	11349
	11350	// The location of the current token.
	11351	static yy::location loc;
	11352	%@}
	11353	@end example
	11354
	11355	@noindent
	11356	Because there is no @code{#include}-like feature, we don't need
	11357	@code{yywrap}. We don't need @code{unput} either. Since we parse an
	11358	actual file rather than run an interactive session with the user,
	11359	the scanner can work in batch mode. Finally, we enable scanner tracing.
	11360
	11361	@comment file: calc++-scanner.ll
	11362	@example
	11363	%option noyywrap nounput batch debug noinput
	11364	@end example
	11365
	11366	@noindent
	11367	Abbreviations allow for more readable rules.
	11368
	11369	@comment file: calc++-scanner.ll
	11370	@example
	11371	id [a-zA-Z][a-zA-Z_0-9]*
	11372	int [0-9]+
	11373	blank [ \t]
	11374	@end example
	11375
	11376	@noindent
	11377	The following code suffices to track locations accurately. Each
	11378	time @code{yylex} is invoked, the begin position is moved onto the end
	11379	position. Then when a pattern is matched, its width is added to the end
	11380	column. When matching ends of lines, the end
	11381	cursor is adjusted, and each time blanks are matched, the begin cursor
	11382	is moved onto the end cursor to effectively ignore the blanks
	11383	preceding tokens. Comments would be treated the same way.
	11384
	11385	@comment file: calc++-scanner.ll
	11386	@example
	11387	@group
	11388	%@{
	11389	// Code run each time a pattern is matched.
	11390	# define YY_USER_ACTION loc.columns (yyleng);
	11391	%@}
	11392	@end group
	11393	%%
	11394	@group
	11395	%@{
	11396	// Code run each time yylex is called.
	11397	loc.step ();
	11398	%@}
	11399	@end group
	11400	@{blank@}+ loc.step ();
	11401	[\n]+ loc.lines (yyleng); loc.step ();
	11402	@end example
	11403
	11404	@noindent
	11405	The rules are simple. The driver is used to report errors.
	11406
	11407	@comment file: calc++-scanner.ll
	11408	@example
	11409	"-" return yy::calcxx_parser::make_MINUS(loc);
	11410	"+" return yy::calcxx_parser::make_PLUS(loc);
	11411	"*" return yy::calcxx_parser::make_STAR(loc);
	11412	"/" return yy::calcxx_parser::make_SLASH(loc);
	11413	"(" return yy::calcxx_parser::make_LPAREN(loc);
	11414	")" return yy::calcxx_parser::make_RPAREN(loc);
	11415	":=" return yy::calcxx_parser::make_ASSIGN(loc);
	11416
	11417	@group
	11418	@{int@}      @{
	11419	  errno = 0;
	11420	  long n = strtol (yytext, NULL, 10);
	11421	  if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE))
	11422	    driver.error (loc, "integer is out of range");
	11423	  return yy::calcxx_parser::make_NUMBER(n, loc);
	11424	@}
	11425	@end group
	11426	@{id@} return yy::calcxx_parser::make_IDENTIFIER(yytext, loc);
	11427	. driver.error (loc, "invalid character");
	11428	<<EOF>> return yy::calcxx_parser::make_END(loc);
	11429	%%
	11430	@end example
	11431
	11432	@noindent
	11433	Finally, because the scanner-related driver's member-functions depend
	11434	on the scanner's data, it is simpler to implement them in this file.
	11435
	11436	@comment file: calc++-scanner.ll
	11437	@example
	11438	@group
	11439	void
	11440	calcxx_driver::scan_begin ()
	11441	@{
	11442	  yy_flex_debug = trace_scanning;
	11443	  if (file.empty () || file == "-")
	11444	    yyin = stdin;
	11445	  else if (!(yyin = fopen (file.c_str (), "r")))
	11446	    @{
	11447	      error ("cannot open " + file + ": " + strerror(errno));
	11448	      exit (EXIT_FAILURE);
	11449	    @}
	11450	@}
	11451	@end group
	11452
	11453	@group
	11454	void
	11455	calcxx_driver::scan_end ()
	11456	@{
	11457	fclose (yyin);
	11458	@}
	11459	@end group
	11460	@end example
	11461
	11462	@node Calc++ Top Level
	11463	@subsubsection Calc++ Top Level
	11464
	11465	The top-level file, @file{calc++.cc}, poses no problem.
	11466
	11467	@comment file: calc++.cc
	11468	@example
	11469	#include <iostream>
	11470	#include "calc++-driver.hh"
	11471
	11472	@group
	11473	int
	11474	main (int argc, char *argv[])
	11475	@{
	11476	  int res = 0;
	11477	  calcxx_driver driver;
	11478	  for (int i = 1; i < argc; ++i)
	11479	    if (argv[i] == std::string ("-p"))
	11480	      driver.trace_parsing = true;
	11481	    else if (argv[i] == std::string ("-s"))
	11482	      driver.trace_scanning = true;
	11483	    else if (!driver.parse (argv[i]))
	11484	      std::cout << driver.result << std::endl;
	11485	    else
	11486	      res = 1;
	11487	  return res;
	11488	@}
	11489	@end group
	11490	@end example
	11491
	11492	@node Java Parsers
	11493	@section Java Parsers
	11494
	11495	@menu
	11496	* Java Bison Interface:: Asking for Java parser generation
	11497	* Java Semantic Values:: %type and %token vs. Java
	11498	* Java Location Values:: The position and location classes
	11499	* Java Parser Interface:: Instantiating and running the parser
	11500	* Java Scanner Interface:: Specifying the scanner for the parser
	11501	* Java Action Features:: Special features for use in actions
	11502	* Java Differences:: Differences between C/C++ and Java Grammars
	11503	* Java Declarations Summary:: List of Bison declarations used with Java
	11504	@end menu
	11505
	11506	@node Java Bison Interface
	11507	@subsection Java Bison Interface
	11508	@c - %language "Java"
	11509
	11510	(The current Java interface is experimental and may evolve.
	11511	More user feedback will help to stabilize it.)
	11512
	11513	The Java parser skeletons are selected using the @code{%language "Java"}
	11514	directive or the @option{-L java}/@option{--language=java} option.
	11515
	11516	@c FIXME: Documented bug.
	11517	When generating a Java parser, @code{bison @var{basename}.y} will
	11518	create a single Java source file named @file{@var{basename}.java}
	11519	containing the parser implementation. Using a grammar file without a
	11520	@file{.y} suffix is currently broken. The basename of the parser
	11521	implementation file can be changed by the @code{%file-prefix}
	11522	directive or the @option{-b}/@option{--file-prefix} option. The
	11523	entire parser implementation file name can be changed by the
	11524	@code{%output} directive or the @option{-o}/@option{--output} option.
	11525	The parser implementation file contains a single class for the parser.
	11526
	11527	You can create documentation for generated parsers using Javadoc.
	11528
	11529	Contrary to C parsers, Java parsers do not use global variables; the
	11530	state of the parser is always local to an instance of the parser class.
	11531	Therefore, all Java parsers are ``pure'', and the @code{%pure-parser}
	11532	and @code{%define api.pure} directives do nothing when used in Java.
	11533
	11534	Push parsers are currently unsupported in Java, and @code{%define
	11535	api.push-pull} has no effect.
	11536
	11537	GLR parsers are currently unsupported in Java. Do not use the
	11538	@code{%glr-parser} directive.
	11539
	11540	No header file can be generated for Java parsers. Do not use the
	11541	@code{%defines} directive or the @option{-d}/@option{--defines} options.
	11542
	11543	@c FIXME: Possible code change.
	11544	Currently, support for tracing is always compiled
	11545	in. Thus the @samp{%define parse.trace} and @samp{%token-table}
	11546	directives and the
	11547	@option{-t}/@option{--debug} and @option{-k}/@option{--token-table}
	11548	options have no effect. This may change in the future to eliminate
	11549	unused code in the generated parser, so use @samp{%define parse.trace}
	11550	explicitly
	11551	if needed. Also, in the future the
	11552	@code{%token-table} directive might enable a public interface to
	11553	access the token names and codes.
	11554
	11555	Getting a ``code too large'' error from the Java compiler means the code
	11556	exceeded the 64KB-per-method bytecode limit of the Java class file format.
	11557	Try reducing the amount of code in actions and static initializers;
	11558	otherwise, report a bug so that the parser skeleton will be improved.
	11559
	11560
	11561	@node Java Semantic Values
	11562	@subsection Java Semantic Values
	11563	@c - No %union, specify type in %type/%token.
	11564	@c - YYSTYPE
	11565	@c - Printer and destructor
	11566
	11567	There is no @code{%union} directive in Java parsers. Instead, the
	11568	semantic values' types (class names) should be specified in the
	11569	@code{%type} or @code{%token} directive:
	11570
	11571	@example
	11572	%type <Expression> expr assignment_expr term factor
	11573	%type <Integer> number
	11574	@end example
	11575
	11576	By default, the semantic stack is declared to have @code{Object} members,
	11577	which means that the class types you specify can be of any class.
	11578	To improve the type safety of the parser, you can declare the common
	11579	superclass of all the semantic values using the @samp{%define api.value.type}
	11580	directive. For example, after the following declaration:
	11581
	11582	@example
	11583	%define api.value.type "ASTNode"
	11584	@end example
	11585
	11586	@noindent
	11587	any @code{%type} or @code{%token} specifying a semantic type that
	11588	is not a subclass of @code{ASTNode} will cause a compile-time error.
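
As a rough illustration of what a common superclass buys, the following
stand-alone Java sketch models a semantic stack typed by @code{ASTNode}.
The classes @code{Num}, @code{Sum} and @code{ValueStackDemo} are hypothetical
names chosen for this example; they are not generated by Bison, and the
real parser manages its stack internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Common superclass of all semantic values (as in the %define above). */
abstract class ASTNode {}

/** Hypothetical leaf node holding an integer literal. */
final class Num extends ASTNode {
    final int value;
    Num(int value) { this.value = value; }
}

/** Hypothetical node for the sum of two subtrees. */
final class Sum extends ASTNode {
    final ASTNode left, right;
    Sum(ASTNode left, ASTNode right) { this.left = left; this.right = right; }
}

final class ValueStackDemo {
    // Stand-in for the parser's semantic stack, typed by the common superclass.
    private final List<ASTNode> stack = new ArrayList<>();

    // Any subclass of ASTNode can be stored without a cast.
    void push(ASTNode v) { stack.add(v); }

    // Reading a value back as its declared subtype needs an explicit cast.
    Num numAt(int i) { return (Num) stack.get(i); }

    // Evaluate a tree built from the two node kinds above.
    static int eval(ASTNode n) {
        if (n instanceof Num) return ((Num) n).value;
        Sum s = (Sum) n;
        return eval(s.left) + eval(s.right);
    }
}
```

With this arrangement, a value of a class unrelated to @code{ASTNode}
cannot be pushed at all: the Java compiler rejects it.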
	11589
	11590	@c FIXME: Documented bug.
	11591	Types used in the directives may be qualified with a package name.
	11592	Primitive data types are accepted for Java version 1.5 or later. Note
	11593	that in this case the autoboxing feature of Java 1.5 will be used.
	11594	Generic types may not be used; this is due to a limitation in the
	11595	implementation of Bison, and may change in future releases.
	11596
	11597	Java parsers do not support @code{%destructor}, since the language
	11598	adopts garbage collection. The parser will try to hold references
	11599	to semantic values for as little time as needed.
	11600
	11601	Java parsers do not support @code{%printer}, as @code{toString()}
	11602	can be used to print the semantic values. This however may change
	11603	(in a backwards-compatible way) in future versions of Bison.
	11604
	11605
	11606	@node Java Location Values
	11607	@subsection Java Location Values
	11608	@c - %locations
	11609	@c - class Position
	11610	@c - class Location
	11611
	11612	When the directive @code{%locations} is used, the Java parser supports
	11613	location tracking, see @ref{Tracking Locations}. An auxiliary user-defined
	11614	class defines a @dfn{position}, a single point in a file; Bison itself
	11615	defines a class representing a @dfn{location}, a range composed of a pair of
	11616	positions (possibly spanning several files). The location class is an inner
	11617	class of the parser; the name is @code{Location} by default, and may also be
	11618	renamed using @code{%define api.location.type "@var{class-name}"}.
	11619
	11620	The location class treats the position as a completely opaque value.
	11621	By default, the class name is @code{Position}, but this can be changed
	11622	with @code{%define api.position.type "@var{class-name}"}. This class must
	11623	be supplied by the user.
	11624
	11625
	11626	@deftypeivar {Location} {Position} begin
	11627	@deftypeivarx {Location} {Position} end
	11628	The first, inclusive, position of the range, and the first beyond.
	11629	@end deftypeivar
	11630
	11631	@deftypeop {Constructor} {Location} {} Location (Position @var{loc})
	11632	Create a @code{Location} denoting an empty range located at a given point.
	11633	@end deftypeop
	11634
	11635	@deftypeop {Constructor} {Location} {} Location (Position @var{begin}, Position @var{end})
	11636	Create a @code{Location} from the endpoints of the range.
	11637	@end deftypeop
	11638
	11639	@deftypemethod {Location} {String} toString ()
	11640	Prints the range represented by the location. For this to work
	11641	properly, the position class should override the @code{equals} and
	11642	@code{toString} methods appropriately.
	11643	@end deftypemethod
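
A minimal sketch of a user-supplied position class is shown below. Only
the class name @code{Position} (the default) and the advice to override
@code{equals} and @code{toString} come from the text above. The fields
@code{file}, @code{line} and @code{column}, and the printed format, are
assumptions made for this example.

```java
import java.util.Objects;

/** Hypothetical user-supplied position: a single point in one input file. */
final class Position {
    final String file;   // may be null, e.g. when reading standard input
    final int line;      // 1-based line number
    final int column;    // 1-based column number

    Position(String file, int line, int column) {
        this.file = file;
        this.line = line;
        this.column = column;
    }

    // Two positions are equal when they denote the same point.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Position)) return false;
        Position p = (Position) other;
        return line == p.line && column == p.column
            && Objects.equals(file, p.file);
    }

    // Keep hashCode consistent with equals, as the Object contract requires.
    @Override
    public int hashCode() {
        return Objects.hash(file, line, column);
    }

    // Printed as "file:line.column", or "line.column" without a file name.
    @Override
    public String toString() {
        return (file == null ? "" : file + ":") + line + "." + column;
    }
}
```

Making the class immutable, as here, is a design choice rather than a
requirement: it lets the same position object be shared safely between
several locations.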
	11644
	11645
	11646	@node Java Parser Interface
	11647	@subsection Java Parser Interface
	11648	@c - define parser_class_name
	11649	@c - Ctor
	11650	@c - parse, error, set_debug_level, debug_level, set_debug_stream,
	11651	@c debug_stream.
	11652	@c - Reporting errors
	11653
	11654	The name of the generated parser class defaults to @code{YYParser}. The
	11655	@code{YY} prefix may be changed using the @code{%name-prefix} directive
	11656	or the @option{-p}/@option{--name-prefix} option. Alternatively, use
	11657	@samp{%define parser_class_name "@var{name}"} to give a custom name to
	11658	the class. The interface of this class is detailed below.
	11659
	11660	By default, the parser class has package visibility. A declaration
	11661	@samp{%define public} will change to public visibility. Remember that,
	11662	according to the Java language specification, the name of the @file{.java}
	11663	file should match the name of the class in this case. Similarly, you can
	11664	use @code{abstract}, @code{final} and @code{strictfp} with the
	11665	@code{%define} declaration to add other modifiers to the parser class.
	11666	A single @samp{%define annotations "@var{annotations}"} directive can
	11667	be used to add any number of annotations to the parser class.
	11668
	11669	The Java package name of the parser class can be specified using the
	11670	@samp{%define package} directive. The superclass and the implemented
	11671	interfaces of the parser class can be specified with the @code{%define
	11672	extends} and @samp{%define implements} directives.
	11673
	11674	The parser class defines an inner class, @code{Location}, that is used
	11675	for location tracking (see @ref{Java Location Values}), and an inner
	11676	interface, @code{Lexer} (see @ref{Java Scanner Interface}). Other than
	11677	this inner class and interface, and the members described in the interface
	11678	below, all the other members and fields are preceded with a @code{yy} or
	11679	@code{YY} prefix to avoid clashes with user code.
	11680
	11681	The parser class can be extended using the @code{%parse-param}
	11682	directive. Each occurrence of the directive will add a @code{protected
	11683	final} field to the parser class, and an argument to its constructor,
	11684	which initializes the field automatically.
	11685
	11686	@deftypeop {Constructor} {YYParser} {} YYParser (@var{lex_param}, @dots{}, @var{parse_param}, @dots{})
	11687	Build a new parser object with embedded @code{%code lexer}. There are
	11688	no parameters, unless @code{%param}s and/or @code{%parse-param}s and/or
	11689	@code{%lex-param}s are used.
	11690
	11691	Use @code{%code init} for code added to the start of the constructor
	11692	body. This is especially useful to initialize superclasses. Use
	11693	@samp{%define init_throws} to specify any uncaught exceptions.
	11694	@end deftypeop
	11695
	11696	@deftypeop {Constructor} {YYParser} {} YYParser (Lexer @var{lexer}, @var{parse_param}, @dots{})
	11697	Build a new parser object using the specified scanner. There are no
	11698	additional parameters unless @code{%param}s and/or @code{%parse-param}s are
	11699	used.
	11700
	11701	If the scanner is defined by @code{%code lexer}, this constructor is
	11702	declared @code{protected} and is called automatically with a scanner
	11703	created with the correct @code{%param}s and/or @code{%lex-param}s.
	11704
	11705	Use @code{%code init} for code added to the start of the constructor
	11706	body. This is especially useful to initialize superclasses. Use
	11707	@samp{%define init_throws} to specify any uncaught exceptions.
	11708	@end deftypeop
	11709
	11710	@deftypemethod {YYParser} {boolean} parse ()
	11711	Run the syntactic analysis, and return @code{true} on success,
	11712	@code{false} otherwise.
	11713	@end deftypemethod
	11714
	11715	@deftypemethod {YYParser} {boolean} getErrorVerbose ()
	11716	@deftypemethodx {YYParser} {void} setErrorVerbose (boolean @var{verbose})
	11717	Get or set the option to produce verbose error messages. These are only
	11718	available with @samp{%define parse.error verbose}, which also turns on
	11719	verbose error messages.
	11720	@end deftypemethod
	11721
	11722	@deftypemethod {YYParser} {void} yyerror (String @var{msg})
	11723	@deftypemethodx {YYParser} {void} yyerror (Position @var{pos}, String @var{msg})
	11724	@deftypemethodx {YYParser} {void} yyerror (Location @var{loc}, String @var{msg})
	11725	Print an error message using the @code{yyerror} method of the scanner
	11726	instance in use. The @code{Location} and @code{Position} parameters are
	11727	available only if location tracking is active.
	11728	@end deftypemethod
	11729
	11730	@deftypemethod {YYParser} {boolean} recovering ()
	11731	During the syntactic analysis, return @code{true} if recovering
	11732	from a syntax error.
	11733	@xref{Error Recovery}.
	11734	@end deftypemethod
	11735
	11736	@deftypemethod {YYParser} {java.io.PrintStream} getDebugStream ()
	11737	@deftypemethodx {YYParser} {void} setDebugStream (java.io.PrintStream @var{o})
	11738	Get or set the stream used for tracing the parsing. It defaults to
	11739	@code{System.err}.
	11740	@end deftypemethod
	11741
	11742	@deftypemethod {YYParser} {int} getDebugLevel ()
	11743	@deftypemethodx {YYParser} {void} setDebugLevel (int @var{l})
	11744	Get or set the tracing level. Currently its value is either 0, no trace,
	11745	or nonzero, full tracing.
	11746	@end deftypemethod
	11747
	11748	@deftypecv {Constant} {YYParser} {String} {bisonVersion}
	11749	@deftypecvx {Constant} {YYParser} {String} {bisonSkeleton}
	11750	Identify the Bison version and skeleton used to generate this parser.
	11751	@end deftypecv
	11752
	11753
	11754	@node Java Scanner Interface
	11755	@subsection Java Scanner Interface
	11756	@c - %code lexer
	11757	@c - %lex-param
	11758	@c - Lexer interface
	11759
	11760	There are two possible ways to interface a Bison-generated Java parser
	11761	with a scanner: the scanner may be defined by @code{%code lexer}, or
	11762	defined elsewhere. In either case, the scanner has to implement the
	11763	@code{Lexer} inner interface of the parser class. This interface also
	11764	contains constants for all user-defined token names and the predefined
	11765	@code{EOF} token.
	11766
	11767	In the first case, the body of the scanner class is placed in
	11768	@code{%code lexer} blocks. If you want to pass parameters from the
	11769	parser constructor to the scanner constructor, specify them with
	11770	@code{%lex-param}; they are passed before @code{%parse-param}s to the
	11771	constructor.
	11772
	11773	In the second case, the scanner has to implement the @code{Lexer} interface,
	11774	which is defined within the parser class (e.g., @code{YYParser.Lexer}).
	11775	The constructor of the parser object will then accept an object
	11776	implementing the interface; @code{%lex-param} is not used in this
	11777	case.
	11778
	11779	In both cases, the scanner has to implement the following methods.
	11780
	11781	@deftypemethod {Lexer} {void} yyerror (Location @var{loc}, String @var{msg})
	11782	This method is defined by the user to emit an error message. The first
	11783	parameter is omitted if location tracking is not active. Its type can be
	11784	changed using @code{%define api.location.type "@var{class-name}"}.
	11785	@end deftypemethod
	11786
	11787	@deftypemethod {Lexer} {int} yylex ()
	11788	Return the next token. Its type is the return value; its semantic
	11789	value and location are saved and returned by the corresponding methods
	11790	of the interface.
	11791
	11792	Use @samp{%define lex_throws} to specify any uncaught exceptions.
	11793	Default is @code{java.io.IOException}.
	11794	@end deftypemethod
	11795
	11796	@deftypemethod {Lexer} {Position} getStartPos ()
	11797	@deftypemethodx {Lexer} {Position} getEndPos ()
	11798	Return respectively the first position of the last token that
	11799	@code{yylex} returned, and the first position beyond it. These
	11800	methods are not needed unless location tracking is active.
	11801
	11802	The return type can be changed using @code{%define api.position.type
	11803	"@var{class-name}"}.
	11804	@end deftypemethod
	11805
	11806	@deftypemethod {Lexer} {Object} getLVal ()
	11807	Return the semantic value of the last token that @code{yylex} returned.
	11808
	11809	The return type can be changed using @samp{%define api.value.type
	11810	"@var{class-name}"}.
	11811	@end deftypemethod
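
The second approach, a scanner defined outside the grammar file, can be
sketched as follows. To keep the example self-contained, it declares its own
@code{Lexer} interface as a stand-in for the generated
@code{YYParser.Lexer}. It assumes location tracking is off, and the token
names and codes (@code{NUMBER}, @code{PLUS}) are invented for illustration.
In a real project the interface and its constants come from the generated
parser class.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Stand-in for the generated YYParser.Lexer (no location tracking). */
interface Lexer {
    int EOF = 0;        // token codes below are illustrative assumptions
    int NUMBER = 258;
    int PLUS = 259;

    int yylex() throws IOException;
    Object getLVal();
    void yyerror(String msg);
}

/** A hand-written scanner for inputs such as "12 + 30". */
final class SimpleLexer implements Lexer {
    private final Reader in;
    private int next;        // one character of lookahead, -1 at end of input
    private Object lval;     // semantic value of the last token returned

    SimpleLexer(Reader in) throws IOException {
        this.in = in;
        this.next = in.read();
    }

    @Override
    public int yylex() throws IOException {
        // Skip blanks and newlines preceding the token.
        while (next == ' ' || next == '\t' || next == '\n')
            next = in.read();
        lval = null;
        if (next == -1)
            return EOF;
        if (Character.isDigit(next)) {
            int n = 0;       // overflow is not checked in this sketch
            while (next != -1 && Character.isDigit(next)) {
                n = 10 * n + Character.digit(next, 10);
                next = in.read();
            }
            lval = Integer.valueOf(n);
            return NUMBER;
        }
        int c = next;
        next = in.read();
        if (c == '+')
            return PLUS;
        yyerror("invalid character: " + (char) c);
        return yylex();      // skip the offending character and go on
    }

    @Override
    public Object getLVal() { return lval; }

    @Override
    public void yyerror(String msg) { System.err.println(msg); }
}
```

An instance of such a class would then be handed to the parser constructor
that takes a @code{Lexer} argument.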
	11812
	11813
	11814	@node Java Action Features
	11815	@subsection Special Features for Use in Java Actions
	11816
	11817	The following special constructs can be used in Java actions.
	11818	Other analogous C action features are currently unavailable for Java.
	11819
	11820	Use @samp{%define throws} to specify any uncaught exceptions from parser
	11821	actions, and initial actions specified by @code{%initial-action}.
	11822
	11823	@defvar $@var{n}
	11824	The semantic value for the @var{n}th component of the current rule.
	11825	This may not be assigned to.
	11826	@xref{Java Semantic Values}.
	11827	@end defvar
	11828
	11829	@defvar $<@var{typealt}>@var{n}
	11830	Like @code{$@var{n}} but specifies an alternative type @var{typealt}.
	11831	@xref{Java Semantic Values}.
	11832	@end defvar
	11833
	11834	@defvar $$
	11835	The semantic value for the grouping made by the current rule. As a
	11836	value, this is in the base type (@code{Object} or as specified by
	11837	@samp{%define api.value.type}), as it is not cast to the declared subtype:
	11838	casts are not allowed on the left-hand side of Java assignments.
	11839	Use an explicit Java cast if the correct subtype is needed.
	11840	@xref{Java Semantic Values}.
	11841	@end defvar
	11842
	11843	@defvar $<@var{typealt}>$
	11844	Same as @code{$$} since Java always allows assigning to the base type.
	11845	Perhaps we should use this and @code{$<>$} for the value and @code{$$}
	11846	for setting the value but there is currently no easy way to distinguish
	11847	these constructs.
	11848	@xref{Java Semantic Values}.
	11849	@end defvar
	11850
	11851	@defvar @@@var{n}
	11852	The location information of the @var{n}th component of the current rule.
	11853	This may not be assigned to.
	11854	@xref{Java Location Values}.
	11855	@end defvar
	11856
	11857	@defvar @@$
	11858	The location information of the grouping made by the current rule.
	11859	@xref{Java Location Values}.
	11860	@end defvar
	11861
	11862	@deftypefn {Statement} return YYABORT @code{;}
	11863	Return immediately from the parser, indicating failure.
	11864	@xref{Java Parser Interface}.
	11865	@end deftypefn
	11866
	11867	@deftypefn {Statement} return YYACCEPT @code{;}
	11868	Return immediately from the parser, indicating success.
	11869	@xref{Java Parser Interface}.
	11870	@end deftypefn
	11871
	11872	@deftypefn {Statement} {return} YYERROR @code{;}
	11873	Start error recovery (without printing an error message).
	11874	@xref{Error Recovery}.
	11875	@end deftypefn
	11876
	11877	@deftypefn {Function} {boolean} recovering ()
	11878	Return whether error recovery is being done. In this state, the parser
	11879	reads tokens until it reaches a known state, and then restarts normal
	11880	operation.
	11881	@xref{Error Recovery}.
	11882	@end deftypefn
	11883
	11884	@deftypefn {Function} {void} yyerror (String @var{msg})
	11885	@deftypefnx {Function} {void} yyerror (Position @var{loc}, String @var{msg})
	11886	@deftypefnx {Function} {void} yyerror (Location @var{loc}, String @var{msg})
	11887	Print an error message using the @code{yyerror} method of the scanner
	11888	instance in use. The @code{Location} and @code{Position} parameters are
	11889	available only if location tracking is active.
	11890	@end deftypefn
	11891
	11892
	11893	@node Java Differences
	11894	@subsection Differences between C/C++ and Java Grammars
	11895
	11896	The different structure of the Java language forces several differences
	11897	between C/C++ grammars, and grammars designed for Java parsers. This
	11898	section summarizes these differences.
	11899
	11900	@itemize
	11901	@item
	11902	Java lacks a preprocessor, so the @code{YYERROR}, @code{YYACCEPT},
	11903	@code{YYABORT} symbols (@pxref{Table of Symbols}) obviously cannot be
	11904	macros. Instead, they should be preceded by @code{return} when they
	11905	appear in an action. The actual definition of these symbols is
	11906	opaque to the Bison grammar, and it might change in the future. The
	11907	only meaningful operation that you can do is to return them.
	11908	@xref{Java Action Features}.
	11909
	11910	Note that of these three symbols, only @code{YYACCEPT} and
	11911	@code{YYABORT} will cause a return from the @code{yyparse}
	11912	method@footnote{Java parsers include the actions in a method separate
	11913	from @code{yyparse} in order to have an intuitive syntax that
	11914	corresponds to these C macros.}.
	11915
	11916	@item
	11917	Java lacks unions, so @code{%union} has no effect. Instead, semantic
	11918	values have a common base type: @code{Object} or as specified by
	11919	@samp{%define api.value.type}. Angle brackets on @code{%token}, @code{%type},
	11920	@code{$@var{n}} and @code{$$} specify subtypes rather than fields of
	11921	a union. The type of @code{$$}, even with angle brackets, is the base
	11922	type since Java casts are not allowed on the left-hand side of assignments.
	11923	Also, @code{$@var{n}} and @code{@@@var{n}} are not allowed on the
	11924	left-hand side of assignments. @xref{Java Semantic Values}, and
	11925	@ref{Java Action Features}.
	11926
	11927	@item
	11928	The prologue declarations have a different meaning than in C/C++ code.
	11929	@table @asis
	11930	@item @code{%code imports}
	11931	blocks are placed at the beginning of the Java source code. They may
	11932	include copyright notices. For @code{package} declarations, it is
	11933	suggested to use @samp{%define package} instead.
	11934
	11935	@item unqualified @code{%code}
	11936	blocks are placed inside the parser class.
	11937
	11938	@item @code{%code lexer}
	11939	blocks, if specified, should include the implementation of the
	11940	scanner. If there is no such block, the scanner can be any class
	11941	that implements the appropriate interface (@pxref{Java Scanner
	11942	Interface}).
	11943	@end table
	11944
	11945	Other @code{%code} blocks are not supported in Java parsers.
	11946	In particular, @code{%@{ @dots{} %@}} blocks should not be used
	11947	and may give an error in future versions of Bison.
	11948
	11949	The epilogue has the same meaning as in C/C++ code and it can
	11950	be used to define other classes used by the parser @emph{outside}
	11951	the parser class.
	11952	@end itemize
	11953
	11954
	11955	@node Java Declarations Summary
	11956	@subsection Java Declarations Summary
	11957
	11958	This summary only includes declarations that are specific to Java or
	11959	that have a special meaning when used in a Java parser.
	11960
	11961	@deffn {Directive} {%language "Java"}
	11962	Generate a Java class for the parser.
	11963	@end deffn
	11964
	11965	@deffn {Directive} %lex-param @{@var{type} @var{name}@}
	11966	A parameter for the lexer class defined by @code{%code lexer}
	11967	@emph{only}, added as parameters to the lexer constructor and the parser
	11968	constructor that @emph{creates} a lexer. Default is none.
	11969	@xref{Java Scanner Interface}.
	11970	@end deffn
	11971
	11972	@deffn {Directive} %name-prefix "@var{prefix}"
	11973	The prefix of the parser class name @code{@var{prefix}Parser} if
	11974	@samp{%define parser_class_name} is not used. Default is @code{YY}.
	11975	@xref{Java Bison Interface}.
	11976	@end deffn
	11977
	11978	@deffn {Directive} %parse-param @{@var{type} @var{name}@}
	11979	A parameter for the parser class added as parameters to constructor(s)
	11980	and as fields initialized by the constructor(s). Default is none.
	11981	@xref{Java Parser Interface}.
	11982	@end deffn
	11983
	11984	@deffn {Directive} %token <@var{type}> @var{token} @dots{}
	11985	Declare tokens. Note that the angle brackets enclose a Java @emph{type}.
	11986	@xref{Java Semantic Values}.
	11987	@end deffn
	11988
	11989	@deffn {Directive} %type <@var{type}> @var{nonterminal} @dots{}
	11990	Declare the type of nonterminals. Note that the angle brackets enclose
	11991	a Java @emph{type}.
	11992	@xref{Java Semantic Values}.
	11993	@end deffn
	11994
	11995	@deffn {Directive} %code @{ @var{code} @dots{} @}
	11996	Code appended to the inside of the parser class.
	11997	@xref{Java Differences}.
	11998	@end deffn
	11999
	12000	@deffn {Directive} {%code imports} @{ @var{code} @dots{} @}
	12001	Code inserted just after the @code{package} declaration.
	12002	@xref{Java Differences}.
	12003	@end deffn
	12004
	12005	@deffn {Directive} {%code init} @{ @var{code} @dots{} @}
	12006	Code inserted at the beginning of the parser constructor body.
	12007	@xref{Java Parser Interface}.
	12008	@end deffn
	12009
	12010	@deffn {Directive} {%code lexer} @{ @var{code} @dots{} @}
	12011	Code added to the body of an inner lexer class within the parser class.
	12012	@xref{Java Scanner Interface}.
	12013	@end deffn
	12014
	12015	@deffn {Directive} %% @var{code} @dots{}
	12016	Code (after the second @code{%%}) appended to the end of the file,
	12017	@emph{outside} the parser class.
	12018	@xref{Java Differences}.
	12019	@end deffn
	12020
	12021	@deffn {Directive} %@{ @var{code} @dots{} %@}
	12022	Not supported. Use @code{%code imports} instead.
	12023	@xref{Java Differences}.
	12024	@end deffn
	12025
	12026	@deffn {Directive} {%define abstract}
	12027	Whether the parser class is declared @code{abstract}. Default is false.
	12028	@xref{Java Bison Interface}.
	12029	@end deffn
	12030
	12031	@deffn {Directive} {%define annotations} "@var{annotations}"
	12032	The Java annotations for the parser class. Default is none.
	12033	@xref{Java Bison Interface}.
	12034	@end deffn
	12035
	12036	@deffn {Directive} {%define extends} "@var{superclass}"
	12037	The superclass of the parser class. Default is none.
	12038	@xref{Java Bison Interface}.
	12039	@end deffn
	12040
	12041	@deffn {Directive} {%define final}
	12042	Whether the parser class is declared @code{final}. Default is false.
	12043	@xref{Java Bison Interface}.
	12044	@end deffn
	12045
	12046	@deffn {Directive} {%define implements} "@var{interfaces}"
	12047	The implemented interfaces of the parser class, a comma-separated list.
	12048	Default is none.
	12049	@xref{Java Bison Interface}.
	12050	@end deffn
	12051
	12052	@deffn {Directive} {%define init_throws} "@var{exceptions}"
	12053	The exceptions thrown by @code{%code init} from the parser class
	12054	constructor. Default is none.
	12055	@xref{Java Parser Interface}.
	12056	@end deffn
	12057
	12058	@deffn {Directive} {%define lex_throws} "@var{exceptions}"
	12059	The exceptions thrown by the @code{yylex} method of the lexer, a
	12060	comma-separated list. Default is @code{java.io.IOException}.
	12061	@xref{Java Scanner Interface}.
	12062	@end deffn
	12063
	12064	@deffn {Directive} {%define api.location.type} "@var{class}"
	12065	The name of the class used for locations (a range between two
	12066	positions). This class is generated as an inner class of the parser
	12067	class by @command{bison}. Default is @code{Location}.
	12068	Formerly named @code{location_type}.
	12069	@xref{Java Location Values}.
	12070	@end deffn
	12071
	12072	@deffn {Directive} {%define package} "@var{package}"
	12073	The package to put the parser class in. Default is none.
	12074	@xref{Java Bison Interface}.
	12075	@end deffn
	12076
	12077	@deffn {Directive} {%define parser_class_name} "@var{name}"
	12078	The name of the parser class. Default is @code{YYParser} or
	12079	@code{@var{name-prefix}Parser}.
	12080	@xref{Java Bison Interface}.
	12081	@end deffn
	12082
	12083	@deffn {Directive} {%define api.position.type} "@var{class}"
	12084	The name of the class used for positions. This class must be supplied by
	12085	the user. Default is @code{Position}.
	12086	Formerly named @code{position_type}.
	12087	@xref{Java Location Values}.
	12088	@end deffn
	12089
	12090	@deffn {Directive} {%define public}
	12091	Whether the parser class is declared @code{public}. Default is false.
	12092	@xref{Java Bison Interface}.
	12093	@end deffn
	12094
	12095	@deffn {Directive} {%define api.value.type} "@var{class}"
	12096	The base type of semantic values. Default is @code{Object}.
	12097	@xref{Java Semantic Values}.
	12098	@end deffn
	12099
	12100	@deffn {Directive} {%define strictfp}
	12101	Whether the parser class is declared @code{strictfp}. Default is false.
	12102	@xref{Java Bison Interface}.
	12103	@end deffn
	12104
	12105	@deffn {Directive} {%define throws} "@var{exceptions}"
	12106	The exceptions thrown by user-supplied parser actions and
	12107	@code{%initial-action}, a comma-separated list. Default is none.
	12108	@xref{Java Parser Interface}.
	12109	@end deffn
	12110
	12111
	12112	@c ================================================= FAQ
	12113
	12114	@node FAQ
	12115	@chapter Frequently Asked Questions
	12116	@cindex frequently asked questions
	12117	@cindex questions
	12118
	12119	Several questions about Bison come up occasionally. Here some of them
	12120	are addressed.
	12121
	12122	@menu
	12123	* Memory Exhausted:: Breaking the Stack Limits
	12124	* How Can I Reset the Parser:: @code{yyparse} Keeps some State
	12125	* Strings are Destroyed:: @code{yylval} Loses Track of Strings
	12126	* Implementing Gotos/Loops:: Control Flow in the Calculator
	12127	* Multiple start-symbols:: Factoring closely related grammars
	12128	* Secure? Conform?:: Is Bison POSIX safe?
	12129	* I can't build Bison:: Troubleshooting
	12130	* Where can I find help?:: Troubleshouting
	12131	* Bug Reports:: Troublereporting
	12132	* More Languages:: Parsers in C++, Java, and so on
	12133	* Beta Testing:: Experimenting with development versions
	12134	* Mailing Lists:: Meeting other Bison users
	12135	@end menu
	12136
	12137	@node Memory Exhausted
	12138	@section Memory Exhausted
	12139
	12140	@quotation
	12141	My parser returns with error with a @samp{memory exhausted}
	12142	message. What can I do?
	12143	@end quotation
	12144
	12145	This question is already addressed elsewhere; see @ref{Recursion, ,Recursive
	12146	Rules}.
	12147
	12148	@node How Can I Reset the Parser
	12149	@section How Can I Reset the Parser
	12150
	12151	The following phenomenon has several symptoms, resulting in the
	12152	following typical questions:
	12153
	12154	@quotation
	12155	I invoke @code{yyparse} several times, and on correct input it works
	12156	properly; but when a parse error is found, all the other calls fail
	12157	too. How can I reset the error flag of @code{yyparse}?
	12158	@end quotation
	12159
	12160	@noindent
	12161	or
	12162
	12163	@quotation
	12164	My parser includes support for an @samp{#include}-like feature, in
	12165	which case I run @code{yyparse} from @code{yyparse}. This fails
	12166	although I did specify @samp{%define api.pure full}.
	12167	@end quotation
	12168
	12169	These problems typically come not from Bison itself, but from
	12170	Lex-generated scanners. Because these scanners use large buffers for
	12171	speed, they might not notice a change of input file. As a
	12172	demonstration, consider the following source file,
	12173	@file{first-line.l}:
	12174
	12175	@example
	12176	@group
	12177	%@{
	12178	#include <stdio.h>
	12179	#include <stdlib.h>
	12180	%@}
	12181	@end group
	12182	%%
	12183	.*\n ECHO; return 1;
	12184	%%
	12185	@group
	12186	int
	12187	yyparse (char const *file)
	12188	@{
	12189	  yyin = fopen (file, "r");
	12190	  if (!yyin)
	12191	    @{
	12192	      perror ("fopen");
	12193	      exit (EXIT_FAILURE);
	12194	    @}
	12195	@end group
	12196	@group
	12197	  /* One token only. */
	12198	  yylex ();
	12199	  if (fclose (yyin) != 0)
	12200	    @{
	12201	      perror ("fclose");
	12202	      exit (EXIT_FAILURE);
	12203	    @}
	12204	  return 0;
	12205	@}
	12206	@end group
	12207
	12208	@group
	12209	int
	12210	main (void)
	12211	@{
	12212	  yyparse ("input");
	12213	  yyparse ("input");
	12214	  return 0;
	12215	@}
	12216	@end group
	12217	@end example
	12218
	12219	@noindent
	12220	If the file @file{input} contains
	12221
	12222	@example
	12223	input:1: Hello,
	12224	input:2: World!
	12225	@end example
	12226
	12227	@noindent
	12228	then instead of getting the first line twice, you get:
	12229
	12230	@example
	12231	$ @kbd{flex -ofirst-line.c first-line.l}
	12232	$ @kbd{gcc -ofirst-line first-line.c -ll}
	12233	$ @kbd{./first-line}
	12234	input:1: Hello,
	12235	input:2: World!
	12236	@end example
	12237
	12238	Therefore, whenever you change @code{yyin}, you must tell the
	12239	Lex-generated scanner to discard its current buffer and switch to the
	12240	new one. This depends upon your implementation of Lex; see its
	12241	documentation for more. For Flex, it suffices to call
	12242	@samp{YY_FLUSH_BUFFER} after each change to @code{yyin}. If your
	12243	Flex-generated scanner needs to read from several input streams to
	12244	handle features like include files, you might consider using Flex
	12245	functions like @samp{yy_switch_to_buffer} that manipulate multiple
	12246	input buffers.
	12247
	12248	If your Flex-generated scanner uses start conditions (@pxref{Start
	12249	conditions, , Start conditions, flex, The Flex Manual}), you might
	12250	also want to reset the scanner's state, i.e., go back to the initial
	12251	start condition, through a call to @samp{BEGIN (0)}.
	12252
	12253	@node Strings are Destroyed
	12254	@section Strings are Destroyed
	12255
	12256	@quotation
	12257	My parser seems to destroy old strings, or maybe it loses track of
	12258	them. Instead of reporting @samp{"foo", "bar"}, it reports
	12259	@samp{"bar", "bar"}, or even @samp{"foo\nbar", "bar"}.
	12260	@end quotation
	12261
	12262	This error is probably the single most frequent ``bug report'' sent to
	12263	Bison lists, but it merely stems from a misunderstanding of the role
	12264	of the scanner. Consider the following Lex code:
	12265
	12266	@example
	12267	@group
	12268	%@{
	12269	#include <stdio.h>
	12270	char *yylval = NULL;
	12271	%@}
	12272	@end group
	12273	@group
	12274	%%
	12275	.* yylval = yytext; return 1;
	12276	\n /* IGNORE */
	12277	%%
	12278	@end group
	12279	@group
	12280	int
	12281	main ()
	12282	@{
	12283	  /* Similar to using $1, $2 in a Bison action. */
	12284	  char *fst = (yylex (), yylval);
	12285	  char *snd = (yylex (), yylval);
	12286	  printf ("\"%s\", \"%s\"\n", fst, snd);
	12287	  return 0;
	12288	@}
	12289	@end group
	12290	@end example
	12291
	12292	If you compile and run this code, you get:
	12293
	12294	@example
	12295	$ @kbd{flex -osplit-lines.c split-lines.l}
	12296	$ @kbd{gcc -osplit-lines split-lines.c -ll}
	12297	$ @kbd{printf 'one\ntwo\n' | ./split-lines}
	12298	"one
	12299	two", "two"
	12300	@end example
	12301
	12302	@noindent
	12303	This is because @code{yytext} is a buffer provided for @emph{reading}
	12304	in the action, but if you want to keep it, you have to duplicate it
	12305	(e.g., using @code{strdup}). Note that the output may depend on how
	12306	your implementation of Lex handles @code{yytext}. For instance, when
	12307	given the Lex compatibility option @option{-l} (which triggers the
	12308	option @samp{%array}) Flex generates a different behavior:
	12309
	12310	@example
	12311	$ @kbd{flex -l -osplit-lines.c split-lines.l}
	12312	$ @kbd{gcc -osplit-lines split-lines.c -ll}
	12313	$ @kbd{printf 'one\ntwo\n' | ./split-lines}
	12314	"two", "two"
	12315	@end example
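
The aliasing can be reproduced without Lex. The following is a minimal
sketch, not part of any Lex or Bison interface: @code{next_token},
@code{copy_string} and @code{read_two_tokens} are hypothetical names, and
@code{next_token} merely imitates a scanner that reuses a single buffer the
way @code{yytext} is reused. @code{copy_string} stands in for
@code{strdup}, which is not available in every C environment.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a scanner: like yytext, the returned
   pointer refers to storage that the next call overwrites.  */
static char token_buffer[64];

char *
next_token (const char *text)
{
  strncpy (token_buffer, text, sizeof token_buffer - 1);
  token_buffer[sizeof token_buffer - 1] = '\0';
  return token_buffer;
}

/* Equivalent of strdup: return a freshly allocated copy of STR, or
   NULL if memory is exhausted.  */
char *
copy_string (const char *str)
{
  size_t size = strlen (str) + 1;
  char *copy = malloc (size);
  if (copy)
    memcpy (copy, str, size);
  return copy;
}

/* Read the tokens "foo" and "bar", keeping either the scanner's own
   pointer (COPY == 0) or a duplicate (COPY != 0), and format both
   into OUT.  */
void
read_two_tokens (int copy, char *out, size_t out_size)
{
  char *fst = next_token ("foo");
  if (copy)
    fst = copy_string (fst);
  char *snd = next_token ("bar");
  if (copy)
    snd = copy_string (snd);
  snprintf (out, out_size, "\"%s\", \"%s\"",
            fst ? fst : "", snd ? snd : "");
  if (copy)
    {
      free (fst);
      free (snd);
    }
}
```

Without copying, both pointers refer to the same buffer, so the first
token appears to have been replaced by the second. With copying, each
token keeps its own storage, which the caller must eventually free.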
	12316
	12317
	12318	@node Implementing Gotos/Loops
	12319	@section Implementing Gotos/Loops
	12320
	12321	@quotation
	12322	My simple calculator supports variables, assignments, and functions,
	12323	but how can I implement gotos, or loops?
	12324	@end quotation
	12325
	12326	Although very pedagogical, the examples included in the document blur
	12327	the distinction between the parser---whose job is to recover
	12328	the structure of a text and to transmit it to subsequent modules of
	12329	the program---and the processing (such as the execution) of this
	12330	structure. This works well with so-called straight-line programs,
	12331	i.e., precisely those that have a straightforward execution model:
	12332	execute simple instructions one after another.
	12333
	12334	@cindex abstract syntax tree
	12335	@cindex AST
	12336	If you want a richer model, you will probably need to use the parser
	12337	to construct a tree that does represent the structure it has
	12338	recovered; this tree is usually called the @dfn{abstract syntax tree},
	12339	or @dfn{AST} for short. Then, walking through this tree,
	12340	traversing it in various ways, will enable treatments such as its
	12341	execution or its translation, which will result in an interpreter or a
	12342	compiler.
	12343
	12344	This topic is way beyond the scope of this manual, and the reader is
	12345	invited to consult the dedicated literature.
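
To give an idea of the approach, here is a minimal sketch of a tree and of
an evaluator that walks it. Everything in it is an assumption made for
illustration: the node kinds, @code{struct node}, @code{make_node},
@code{eval} and @code{free_tree} are hypothetical names, not something
Bison provides. In a grammar, the actions would build the tree (for
instance with a call to @code{make_node} assigned to @code{$$}) instead of
computing values directly.

```c
#include <stdlib.h>

/* Hypothetical node kinds for a tiny language.  */
enum node_kind { NUM, VAR, ADD, ASSIGN, SEQ, WHILE };

struct node
{
  enum node_kind kind;
  int value;                  /* NUM: constant; VAR, ASSIGN: variable index.  */
  struct node *left, *right;  /* Children; unused ones are NULL.  */
};

/* Allocate and initialize a node.  */
struct node *
make_node (enum node_kind kind, int value,
           struct node *left, struct node *right)
{
  struct node *n = malloc (sizeof *n);
  if (!n)
    abort ();
  n->kind = kind;
  n->value = value;
  n->left = left;
  n->right = right;
  return n;
}

/* Execute the tree N, reading and writing variables in VARS.  */
int
eval (const struct node *n, int *vars)
{
  switch (n->kind)
    {
    case NUM:
      return n->value;
    case VAR:
      return vars[n->value];
    case ADD:
      return eval (n->left, vars) + eval (n->right, vars);
    case ASSIGN:
      return vars[n->value] = eval (n->left, vars);
    case SEQ:
      eval (n->left, vars);
      return eval (n->right, vars);
    case WHILE:
      /* The condition subtree is evaluated again on every iteration,
         which is impossible if values are computed while parsing.  */
      while (eval (n->left, vars))
        eval (n->right, vars);
      return 0;
    }
  return 0;
}

/* Release the tree rooted at N.  */
void
free_tree (struct node *n)
{
  if (!n)
    return;
  free_tree (n->left);
  free_tree (n->right);
  free (n);
}
```

Because the loop body exists as a data structure, it can be executed as
many times as the condition requires; a goto can be handled similarly once
its target is a node that the evaluator can reach.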
	12346
	12347
	12348	@node Multiple start-symbols
	12349	@section Multiple start-symbols
	12350
	12351	@quotation
	12352	I have several closely related grammars, and I would like to share their
	12353	implementations. In fact, I could use a single grammar but with
	12354	multiple entry points.
	12355	@end quotation
	12356
	12357	Bison does not support multiple start-symbols, but there is a very
	12358	simple means to simulate them. If @code{foo} and @code{bar} are the two
	12359	pseudo start-symbols, then introduce two new tokens, say
	12360	@code{START_FOO} and @code{START_BAR}, and use them as switches from the
	12361	real start-symbol:
	12362
	12363	@example
	12364	%token START_FOO START_BAR;
	12365	%start start;
	12366	start:
	12367	  START_FOO foo
	12368	| START_BAR bar;
	12369	@end example
	12370
	12371	These tokens prevent the introduction of new conflicts. As far as the
	12372	parser goes, that is all that is needed.
	12373
	12374	Now the difficult part is ensuring that the scanner will send these
	12375	tokens first. If your scanner is hand-written, that should be
	12376	straightforward. If your scanner is generated by Lex, then there is a
	12377	simple means to do it: recall that anything between @samp{%@{ ... %@}}
	12378	after the first @code{%%} is copied verbatim at the top of the generated
	12379	@code{yylex} function. Make sure a variable @code{start_token} is
	12380	available in the scanner (e.g., a global variable or using
	12381	@code{%lex-param} etc.), and use the following:
	12382
	12383	@example
	12384	/* @r{Prologue.} */
	12385	%%
	12386	%@{
	12387	  if (start_token)
	12388	    @{
	12389	      int t = start_token;
	12390	      start_token = 0;
	12391	      return t;
	12392	    @}
	12393	%@}
	12394	/* @r{The rules.} */
	12395	@end example
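
For a hand-written scanner, the same idea might look like the following
sketch. The token codes, the @code{input} variable and the trivial
tokenization (every run of non-blank characters is a @code{WORD}) are
assumptions made for illustration; in a real program the token codes come
from the header generated by Bison.

```c
/* Hypothetical token codes; normally defined by the Bison-generated
   header.  */
enum { START_FOO = 258, START_BAR = 259, WORD = 260 };

/* Pseudo-token to deliver first; reset to zero once delivered.  */
int start_token = 0;

/* Remaining input, set by the caller before parsing.  */
const char *input = "";

/* Return START_TOKEN on the first call if it is set, then WORD for
   each run of non-blank characters, and 0 at end of input.  */
int
yylex (void)
{
  if (start_token)
    {
      int t = start_token;
      start_token = 0;
      return t;
    }
  while (*input == ' ')
    input++;
  if (*input == '\0')
    return 0;
  while (*input != '\0' && *input != ' ')
    input++;
  return WORD;
}
```

The caller selects the entry point by assigning @code{START_FOO} or
@code{START_BAR} to @code{start_token} before calling @code{yyparse}.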
	12396
	12397
	12398	@node Secure? Conform?
	12399	@section Secure? Conform?
	12400
	12401	@quotation
	12402	Is Bison secure? Does it conform to POSIX?
	12403	@end quotation
	12404
	12405	If you're looking for a guarantee or certification, we don't provide it.
	12406	However, Bison is intended to be a reliable program that conforms to the
	12407	POSIX specification for Yacc. If you run into problems,
	12408	please send us a bug report.
	12409
	12410	@node I can't build Bison
	12411	@section I can't build Bison
	12412
	12413	@quotation
	12414	I can't build Bison because @command{make} complains that
	12415	@code{msgfmt} is not found.
	12416	What should I do?
	12417	@end quotation
	12418
	12419	As in most GNU packages with internationalization support, that feature
	12420	is turned on by default. If you have problems building in the @file{po}
	12421	subdirectory, it indicates that your system's internationalization
	12422	support is lacking. You can re-configure Bison with
	12423	@option{--disable-nls} to turn off this support, or you can install GNU
	12424	gettext from @url{ftp://ftp.gnu.org/gnu/gettext/} and re-configure
	12425	Bison. See the file @file{ABOUT-NLS} for more information.
	12426
	12427
	12428	@node Where can I find help?
	12429	@section Where can I find help?
	12430
	12431	@quotation
	12432	I'm having trouble using Bison. Where can I find help?
	12433	@end quotation
	12434
	12435	First, read this fine manual. Beyond that, you can send mail to
	12436	@email{help-bison@@gnu.org}. This mailing list is intended to be
	12437	populated with people who are willing to answer questions about using
	12438	and installing Bison. Please keep in mind that (most of) the people on
	12439	the list have aspects of their lives which are not related to Bison (!),
	12440	so you may not receive an answer to your question right away. This can
	12441	be frustrating, but please try not to honk them off; remember that any
	12442	help they provide is purely voluntary and out of the kindness of their
	12443	hearts.
	12444
	12445	@node Bug Reports
	12446	@section Bug Reports
	12447
	12448	@quotation
	12449	I found a bug. What should I include in the bug report?
	12450	@end quotation
	12451
	12452	Before you send a bug report, make sure you are using the latest
	12453	version. Check @url{ftp://ftp.gnu.org/pub/gnu/bison/} or one of its
	12454	mirrors. Be sure to include the version number in your bug report. If
	12455	the bug is present in the latest version but not in a previous version,
	12456	try to determine the most recent version which did not contain the bug.
	12457
	12458	If the bug is parser-related, you should include the smallest grammar
	12459	you can which demonstrates the bug. The grammar file should also be
	12460	complete (i.e., I should be able to run it through Bison without having
	12461	to edit or add anything). The smaller and simpler the grammar, the
	12462	easier it will be to fix the bug.
	12463
	12464	Include information about your compilation environment, including your
	12465	operating system's name and version and your compiler's name and
	12466	version. If you have trouble compiling, you should also include a
	12467	transcript of the build session, starting with the invocation of
	12468	@command{configure}. Depending on the nature of the bug, you may be asked to
	12469	send additional files as well (such as @file{config.h} or @file{config.cache}).
	12470
	12471	Patches are most welcome, but not required. That is, do not hesitate to
	12472	send a bug report just because you cannot provide a fix.
	12473
	12474	Send bug reports to @email{bug-bison@@gnu.org}.
	12475
	12476	@node More Languages
	12477	@section More Languages
	12478
	12479	@quotation
	12480	Will Bison ever have C++ and Java support? How about @var{insert your
	12481	favorite language here}?
	12482	@end quotation
	12483
	12484	C++ and Java support is there now, and is documented. We'd love to add other
	12485	languages; contributions are welcome.
	12486
	12487	@node Beta Testing
	12488	@section Beta Testing
	12489
	12490	@quotation
	12491	What is involved in being a beta tester?
	12492	@end quotation
	12493
	12494	It's not terribly involved. Basically, you would download a test
	12495	release, compile it, and use it to build and run a parser or two. After
	12496	that, you would submit either a bug report or a message saying that
	12497	everything is okay. It is important to report successes as well as
	12498	failures because test releases eventually become mainstream releases,
	12499	but only if they are adequately tested. If no one tests, development is
	12500	essentially halted.
	12501
	12502	Beta testers are particularly needed for operating systems to which the
	12503	developers do not have easy access. They currently have easy access to
	12504	recent GNU/Linux and Solaris versions. Reports about other operating
	12505	systems are especially welcome.
	12506
	12507	@node Mailing Lists
	12508	@section Mailing Lists
	12509
	12510	@quotation
	12511	How do I join the help-bison and bug-bison mailing lists?
	12512	@end quotation
	12513
	12514	See @url{http://lists.gnu.org/}.
	12515
	12516	@c ================================================= Table of Symbols
	12517
	12518	@node Table of Symbols
	12519	@appendix Bison Symbols
	12520	@cindex Bison symbols, table of
	12521	@cindex symbols in Bison, table of
	12522
	12523	@deffn {Variable} @@$
	12524	In an action, the location of the left-hand side of the rule.
	12525	@xref{Tracking Locations}.
	12526	@end deffn
	12527
	12528	@deffn {Variable} @@@var{n}
	12529	@deffnx {Symbol} @@@var{n}
	12530	In an action, the location of the @var{n}-th symbol of the right-hand side
	12531	of the rule. @xref{Tracking Locations}.
	12532
	12533	In a grammar, the Bison-generated nonterminal symbol for a mid-rule action
	12534	with a semantic value. @xref{Mid-Rule Action Translation}.
	12535	@end deffn
	12536
	12537	@deffn {Variable} @@@var{name}
	12538	@deffnx {Variable} @@[@var{name}]
	12539	In an action, the location of a symbol addressed by @var{name}.
	12540	@xref{Tracking Locations}.
	12541	@end deffn
	12542
	12543	@deffn {Symbol} $@@@var{n}
	12544	In a grammar, the Bison-generated nonterminal symbol for a mid-rule action
	12545	with no semantic value. @xref{Mid-Rule Action Translation}.
	12546	@end deffn
	12547
	12548	@deffn {Variable} $$
	12549	In an action, the semantic value of the left-hand side of the rule.
	12550	@xref{Actions}.
	12551	@end deffn
	12552
	12553	@deffn {Variable} $@var{n}
	12554	In an action, the semantic value of the @var{n}-th symbol of the
	12555	right-hand side of the rule. @xref{Actions}.
	12556	@end deffn
	12557
	12558	@deffn {Variable} $@var{name}
	12559	@deffnx {Variable} $[@var{name}]
	12560	In an action, the semantic value of a symbol addressed by @var{name}.
	12561	@xref{Actions}.
	12562	@end deffn
	12563
	12564	@deffn {Delimiter} %%
	12565	Delimiter used to separate the grammar rule section from the
	12566	Bison declarations section or the epilogue.
	12567	@xref{Grammar Layout, ,The Overall Layout of a Bison Grammar}.
	12568	@end deffn
	12569
	12570	@c Don't insert spaces, or check the DVI output.
	12571	@deffn {Delimiter} %@{@var{code}%@}
	12572	All code listed between @samp{%@{} and @samp{%@}} is copied verbatim
	12573	to the parser implementation file. Such code forms the prologue of
	12574	the grammar file. @xref{Grammar Outline, ,Outline of a Bison
	12575	Grammar}.
	12576	@end deffn
	12577
	12578	@deffn {Directive} %?@{@var{expression}@}
	12579	Predicate actions. This is a type of action clause that may appear in
	12580	rules. The expression is evaluated, and if false, causes a syntax error. In
	12581	GLR parsers during nondeterministic operation,
	12582	this silently causes an alternative parse to die. During deterministic
	12583	operation, it is the same as the effect of @code{YYERROR}.
	12584	@xref{Semantic Predicates}.
	12585
	12586	This feature is experimental.
	12587	More user feedback will help to determine whether it should become a permanent
	12588	feature.
	12589	@end deffn
	12590
	12591	@deffn {Construct} /* @dots{} */
	12592	@deffnx {Construct} // @dots{}
	12593	Comments, as in C/C++.
	12594	@end deffn
	12595
	12596	@deffn {Delimiter} :
	12597	Separates a rule's result from its components. @xref{Rules, ,Syntax of
	12598	Grammar Rules}.
	12599	@end deffn
	12600
	12601	@deffn {Delimiter} ;
	12602	Terminates a rule. @xref{Rules, ,Syntax of Grammar Rules}.
	12603	@end deffn
	12604
	12605	@deffn {Delimiter} |
	12606	Separates alternate rules for the same result nonterminal.
	12607	@xref{Rules, ,Syntax of Grammar Rules}.
	12608	@end deffn
	12609
	12610	@deffn {Directive} <*>
	12611	Used to define a default tagged @code{%destructor} or default tagged
	12612	@code{%printer}.
	12613
	12614	This feature is experimental.
	12615	More user feedback will help to determine whether it should become a permanent
	12616	feature.
	12617
	12618	@xref{Destructor Decl, , Freeing Discarded Symbols}.
	12619	@end deffn
	12620
	12621	@deffn {Directive} <>
	12622	Used to define a default tagless @code{%destructor} or default tagless
	12623	@code{%printer}.
	12624
	12625	This feature is experimental.
	12626	More user feedback will help to determine whether it should become a permanent
	12627	feature.
	12628
	12629	@xref{Destructor Decl, , Freeing Discarded Symbols}.
	12630	@end deffn
	12631
	12632	@deffn {Symbol} $accept
	12633	The predefined nonterminal whose only rule is @samp{$accept: @var{start}
	12634	$end}, where @var{start} is the start symbol. @xref{Start Decl, , The
	12635	Start-Symbol}. It cannot be used in the grammar.
	12636	@end deffn
	12637
	12638	@deffn {Directive} %code @{@var{code}@}
	12639	@deffnx {Directive} %code @var{qualifier} @{@var{code}@}
	12640	Insert @var{code} verbatim into the output parser source at the
	12641	default location or at the location specified by @var{qualifier}.
	12642	@xref{%code Summary}.
	12643	@end deffn
	12644
	12645	@deffn {Directive} %debug
	12646	Equip the parser for debugging. @xref{Decl Summary}.
	12647	@end deffn
	12648
	12649	@ifset defaultprec
	12650	@deffn {Directive} %default-prec
	12651	Assign a precedence to rules that lack an explicit @samp{%prec}
	12652	modifier. @xref{Contextual Precedence, ,Context-Dependent
	12653	Precedence}.
	12654	@end deffn
	12655	@end ifset
	12656
	12657	@deffn {Directive} %define @var{variable}
	12658	@deffnx {Directive} %define @var{variable} @var{value}
	12659	@deffnx {Directive} %define @var{variable} "@var{value}"
	12660	Define a variable to adjust Bison's behavior. @xref{%define Summary}.
	12661	@end deffn
	12662
	12663	@deffn {Directive} %defines
	12664	Bison declaration to create a parser header file, which is usually
	12665	meant for the scanner. @xref{Decl Summary}.
	12666	@end deffn
	12667
	12668	@deffn {Directive} %defines @var{defines-file}
	12669	Same as above, but save in the file @var{defines-file}.
	12670	@xref{Decl Summary}.
	12671	@end deffn
	12672
	12673	@deffn {Directive} %destructor
	12674	Specify how the parser should reclaim the memory associated with
	12675	discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
	12676	@end deffn
	12677
	12678	@deffn {Directive} %dprec
	12679	Bison declaration to assign a precedence to a rule that is used at parse
	12680	time to resolve reduce/reduce conflicts. @xref{GLR Parsers, ,Writing
	12681	GLR Parsers}.
	12682	@end deffn
	12683
	12684	@deffn {Directive} %empty
	12685	Bison declaration to make explicit that a rule has an empty
	12686	right-hand side. @xref{Empty Rules}.
	12687	@end deffn
	12688
	12689	@deffn {Symbol} $end
	12690	The predefined token marking the end of the token stream. It cannot be
	12691	used in the grammar.
	12692	@end deffn
	12693
	12694	@deffn {Symbol} error
	12695	A token name reserved for error recovery. This token may be used in
	12696	grammar rules so as to allow the Bison parser to recognize an error in
	12697	the grammar without halting the process. In effect, a sentence
	12698	containing an error may be recognized as valid. On a syntax error, the
	12699	token @code{error} becomes the current lookahead token. Actions
	12700	corresponding to @code{error} are then executed, and the lookahead
	12701	token is reset to the token that originally caused the violation.
	12702	@xref{Error Recovery}.
	12703	@end deffn
	12704
	12705	@deffn {Directive} %error-verbose
	12706	An obsolete directive standing for @samp{%define parse.error verbose}
	12707	(@pxref{Error Reporting, ,The Error Reporting Function @code{yyerror}}).
	12708	@end deffn
	12709
	12710	@deffn {Directive} %file-prefix "@var{prefix}"
	12711	Bison declaration to set the prefix of the output files. @xref{Decl
	12712	Summary}.
	12713	@end deffn
	12714
	12715	@deffn {Directive} %glr-parser
	12716	Bison declaration to produce a GLR parser. @xref{GLR
	12717	Parsers, ,Writing GLR Parsers}.
	12718	@end deffn
	12719
	12720	@deffn {Directive} %initial-action
	12721	Run user code before parsing. @xref{Initial Action Decl, , Performing Actions before Parsing}.
	12722	@end deffn
	12723
	12724	@deffn {Directive} %language
	12725	Specify the programming language for the generated parser.
	12726	@xref{Decl Summary}.
	12727	@end deffn
	12728
	12729	@deffn {Directive} %left
	12730	Bison declaration to assign precedence and left associativity to token(s).
	12731	@xref{Precedence Decl, ,Operator Precedence}.
	12732	@end deffn
	12733
	12734	@deffn {Directive} %lex-param @{@var{argument-declaration}@} @dots{}
	12735	Bison declaration to specify additional arguments that
	12736	@code{yylex} should accept. @xref{Pure Calling,, Calling Conventions
	12737	for Pure Parsers}.
	12738	@end deffn
	12739
	12740	@deffn {Directive} %merge
	12741	Bison declaration to assign a merging function to a rule. If there is a
	12742	reduce/reduce conflict with a rule having the same merging function, the
	12743	function is applied to the two semantic values to get a single result.
	12744	@xref{GLR Parsers, ,Writing GLR Parsers}.
	12745	@end deffn
	12746
	12747	@deffn {Directive} %name-prefix "@var{prefix}"
	12748	Obsoleted by the @code{%define} variable @code{api.prefix} (@pxref{Multiple
	12749	Parsers, ,Multiple Parsers in the Same Program}).
	12750
	12751	Rename the external symbols (variables and functions) used in the parser so
	12752	that they start with @var{prefix} instead of @samp{yy}. Contrary to
	12753	@code{api.prefix}, do not rename types and macros.
	12754
	12755	The precise list of symbols renamed in C parsers is @code{yyparse},
	12756	@code{yylex}, @code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yychar},
	12757	@code{yydebug}, and (if locations are used) @code{yylloc}. If you use a
	12758	push parser, @code{yypush_parse}, @code{yypull_parse}, @code{yypstate},
	12759	@code{yypstate_new} and @code{yypstate_delete} will also be renamed. For
	12760	example, if you use @samp{%name-prefix "c_"}, the names become
	12761	@code{c_parse}, @code{c_lex}, and so on. For C++ parsers, see the
	12762	@code{%define api.namespace} documentation in this section.
	12763	@end deffn
	12764
	12765
	12766	@ifset defaultprec
	12767	@deffn {Directive} %no-default-prec
	12768	Do not assign a precedence to rules that lack an explicit @samp{%prec}
	12769	modifier. @xref{Contextual Precedence, ,Context-Dependent
	12770	Precedence}.
	12771	@end deffn
	12772	@end ifset
	12773
	12774	@deffn {Directive} %no-lines
	12775	Bison declaration to avoid generating @code{#line} directives in the
	12776	parser implementation file. @xref{Decl Summary}.
	12777	@end deffn
	12778
	12779	@deffn {Directive} %nonassoc
	12780	Bison declaration to assign precedence and nonassociativity to token(s).
	12781	@xref{Precedence Decl, ,Operator Precedence}.
	12782	@end deffn
	12783
	12784	@deffn {Directive} %output "@var{file}"
	12785	Bison declaration to set the name of the parser implementation file.
	12786	@xref{Decl Summary}.
	12787	@end deffn
	12788
	12789	@deffn {Directive} %param @{@var{argument-declaration}@} @dots{}
	12790	Bison declaration to specify additional arguments that both
	12791	@code{yylex} and @code{yyparse} should accept. @xref{Parser Function,, The
	12792	Parser Function @code{yyparse}}.
	12793	@end deffn
	12794
	12795	@deffn {Directive} %parse-param @{@var{argument-declaration}@} @dots{}
	12796	Bison declaration to specify additional arguments that @code{yyparse}
	12797	should accept. @xref{Parser Function,, The Parser Function @code{yyparse}}.
	12798	@end deffn
	12799
	12800	@deffn {Directive} %prec
	12801	Bison declaration to assign a precedence to a specific rule.
	12802	@xref{Contextual Precedence, ,Context-Dependent Precedence}.
	12803	@end deffn
	12804
	12805	@deffn {Directive} %precedence
	12806	Bison declaration to assign precedence to token(s), but no associativity.
	12807	@xref{Precedence Decl, ,Operator Precedence}.
	12808	@end deffn
	12809
	12810	@deffn {Directive} %pure-parser
	12811	Deprecated version of @samp{%define api.pure} (@pxref{%define
	12812	Summary,,api.pure}), for which Bison is more careful to warn about
	12813	unreasonable usage.
	12814	@end deffn
	12815
	12816	@deffn {Directive} %require "@var{version}"
	12817	Require version @var{version} or higher of Bison. @xref{Require Decl, ,
	12818	Require a Version of Bison}.
	12819	@end deffn
	12820
	12821	@deffn {Directive} %right
	12822	Bison declaration to assign precedence and right associativity to token(s).
	12823	@xref{Precedence Decl, ,Operator Precedence}.
	12824	@end deffn
	12825
	12826	@deffn {Directive} %skeleton
	12827	Specify the skeleton to use; usually for development.
	12828	@xref{Decl Summary}.
	12829	@end deffn
	12830
	12831	@deffn {Directive} %start
	12832	Bison declaration to specify the start symbol. @xref{Start Decl, ,The
	12833	Start-Symbol}.
	12834	@end deffn
	12835
	12836	@deffn {Directive} %token
	12837	Bison declaration to declare token(s) without specifying precedence.
	12838	@xref{Token Decl, ,Token Type Names}.
	12839	@end deffn
	12840
	12841	@deffn {Directive} %token-table
	12842	Bison declaration to include a token name table in the parser
	12843	implementation file. @xref{Decl Summary}.
	12844	@end deffn
	12845
	12846	@deffn {Directive} %type
	12847	Bison declaration to declare nonterminals. @xref{Type Decl,
	12848	,Nonterminal Symbols}.
	12849	@end deffn
	12850
	12851	@deffn {Symbol} $undefined
	12852	The predefined token onto which all undefined values returned by
	12853	@code{yylex} are mapped. It cannot be used in the grammar; rather, use
	12854	@code{error}.
	12855	@end deffn
	12856
	12857	@deffn {Directive} %union
	12858	Bison declaration to specify several possible data types for semantic
	12859	values. @xref{Union Decl, ,The Union Declaration}.
	12860	@end deffn
	12861
	12862	@deffn {Macro} YYABORT
	12863	Macro to pretend that an unrecoverable syntax error has occurred, by
	12864	making @code{yyparse} return 1 immediately. The error reporting
	12865	function @code{yyerror} is not called. @xref{Parser Function, ,The
	12866	Parser Function @code{yyparse}}.
	12867
	12868	For Java parsers, this functionality is invoked using @code{return YYABORT;}
	12869	instead.
	12870	@end deffn
	12871
	12872	@deffn {Macro} YYACCEPT
	12873	Macro to pretend that a complete utterance of the language has been
	12874	read, by making @code{yyparse} return 0 immediately.
	12875	@xref{Parser Function, ,The Parser Function @code{yyparse}}.
	12876
	12877	For Java parsers, this functionality is invoked using @code{return YYACCEPT;}
	12878	instead.
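
As a minimal sketch of how @code{YYACCEPT} and @code{YYABORT} might
appear in C actions, consider the fragment below. The tokens
@code{QUIT} and @code{FATAL} are hypothetical names chosen for
illustration only; they are not predefined by Bison.

@example
@group
line:
  exp '\n'
| QUIT '\n'   @{ YYACCEPT; @}  /* yyparse returns 0 at once.  */
| FATAL '\n'  @{ YYABORT; @}   /* yyparse returns 1 at once.  */
;
@end group
@end example

In both cases the rest of the input is left unread, and
@code{yyerror} is not called.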
	12879	@end deffn
	12880
	12881	@deffn {Macro} YYBACKUP
	12882	Macro to discard a value from the parser stack and fake a lookahead
	12883	token. @xref{Action Features, ,Special Features for Use in Actions}.
	12884	@end deffn
	12885
	12886	@deffn {Variable} yychar
	12887	External integer variable that contains the integer value of the
	12888	lookahead token. (In a pure parser, it is a local variable within
	12889	@code{yyparse}.) Error-recovery rule actions may examine this variable.
	12890	@xref{Action Features, ,Special Features for Use in Actions}.
	12891	@end deffn
	12892
	12893	@deffn {Macro} yyclearin
	12894	Macro used in error-recovery rule actions. It clears the previous
	12895	lookahead token. @xref{Error Recovery}.
	12896	@end deffn
	12897
	12898	@deffn {Macro} YYDEBUG
	12899	Macro to define to equip the parser with tracing code. @xref{Tracing,
	12900	,Tracing Your Parser}.
	12901	@end deffn
	12902
	12903	@deffn {Variable} yydebug
	12904	External integer variable set to zero by default. If @code{yydebug}
	12905	is given a nonzero value, the parser will output information on input
	12906	symbols and parser action. @xref{Tracing, ,Tracing Your Parser}.
	12907	@end deffn
	12908
	12909	@deffn {Macro} yyerrok
	12910	Macro to cause the parser to recover immediately to its normal mode
	12911	after a syntax error. @xref{Error Recovery}.
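
A minimal sketch of an error-recovery rule that uses @code{yyerrok},
assuming a line-oriented grammar in which a newline is a safe point at
which to resume (the nonterminals @code{line} and @code{exp} are
illustrative):

@example
@group
line:
  exp '\n'
| error '\n'  @{ yyerrok; @}  /* Resume normal parsing right away.  */
;
@end group
@end example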
	12912	@end deffn
	12913
	12914	@deffn {Macro} YYERROR
	12915	Cause an immediate syntax error. This statement initiates error
	12916	recovery just as if the parser itself had detected an error; however, it
	12917	does not call @code{yyerror}, and does not print any message. If you
	12918	want to print an error message, call @code{yyerror} explicitly before
	12919	the @samp{YYERROR;} statement. @xref{Error Recovery}.
	12920
	12921	For Java parsers, this functionality is invoked using @code{return YYERROR;}
	12922	instead.
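
The following sketch shows one way to combine an explicit message with
@code{YYERROR} in a C action. It assumes numeric semantic values and a
@code{yyerror} that takes a single string argument, which holds only
when no extra parser parameters or locations are passed to it.

@example
@group
exp:
  exp '/' exp
    @{
      if ($3 == 0)
        @{
          /* YYERROR prints nothing, so report the problem first.  */
          yyerror ("division by zero");
          YYERROR;
        @}
      $$ = $1 / $3;
    @}
;
@end group
@end example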
	12923	@end deffn
	12924
	12925	@deffn {Function} yyerror
	12926	User-supplied function to be called by @code{yyparse} on error.
	12927	@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
	12928	@end deffn
	12929
	12930	@deffn {Macro} YYERROR_VERBOSE
	12931	An obsolete macro used in the @file{yacc.c} skeleton, that you define
	12932	with @code{#define} in the prologue to request verbose, specific error
	12933	message strings when @code{yyerror} is called. It doesn't matter what
	12934	definition you use for @code{YYERROR_VERBOSE}, just whether you define
	12935	it. Using @samp{%define parse.error verbose} is preferred
	12936	(@pxref{Error Reporting, ,The Error Reporting Function @code{yyerror}}).
	12937	@end deffn
	12938
	12939	@deffn {Macro} YYFPRINTF
	12940	Macro used to output run-time traces.
	12941	@xref{Enabling Traces}.
	12942	@end deffn
	12943
	12944	@deffn {Macro} YYINITDEPTH
	12945	Macro for specifying the initial size of the parser stack.
	12946	@xref{Memory Management}.
	12947	@end deffn
	12948
	12949	@deffn {Function} yylex
	12950	User-supplied lexical analyzer function, called with no arguments to get
	12951	the next token. @xref{Lexical, ,The Lexical Analyzer Function
	12952	@code{yylex}}.
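
As a hedged illustration, a very small @code{yylex} for a non-pure
parser could look like the fragment below, placed in the epilogue of
the grammar file. It assumes that @code{YYSTYPE} is @code{int} and
that the grammar declares a token named @code{NUM}; both are
assumptions made for this sketch.

@example
#include <ctype.h>
#include <stdio.h>

int
yylex (void)
@{
  int c;

  /* Skip blanks and tabs.  */
  while ((c = getchar ()) == ' ' || c == '\t')
    continue;

  /* A number: store its value in yylval, return its token type.  */
  if (isdigit (c))
    @{
      ungetc (c, stdin);
      scanf ("%d", &yylval);
      return NUM;
    @}

  /* End of input is reported as token 0.  */
  if (c == EOF)
    return 0;

  /* Any other character is its own token.  */
  return c;
@}
@end example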
	12953	@end deffn
	12954
	12955	@deffn {Variable} yylloc
	12956	External variable in which @code{yylex} should place the line and column
	12957	numbers associated with a token. (In a pure parser, it is a local
	12958	variable within @code{yyparse}, and its address is passed to
	12959	@code{yylex}.)
	12960	You can ignore this variable if you don't use the @samp{@@} feature in the
	12961	grammar actions.
	12962	@xref{Token Locations, ,Textual Locations of Tokens}.
	12963	In semantic actions, it stores the location of the lookahead token.
	12964	@xref{Actions and Locations, ,Actions and Locations}.
	12965	@end deffn
	12966
	12967	@deffn {Type} YYLTYPE
	12968	Data type of @code{yylloc}; by default, a structure with four
	12969	members. @xref{Location Type, , Data Types of Locations}.
	12970	@end deffn
	12971
	12972	@deffn {Variable} yylval
	12973	External variable in which @code{yylex} should place the semantic
	12974	value associated with a token. (In a pure parser, it is a local
	12975	variable within @code{yyparse}, and its address is passed to
	12976	@code{yylex}.)
	12977	@xref{Token Values, ,Semantic Values of Tokens}.
	12978	In semantic actions, it stores the semantic value of the lookahead token.
	12979	@xref{Actions, ,Actions}.
	12980	@end deffn
	12981
	12982	@deffn {Macro} YYMAXDEPTH
	12983	Macro for specifying the maximum size of the parser stack. @xref{Memory
	12984	Management}.
	12985	@end deffn
	12986
	12987	@deffn {Variable} yynerrs
	12988	Global variable which Bison increments each time it reports a syntax error.
	12989	(In a pure parser, it is a local variable within @code{yyparse}. In a
	12990	pure push parser, it is a member of @code{yypstate}.)
	12991	@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
	12992	@end deffn
	12993
	12994	@deffn {Function} yyparse
	12995	The parser function produced by Bison; call this function to start
	12996	parsing. @xref{Parser Function, ,The Parser Function @code{yyparse}}.
	12997	@end deffn
	12998
	12999	@deffn {Macro} YYPRINT
	13000	Macro used to output token semantic values. For @file{yacc.c} only.
	13001	Obsoleted by @code{%printer}.
	13002	@xref{The YYPRINT Macro, , The @code{YYPRINT} Macro}.
	13003	@end deffn
	13004
	13005	@deffn {Function} yypstate_delete
	13006	The function to delete a parser instance, produced by Bison in push mode;
	13007	call this function to delete the memory associated with a parser.
	13008	@xref{Parser Delete Function, ,The Parser Delete Function
	13009	@code{yypstate_delete}}.
	13010	(The current push parsing interface is experimental and may evolve.
	13011	More user feedback will help to stabilize it.)
	13012	@end deffn
	13013
	13014	@deffn {Function} yypstate_new
	13015	The function to create a parser instance, produced by Bison in push mode;
	13016	call this function to create a new parser.
	13017	@xref{Parser Create Function, ,The Parser Create Function
	13018	@code{yypstate_new}}.
	13019	(The current push parsing interface is experimental and may evolve.
	13020	More user feedback will help to stabilize it.)
	13021	@end deffn
	13022
	13023	@deffn {Function} yypull_parse
	13024	The parser function produced by Bison in push mode; call this function to
	13025	parse the rest of the input stream.
	13026	@xref{Pull Parser Function, ,The Pull Parser Function
	13027	@code{yypull_parse}}.
	13028	(The current push parsing interface is experimental and may evolve.
	13029	More user feedback will help to stabilize it.)
	13030	@end deffn
	13031
	13032	@deffn {Function} yypush_parse
	13033	The parser function produced by Bison in push mode; call this function to
	13034	parse a single token. @xref{Push Parser Function, ,The Push Parser Function
	13035	@code{yypush_parse}}.
	13036	(The current push parsing interface is experimental and may evolve.
	13037	More user feedback will help to stabilize it.)
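
A typical driving loop might be sketched as follows. It assumes a pure
push parser (@samp{%define api.push-pull push} together with
@samp{%define api.pure}) without locations, so that
@code{yypush_parse} takes the parser instance, the token, and a
pointer to its semantic value. The function @code{next_token} is a
hypothetical scanner supplied by the caller, and @code{NULL} stands in
for the semantic value only to keep the sketch short.

@example
@group
int status;
yypstate *ps = yypstate_new ();
do
  @{
    /* Feed one token at a time to the parser.  */
    status = yypush_parse (ps, next_token (), NULL);
  @}
while (status == YYPUSH_MORE);
yypstate_delete (ps);
@end group
@end example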
	13038	@end deffn
	13039
	13040	@deffn {Macro} YYRECOVERING
	13041	The expression @code{YYRECOVERING ()} yields 1 when the parser
	13042	is recovering from a syntax error, and 0 otherwise.
	13043	@xref{Action Features, ,Special Features for Use in Actions}.
	13044	@end deffn
	13045
	13046	@deffn {Macro} YYSTACK_USE_ALLOCA
	13047	Macro used to control the use of @code{alloca} when the
	13048	deterministic parser in C needs to extend its stacks. If defined to 0,
	13049	the parser will use @code{malloc} to extend its stacks. If defined to
	13050	1, the parser will use @code{alloca}. Values other than 0 and 1 are
	13051	reserved for future Bison extensions. If not defined,
	13052	@code{YYSTACK_USE_ALLOCA} defaults to 0.
	13053
	13054	In the all-too-common case where your code may run on a host with a
	13055	limited stack and with unreliable stack-overflow checking, you should
	13056	set @code{YYMAXDEPTH} to a value that cannot possibly result in
	13057	unchecked stack overflow on any of your target hosts when
	13058	@code{alloca} is called. You can inspect the code that Bison
	13059	generates in order to determine the proper numeric values. This will
	13060	require some expertise in low-level implementation details.
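
As a sketch, these macros can be defined in the prologue of the
grammar file, ahead of the parser code that consults them. The
numeric values below are purely illustrative and are not
recommendations for any particular host.

@example
@group
%@{
  /* Extend the stacks with malloc rather than alloca.  */
  #define YYSTACK_USE_ALLOCA 0
  /* Illustrative sizes only; choose values that suit your hosts.  */
  #define YYINITDEPTH 200
  #define YYMAXDEPTH 10000
%@}
@end group
@end example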
	13061	@end deffn
	13062
	13063	@deffn {Type} YYSTYPE
	13064	Deprecated in favor of the @code{%define} variable @code{api.value.type}.
	13065	Data type of semantic values; @code{int} by default.
	13066	@xref{Value Type, ,Data Types of Semantic Values}.
	13067	@end deffn
	13068
	13069	@node Glossary
	13070	@appendix Glossary
	13071	@cindex glossary
	13072
	13073	@table @asis
	13074	@item Accepting state
	13075	A state whose only action is the accept action.
	13076	The accepting state is thus a consistent state.
	13077	@xref{Understanding, ,Understanding Your Parser}.
	13078
	13079	@item Backus-Naur Form (BNF; also called ``Backus Normal Form'')
	13080	Formal method of specifying context-free grammars originally proposed
	13081	by John Backus, and slightly improved by Peter Naur in his 1960-01-02
	13082	committee document contributing to what became the Algol 60 report.
	13083	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	13084
	13085	@item Consistent state
	13086	A state containing only one possible action. @xref{Default Reductions}.
	13087
	13088	@item Context-free grammars
	13089	Grammars specified as rules that can be applied regardless of context.
	13090	Thus, if there is a rule which says that an integer can be used as an
	13091	expression, integers are allowed @emph{anywhere} an expression is
	13092	permitted. @xref{Language and Grammar, ,Languages and Context-Free
	13093	Grammars}.
	13094
	13095	@item Default reduction
	13096	The reduction that a parser should perform if the current parser state
	13097	contains no other action for the lookahead token. In permitted parser
	13098	states, Bison declares the reduction with the largest lookahead set to be
	13099	the default reduction and removes that lookahead set. @xref{Default
	13100	Reductions}.
	13101
	13102	@item Defaulted state
	13103	A consistent state with a default reduction. @xref{Default Reductions}.
	13104
	13105	@item Dynamic allocation
	13106	Allocation of memory that occurs during execution, rather than at
	13107	compile time or on entry to a function.
	13108
	13109	@item Empty string
	13110	Analogous to the empty set in set theory, the empty string is a
	13111	character string of length zero.
	13112
	13113	@item Finite-state stack machine
	13114	A ``machine'' that has discrete states in which it is said to exist at
	13115	each instant in time. As input to the machine is processed, the
	13116	machine moves from state to state as specified by the logic of the
	13117	machine. In the case of the parser, the input is the language being
	13118	parsed, and the states correspond to various stages in the grammar
	13119	rules. @xref{Algorithm, ,The Bison Parser Algorithm}.
	13120
	13121	@item Generalized LR (GLR)
	13122	A parsing algorithm that can handle all context-free grammars, including those
	13123	that are not LR(1). It resolves situations that Bison's
	13124	deterministic parsing
	13125	algorithm cannot by effectively splitting off multiple parsers, trying all
	13126	possible parsers, and discarding those that fail in the light of additional
	13127	right context. @xref{Generalized LR Parsing, ,Generalized
	13128	LR Parsing}.
	13129
	13130	@item Grouping
	13131	A language construct that is (in general) grammatically divisible;
	13132	for example, `expression' or `declaration' in C@.
	13133	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	13134
	13135	@item IELR(1) (Inadequacy Elimination LR(1))
	13136	A minimal LR(1) parser table construction algorithm. That is, given any
	13137	context-free grammar, IELR(1) generates parser tables with the full
	13138	language-recognition power of canonical LR(1) but with nearly the same
	13139	number of parser states as LALR(1). This reduction in parser states is
	13140	often an order of magnitude. More importantly, because canonical LR(1)'s
	13141	extra parser states may contain duplicate conflicts in the case of non-LR(1)
	13142	grammars, the number of conflicts for IELR(1) is often an order of magnitude
	13143	less as well. This can significantly reduce the complexity of developing a
	13144	grammar. @xref{LR Table Construction}.
	13145
	13146	@item Infix operator
	13147	An arithmetic operator that is placed between the operands on which it
	13148	performs some operation.
	13149
	13150	@item Input stream
	13151	A continuous flow of data between devices or programs.
	13152
	13153	@item LAC (Lookahead Correction)
	13154	A parsing mechanism that fixes the problem of delayed syntax error
	13155	detection, which is caused by LR state merging, default reductions, and the
	13156	use of @code{%nonassoc}. Delayed syntax error detection results in
	13157	unexpected semantic actions, initiation of error recovery in the wrong
	13158	syntactic context, and an incorrect list of expected tokens in a verbose
	13159	syntax error message. @xref{LAC}.
	13160
	13161	@item Language construct
	13162	One of the typical usage schemas of the language. For example, one of
	13163	the constructs of the C language is the @code{if} statement.
	13164	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	13165
	13166	@item Left associativity
	13167	Operators having left associativity are analyzed from left to right:
	13168	@samp{a+b+c} first computes @samp{a+b} and then combines with
	13169	@samp{c}. @xref{Precedence, ,Operator Precedence}.
	13170
	13171	@item Left recursion
	13172	A rule whose result symbol is also its first component symbol; for
	13173	example, @samp{expseq1 : expseq1 ',' exp;}. @xref{Recursion, ,Recursive
	13174	Rules}.
	13175
	13176	@item Left-to-right parsing
	13177	Parsing a sentence of a language by analyzing it token by token from
	13178	left to right. @xref{Algorithm, ,The Bison Parser Algorithm}.
	13179
	13180	@item Lexical analyzer (scanner)
	13181	A function that reads an input stream and returns tokens one by one.
	13182	@xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}.
	13183
	13184	@item Lexical tie-in
	13185	A flag, set by actions in the grammar rules, which alters the way
	13186	tokens are parsed. @xref{Lexical Tie-ins}.
	13187
	13188	@item Literal string token
	13189	A token which consists of two or more fixed characters. @xref{Symbols}.
	13190
	13191	@item Lookahead token
	13192	A token already read but not yet shifted. @xref{Lookahead, ,Lookahead
	13193	Tokens}.
	13194
	13195	@item LALR(1)
	13196	The class of context-free grammars that Bison (like most other parser
	13197	generators) can handle by default; a subset of LR(1).
	13198	@xref{Mysterious Conflicts}.
	13199
	13200	@item LR(1)
	13201	The class of context-free grammars in which at most one token of
	13202	lookahead is needed to disambiguate the parsing of any piece of input.
	13203
	13204	@item Nonterminal symbol
	13205	A grammar symbol standing for a grammatical construct that can
	13206	be expressed through rules in terms of smaller constructs; in other
	13207	words, a construct that is not a token. @xref{Symbols}.
	13208
	13209	@item Parser
	13210	A function that recognizes valid sentences of a language by analyzing
	13211	the syntax structure of a set of tokens passed to it from a lexical
	13212	analyzer.
	13213
	13214	@item Postfix operator
	13215	An arithmetic operator that is placed after the operands upon which it
	13216	performs some operation.
	13217
	13218	@item Reduction
	13219	Replacing a string of nonterminals and/or terminals with a single
	13220	nonterminal, according to a grammar rule. @xref{Algorithm, ,The Bison
	13221	Parser Algorithm}.
	13222
	13223	@item Reentrant
	13224	A reentrant subprogram is a subprogram which can be invoked any
	13225	number of times in parallel, without interference between the various
	13226	invocations. @xref{Pure Decl, ,A Pure (Reentrant) Parser}.
	13227
	13228	@item Reverse Polish notation
	13229	A language in which all operators are postfix operators.
	13230
	13231	@item Right recursion
	13232	A rule whose result symbol is also its last component symbol; for
	13233	example, @samp{expseq1: exp ',' expseq1;}. @xref{Recursion, ,Recursive
	13234	Rules}.
	13235
	13236	@item Semantics
	13237	In computer languages, the semantics are specified by the actions
	13238	taken for each instance of the language, i.e., the meaning of
	13239	each statement. @xref{Semantics, ,Defining Language Semantics}.
	13240
	13241	@item Shift
	13242	A parser is said to shift when it makes the choice of analyzing
	13243	further input from the stream rather than immediately reducing some
	13244	already-recognized rule. @xref{Algorithm, ,The Bison Parser Algorithm}.
	13245
	13246	@item Single-character literal
	13247	A single character that is recognized and interpreted as is.
	13248	@xref{Grammar in Bison, ,From Formal Rules to Bison Input}.
	13249
	13250	@item Start symbol
	13251	The nonterminal symbol that stands for a complete valid utterance in
	13252	the language being parsed. The start symbol is usually listed as the
	13253	first nonterminal symbol in a language specification.
	13254	@xref{Start Decl, ,The Start-Symbol}.
	13255
	13256	@item Symbol table
	13257	A data structure where symbol names and associated data are stored
	13258	during parsing to allow for recognition and use of existing
	13259	information in repeated uses of a symbol. @xref{Multi-function Calc}.
	13260
	13261	@item Syntax error
	13262	An error encountered during parsing of an input stream due to invalid
	13263	syntax. @xref{Error Recovery}.
	13264
	13265	@item Token
	13266	A basic, grammatically indivisible unit of a language. The symbol
	13267	that describes a token in the grammar is a terminal symbol.
	13268	The input of the Bison parser is a stream of tokens which comes from
	13269	the lexical analyzer. @xref{Symbols}.
	13270
	13271	@item Terminal symbol
	13272	A grammar symbol that has no rules in the grammar and therefore is
	13273	grammatically indivisible. The piece of text it represents is a token.
	13274	@xref{Language and Grammar, ,Languages and Context-Free Grammars}.
	13275
	13276	@item Unreachable state
	13277	A parser state that no sequence of transitions from the parser's start
	13278	state can reach. A state can become unreachable during conflict
	13279	resolution. @xref{Unreachable States}.
	13280	@end table
	13281
	13282	@node Copying This Manual
	13283	@appendix Copying This Manual
	13284	@include fdl.texi
	13285
	13286	@node Bibliography
	13287	@unnumbered Bibliography
	13288
	13289	@table @asis
	13290	@item [Denny 2008]
	13291	Joel E. Denny and Brian A. Malloy, IELR(1): Practical LR(1) Parser Tables
	13292	for Non-LR(1) Grammars with Conflict Resolution, in @cite{Proceedings of the
	13293	2008 ACM Symposium on Applied Computing} (SAC'08), ACM, New York, NY, USA,
	13294	pp.@: 240--245. @uref{http://dx.doi.org/10.1145/1363686.1363747}
	13295
	13296	@item [Denny 2010 May]
	13297	Joel E. Denny, PSLR(1): Pseudo-Scannerless Minimal LR(1) for the
	13298	Deterministic Parsing of Composite Languages, Ph.D. Dissertation, Clemson
	13299	University, Clemson, SC, USA (May 2010).
	13300	@uref{http://proquest.umi.com/pqdlink?did=2041473591&Fmt=7&clientId=79356&RQT=309&VName=PQD}
	13301
	13302	@item [Denny 2010 November]
	13303	Joel E. Denny and Brian A. Malloy, The IELR(1) Algorithm for Generating
	13304	Minimal LR(1) Parser Tables for Non-LR(1) Grammars with Conflict Resolution,
	13305	in @cite{Science of Computer Programming}, Vol.@: 75, Issue 11 (November
	13306	2010), pp.@: 943--979. @uref{http://dx.doi.org/10.1016/j.scico.2009.08.001}
	13307
	13308	@item [DeRemer 1982]
	13309	Frank DeRemer and Thomas Pennello, Efficient Computation of LALR(1)
	13310	Look-Ahead Sets, in @cite{ACM Transactions on Programming Languages and
	13311	Systems}, Vol.@: 4, No.@: 4 (October 1982), pp.@:
	13312	615--649. @uref{http://dx.doi.org/10.1145/69622.357187}
	13313
	13314	@item [Knuth 1965]
	13315	Donald E. Knuth, On the Translation of Languages from Left to Right, in
	13316	@cite{Information and Control}, Vol.@: 8, Issue 6 (December 1965), pp.@:
	13317	607--639. @uref{http://dx.doi.org/10.1016/S0019-9958(65)90426-2}
	13318
	13319	@item [Scott 2000]
	13320	Elizabeth Scott, Adrian Johnstone, and Shamsa Sadaf Hussain,
	13321	@cite{Tomita-Style Generalised LR Parsers}, Royal Holloway, University of
	13322	London, Department of Computer Science, TR-00-12 (December 2000).
	13323	@uref{http://www.cs.rhul.ac.uk/research/languages/publications/tomita_style_1.ps}
	13324	@end table
	13325
	13326	@node Index of Terms
	13327	@unnumbered Index of Terms
	13328
	13329	@printindex cp
	13330
	13331	@bye
	13332
	13333	@c LocalWords: texinfo setfilename settitle setchapternewpage finalout texi FSF
	13334	@c LocalWords: ifinfo smallbook shorttitlepage titlepage GPL FIXME iftex FSF's
	13335	@c LocalWords: akim fn cp syncodeindex vr tp synindex dircategory direntry Naur
	13336	@c LocalWords: ifset vskip pt filll insertcopying sp ISBN Etienne Suvasa Multi
	13337	@c LocalWords: ifnottex yyparse detailmenu GLR RPN Calc var Decls Rpcalc multi
	13338	@c LocalWords: rpcalc Lexer Expr ltcalc mfcalc yylex defaultprec Donnelly Gotos
	13339	@c LocalWords: yyerror pxref LR yylval cindex dfn LALR samp gpl BNF xref yypush
	13340	@c LocalWords: const int paren ifnotinfo AC noindent emph expr stmt findex lr
	13341	@c LocalWords: glr YYSTYPE TYPENAME prog dprec printf decl init stmtMerge POSIX
	13342	@c LocalWords: pre STDC GNUC endif yy YY alloca lf stddef stdlib YYDEBUG yypull
	13343	@c LocalWords: NUM exp subsubsection kbd Ctrl ctype EOF getchar isdigit nonfree
	13344	@c LocalWords: ungetc stdin scanf sc calc ulator ls lm cc NEG prec yyerrok rr
	13345	@c LocalWords: longjmp fprintf stderr yylloc YYLTYPE cos ln Stallman Destructor
	13346	@c LocalWords: symrec val tptr FNCT fnctptr func struct sym enum IEC syntaxes
	13347	@c LocalWords: fnct putsym getsym fname arith fncts atan ptr malloc sizeof Lex
	13348	@c LocalWords: strlen strcpy fctn strcmp isalpha symbuf realloc isalnum DOTDOT
	13349	@c LocalWords: ptypes itype YYPRINT trigraphs yytname expseq vindex dtype Unary
	13350	@c LocalWords: Rhs YYRHSLOC LE nonassoc op deffn typeless yynerrs nonterminal
	13351	@c LocalWords: yychar yydebug msg YYNTOKENS YYNNTS YYNRULES YYNSTATES reentrant
	13352	@c LocalWords: cparse clex deftypefun NE defmac YYACCEPT YYABORT param yypstate
	13353	@c LocalWords: strncmp intval tindex lvalp locp llocp typealt YYBACKUP subrange
	13354	@c LocalWords: YYEMPTY YYEOF YYRECOVERING yyclearin GE def UMINUS maybeword loc
	13355	@c LocalWords: Johnstone Shamsa Sadaf Hussain Tomita TR uref YYMAXDEPTH inline
	13356	@c LocalWords: YYINITDEPTH stmts ref initdcl maybeasm notype Lookahead yyoutput
	13357	@c LocalWords: hexflag STR exdent itemset asis DYYDEBUG YYFPRINTF args Autoconf
	13358	@c LocalWords: infile ypp yxx outfile itemx tex leaderfill Troubleshouting sqrt
	13359	@c LocalWords: hbox hss hfill tt ly yyin fopen fclose ofirst gcc ll lookahead
	13360	@c LocalWords: nbar yytext fst snd osplit ntwo strdup AST Troublereporting th
	13361	@c LocalWords: YYSTACK DVI fdl printindex IELR nondeterministic nonterminals ps
	13362	@c LocalWords: subexpressions declarator nondeferred config libintl postfix LAC
	13363	@c LocalWords: preprocessor nonpositive unary nonnumeric typedef extern rhs sr
	13364	@c LocalWords: yytokentype destructor multicharacter nonnull EBCDIC nterm LR's
	13365	@c LocalWords: lvalue nonnegative XNUM CHR chr TAGLESS tagless stdout api TOK
	13366	@c LocalWords: destructors Reentrancy nonreentrant subgrammar nonassociative Ph
	13367	@c LocalWords: deffnx namespace xml goto lalr ielr runtime lex yacc yyps env
	13368	@c LocalWords: yystate variadic Unshift NLS gettext po UTF Automake LOCALEDIR
	13369	@c LocalWords: YYENABLE bindtextdomain Makefile DEFS CPPFLAGS DBISON DeRemer
	13370	@c LocalWords: autoreconf Pennello multisets nondeterminism Generalised baz ACM
	13371	@c LocalWords: redeclare automata Dparse localedir datadir XSLT midrule Wno
	13372	@c LocalWords: Graphviz multitable headitem hh basename Doxygen fno filename
	13373	@c LocalWords: doxygen ival sval deftypemethod deallocate pos deftypemethodx
	13374	@c LocalWords: Ctor defcv defcvx arg accessors arithmetics CPP ifndef CALCXX
	13375	@c LocalWords: lexer's calcxx bool LPAREN RPAREN deallocation cerrno climits
	13376	@c LocalWords: cstdlib Debian undef yywrap unput noyywrap nounput zA yyleng
	13377	@c LocalWords: errno strtol ERANGE str strerror iostream argc argv Javadoc PSLR
	13378	@c LocalWords: bytecode initializers superclass stype ASTNode autoboxing nls
	13379	@c LocalWords: toString deftypeivar deftypeivarx deftypeop YYParser strictfp
	13380	@c LocalWords: superclasses boolean getErrorVerbose setErrorVerbose deftypecv
	13381	@c LocalWords: getDebugStream setDebugStream getDebugLevel setDebugLevel url
	13382	@c LocalWords: bisonVersion deftypecvx bisonSkeleton getStartPos getEndPos uint
	13383	@c LocalWords: getLVal defvar deftypefn deftypefnx gotos msgfmt Corbett LALR's
	13384	@c LocalWords: subdirectory Solaris nonassociativity perror schemas Malloy ints
	13385	@c LocalWords: Scannerless ispell american ChangeLog smallexample CSTYPE CLTYPE
	13386	@c LocalWords: clval CDEBUG cdebug deftypeopx yyterminate LocationType
	13387	@c LocalWords: parsers parser's
	13388	@c LocalWords: associativity subclasses precedences unresolvable runnable
	13389	@c LocalWords: allocators subunit initializations unreferenced untyped
	13390	@c LocalWords: errorVerbose subtype subtypes
	13391
	13392	@c Local Variables:
	13393	@c ispell-dictionary: "american"
	13394	@c fill-column: 76
	13395	@c End: