X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/e254a580b550c8cbaff1709527cd896d972df010..2bde91138d696cd5b05eeea61ebf709695ca695e:/doc/bison.texinfo?ds=inline diff --git a/doc/bison.texinfo b/doc/bison.texinfo index c6fc0a5f..302bc4a1 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -89,76 +89,76 @@ Cover art by Etienne Suvasa. @menu * Introduction:: * Conditions:: -* Copying:: The @acronym{GNU} General Public License says - how you can copy and share Bison +* Copying:: The @acronym{GNU} General Public License says + how you can copy and share Bison. Tutorial sections: -* Concepts:: Basic concepts for understanding Bison. -* Examples:: Three simple explained examples of using Bison. +* Concepts:: Basic concepts for understanding Bison. +* Examples:: Three simple explained examples of using Bison. Reference sections: -* Grammar File:: Writing Bison declarations and rules. -* Interface:: C-language interface to the parser function @code{yyparse}. -* Algorithm:: How the Bison parser works at run-time. -* Error Recovery:: Writing rules for error recovery. +* Grammar File:: Writing Bison declarations and rules. +* Interface:: C-language interface to the parser function @code{yyparse}. +* Algorithm:: How the Bison parser works at run-time. +* Error Recovery:: Writing rules for error recovery. * Context Dependency:: What to do if your language syntax is too - messy for Bison to handle straightforwardly. -* Debugging:: Understanding or debugging Bison parsers. -* Invocation:: How to run Bison (to produce the parser source file). -* Other Languages:: Creating C++ and Java parsers. -* FAQ:: Frequently Asked Questions -* Table of Symbols:: All the keywords of the Bison language are explained. -* Glossary:: Basic concepts are explained. -* Copying This Manual:: License for copying this manual. -* Index:: Cross-references to the text. + messy for Bison to handle straightforwardly. +* Debugging:: Understanding or debugging Bison parsers. 
+* Invocation:: How to run Bison (to produce the parser source file). +* Other Languages:: Creating C++ and Java parsers. +* FAQ:: Frequently Asked Questions +* Table of Symbols:: All the keywords of the Bison language are explained. +* Glossary:: Basic concepts are explained. +* Copying This Manual:: License for copying this manual. +* Index:: Cross-references to the text. @detailmenu --- The Detailed Node Listing --- The Concepts of Bison -* Language and Grammar:: Languages and context-free grammars, - as mathematical ideas. -* Grammar in Bison:: How we represent grammars for Bison's sake. -* Semantic Values:: Each token or syntactic grouping can have - a semantic value (the value of an integer, - the name of an identifier, etc.). -* Semantic Actions:: Each rule can have an action containing C code. -* GLR Parsers:: Writing parsers for general context-free languages. -* Locations Overview:: Tracking Locations. -* Bison Parser:: What are Bison's input and output, - how is the output used? -* Stages:: Stages in writing and running Bison grammars. -* Grammar Layout:: Overall structure of a Bison grammar file. +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. +* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* GLR Parsers:: Writing parsers for general context-free languages. +* Locations Overview:: Tracking Locations. +* Bison Parser:: What are Bison's input and output, + how is the output used? +* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. Writing @acronym{GLR} Parsers -* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. -* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. 
-* GLR Semantic Actions:: Deferred semantic actions have special concerns. -* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. +* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. +* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. +* GLR Semantic Actions:: Deferred semantic actions have special concerns. +* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. Examples -* RPN Calc:: Reverse polish notation calculator; - a first example with no operator precedence. -* Infix Calc:: Infix (algebraic) notation calculator. - Operator precedence is introduced. +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. * Simple Error Recovery:: Continuing after syntax errors. * Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$. -* Multi-function Calc:: Calculator with memory and trig functions. - It uses multiple data-types for semantic values. -* Exercises:: Ideas for improving the multi-function calculator. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. +* Exercises:: Ideas for improving the multi-function calculator. Reverse Polish Notation Calculator -* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc. -* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. -* Lexer: Rpcalc Lexer. The lexical analyzer. -* Main: Rpcalc Main. The controlling function. -* Error: Rpcalc Error. The error reporting function. -* Gen: Rpcalc Gen. Running Bison on the grammar file. -* Comp: Rpcalc Compile. Run the C compiler on the output code. +* Rpcalc Declarations:: Prologue (declarations) for rpcalc. +* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation. +* Rpcalc Lexer:: The lexical analyzer. +* Rpcalc Main:: The controlling function. 
+* Rpcalc Error:: The error reporting function. +* Rpcalc Generate:: Running Bison on the grammar file. +* Rpcalc Compile:: Run the C compiler on the output code. Grammar Rules for @code{rpcalc} @@ -168,15 +168,15 @@ Grammar Rules for @code{rpcalc} Location Tracking Calculator: @code{ltcalc} -* Decls: Ltcalc Decls. Bison and C declarations for ltcalc. -* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations. -* Lexer: Ltcalc Lexer. The lexical analyzer. +* Ltcalc Declarations:: Bison and C declarations for ltcalc. +* Ltcalc Rules:: Grammar rules for ltcalc, with explanations. +* Ltcalc Lexer:: The lexical analyzer. Multi-Function Calculator: @code{mfcalc} -* Decl: Mfcalc Decl. Bison declarations for multi-function calculator. -* Rules: Mfcalc Rules. Grammar rules for the calculator. -* Symtab: Mfcalc Symtab. Symbol table management subroutines. +* Mfcalc Declarations:: Bison declarations for multi-function calculator. +* Mfcalc Rules:: Grammar rules for the calculator. +* Mfcalc Symbol Table:: Symbol table management subroutines. Bison Grammar Files @@ -191,11 +191,11 @@ Bison Grammar Files Outline of a Bison Grammar -* Prologue:: Syntax and usage of the prologue. +* Prologue:: Syntax and usage of the prologue. * Prologue Alternatives:: Syntax and usage of alternatives to the prologue. -* Bison Declarations:: Syntax and usage of the Bison declarations section. -* Grammar Rules:: Syntax and usage of the grammar rules section. -* Epilogue:: Syntax and usage of the epilogue. +* Bison Declarations:: Syntax and usage of the Bison declarations section. +* Grammar Rules:: Syntax and usage of the grammar rules section. +* Epilogue:: Syntax and usage of the epilogue. Defining Language Semantics @@ -230,24 +230,28 @@ Bison Declarations Parser C-Language Interface -* Parser Function:: How to call @code{yyparse} and what it returns. -* Lexical:: You must supply a function @code{yylex} - which reads tokens. 
-* Error Reporting:: You must supply a function @code{yyerror}. -* Action Features:: Special features for use in actions. -* Internationalization:: How to let the parser speak in the user's - native language. +* Parser Function:: How to call @code{yyparse} and what it returns. +* Push Parser Function:: How to call @code{yypush_parse} and what it returns. +* Pull Parser Function:: How to call @code{yypull_parse} and what it returns. +* Parser Create Function:: How to call @code{yypstate_new} and what it returns. +* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns. +* Lexical:: You must supply a function @code{yylex} + which reads tokens. +* Error Reporting:: You must supply a function @code{yyerror}. +* Action Features:: Special features for use in actions. +* Internationalization:: How to let the parser speak in the user's + native language. The Lexical Analyzer Function @code{yylex} * Calling Convention:: How @code{yyparse} calls @code{yylex}. -* Token Values:: How @code{yylex} must return the semantic value - of the token it has read. -* Token Locations:: How @code{yylex} must return the text location - (line number, etc.) of the token, if the - actions want that. -* Pure Calling:: How the calling convention differs - in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). +* Token Values:: How @code{yylex} must return the semantic value + of the token it has read. +* Token Locations:: How @code{yylex} must return the text location + (line number, etc.) of the token, if the + actions want that. +* Pure Calling:: How the calling convention differs in a pure parser + (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). The Bison Parser Algorithm @@ -257,14 +261,15 @@ The Bison Parser Algorithm * Contextual Precedence:: When an operator's precedence depends on context. * Parser States:: The parser is a finite-state-machine with stack. * Reduce/Reduce:: When two rules are applicable in the same situation. 
-* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. +* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. * Generalized LR Parsing:: Parsing arbitrary context-free grammars. * Memory Management:: What happens when memory is exhausted. How to avoid it. Operator Precedence * Why Precedence:: An example showing why precedence is needed. -* Using Precedence:: How to specify precedence in Bison grammars. +* Using Precedence:: How to specify precedence and associativity. +* Precedence Only:: How to specify precedence only. * Precedence Examples:: How these features are used in the previous example. * How Precedence:: How they work. @@ -311,33 +316,33 @@ A Complete C++ Example Java Parsers -* Java Bison Interface:: Asking for Java parser generation -* Java Semantic Values:: %type and %token vs. Java -* Java Location Values:: The position and location classes -* Java Parser Interface:: Instantiating and running the parser -* Java Scanner Interface:: Specifying the scanner for the parser -* Java Action Features:: Special features for use in actions. -* Java Differences:: Differences between C/C++ and Java Grammars -* Java Declarations Summary:: List of Bison declarations used with Java +* Java Bison Interface:: Asking for Java parser generation +* Java Semantic Values:: %type and %token vs. 
Java +* Java Location Values:: The position and location classes +* Java Parser Interface:: Instantiating and running the parser +* Java Scanner Interface:: Specifying the scanner for the parser +* Java Action Features:: Special features for use in actions +* Java Differences:: Differences between C/C++ and Java Grammars +* Java Declarations Summary:: List of Bison declarations used with Java Frequently Asked Questions -* Memory Exhausted:: Breaking the Stack Limits -* How Can I Reset the Parser:: @code{yyparse} Keeps some State -* Strings are Destroyed:: @code{yylval} Loses Track of Strings -* Implementing Gotos/Loops:: Control Flow in the Calculator -* Multiple start-symbols:: Factoring closely related grammars -* Secure? Conform?:: Is Bison @acronym{POSIX} safe? -* I can't build Bison:: Troubleshooting -* Where can I find help?:: Troubleshouting -* Bug Reports:: Troublereporting -* Other Languages:: Parsers in Java and others -* Beta Testing:: Experimenting development versions -* Mailing Lists:: Meeting other Bison users +* Memory Exhausted:: Breaking the Stack Limits +* How Can I Reset the Parser:: @code{yyparse} Keeps some State +* Strings are Destroyed:: @code{yylval} Loses Track of Strings +* Implementing Gotos/Loops:: Control Flow in the Calculator +* Multiple start-symbols:: Factoring closely related grammars +* Secure? Conform?:: Is Bison @acronym{POSIX} safe? +* I can't build Bison:: Troubleshooting +* Where can I find help?:: Troubleshouting +* Bug Reports:: Troublereporting +* More Languages:: Parsers in C++, Java, and so on +* Beta Testing:: Experimenting development versions +* Mailing Lists:: Meeting other Bison users Copying This Manual -* Copying This Manual:: License for copying this manual. +* Copying This Manual:: License for copying this manual. @end detailmenu @end menu @@ -417,19 +422,19 @@ details of Bison will not make sense. If you do not already know how to use Bison or Yacc, we suggest you start by reading this chapter carefully. 
@menu -* Language and Grammar:: Languages and context-free grammars, - as mathematical ideas. -* Grammar in Bison:: How we represent grammars for Bison's sake. -* Semantic Values:: Each token or syntactic grouping can have - a semantic value (the value of an integer, - the name of an identifier, etc.). -* Semantic Actions:: Each rule can have an action containing C code. -* GLR Parsers:: Writing parsers for general context-free languages. -* Locations Overview:: Tracking Locations. -* Bison Parser:: What are Bison's input and output, - how is the output used? -* Stages:: Stages in writing and running Bison grammars. -* Grammar Layout:: Overall structure of a Bison grammar file. +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. +* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* GLR Parsers:: Writing parsers for general context-free languages. +* Locations Overview:: Tracking Locations. +* Bison Parser:: What are Bison's input and output, + how is the output used? +* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. @end menu @node Language and Grammar @@ -745,10 +750,10 @@ user-defined function on the resulting values to produce an arbitrary merged result. @menu -* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. -* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. -* GLR Semantic Actions:: Deferred semantic actions have special concerns. -* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. +* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. +* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. 
+* GLR Semantic Actions:: Deferred semantic actions have special concerns. +* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. @end menu @node Simple GLR Parsers @@ -1376,15 +1381,15 @@ languages are written the same way. You can copy these examples into a source file to try them. @menu -* RPN Calc:: Reverse polish notation calculator; - a first example with no operator precedence. -* Infix Calc:: Infix (algebraic) notation calculator. - Operator precedence is introduced. +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. * Simple Error Recovery:: Continuing after syntax errors. * Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$. -* Multi-function Calc:: Calculator with memory and trig functions. - It uses multiple data-types for semantic values. -* Exercises:: Ideas for improving the multi-function calculator. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. +* Exercises:: Ideas for improving the multi-function calculator. @end menu @node RPN Calc @@ -1403,16 +1408,16 @@ The source code for this calculator is named @file{rpcalc.y}. The @samp{.y} extension is a convention used for Bison input files. @menu -* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc. -* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. -* Lexer: Rpcalc Lexer. The lexical analyzer. -* Main: Rpcalc Main. The controlling function. -* Error: Rpcalc Error. The error reporting function. -* Gen: Rpcalc Gen. Running Bison on the grammar file. -* Comp: Rpcalc Compile. Run the C compiler on the output code. +* Rpcalc Declarations:: Prologue (declarations) for rpcalc. +* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation. +* Rpcalc Lexer:: The lexical analyzer. +* Rpcalc Main:: The controlling function. 
+* Rpcalc Error:: The error reporting function. +* Rpcalc Generate:: Running Bison on the grammar file. +* Rpcalc Compile:: Run the C compiler on the output code. @end menu -@node Rpcalc Decls +@node Rpcalc Declarations @subsection Declarations for @code{rpcalc} Here are the C and Bison declarations for the reverse polish notation @@ -1662,7 +1667,7 @@ therefore, @code{NUM} becomes a macro for @code{yylex} to use. The semantic value of the token (if it has one) is stored into the global variable @code{yylval}, which is where the Bison parser will look for it. (The C data type of @code{yylval} is @code{YYSTYPE}, which was -defined at the beginning of the grammar; @pxref{Rpcalc Decls, +defined at the beginning of the grammar; @pxref{Rpcalc Declarations, ,Declarations for @code{rpcalc}}.) A token type code of zero is returned if the end-of-input is encountered. @@ -1758,7 +1763,7 @@ have not written any error rules in this example, so any invalid input will cause the calculator program to exit. This is not clean behavior for a real calculator, but it is adequate for the first example. -@node Rpcalc Gen +@node Rpcalc Generate @subsection Running Bison to Make the Parser @cindex running Bison (introduction) @@ -1858,8 +1863,8 @@ parentheses nested to arbitrary depth. Here is the Bison code for %token NUM %left '-' '+' %left '*' '/' -%left NEG /* negation--unary minus */ -%right '^' /* exponentiation */ +%precedence NEG /* negation--unary minus */ +%right '^' /* exponentiation */ %% /* The grammar follows. */ input: /* empty */ @@ -1892,15 +1897,16 @@ In the second section (Bison declarations), @code{%left} declares token types and says they are left-associative operators. The declarations @code{%left} and @code{%right} (right associativity) take the place of @code{%token} which is used to declare a token type name without -associativity. (These tokens are single-character literals, which +associativity/precedence. 
(These tokens are single-character literals, which ordinarily don't need to be declared. We declare them here to specify -the associativity.) +the associativity/precedence.) Operator precedence is determined by the line ordering of the declarations; the higher the line number of the declaration (lower on the page or screen), the higher the precedence. Hence, exponentiation has the highest precedence, unary minus (@code{NEG}) is next, followed -by @samp{*} and @samp{/}, and so on. @xref{Precedence, ,Operator +by @samp{*} and @samp{/}, and so on. Unary minus is not associative, +only precedence matters (@code{%precedence}. @xref{Precedence, ,Operator Precedence}. The other important new feature is the @code{%prec} in the grammar @@ -1977,12 +1983,12 @@ most of the work needed to use locations will be done in the lexical analyzer. @menu -* Decls: Ltcalc Decls. Bison and C declarations for ltcalc. -* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations. -* Lexer: Ltcalc Lexer. The lexical analyzer. +* Ltcalc Declarations:: Bison and C declarations for ltcalc. +* Ltcalc Rules:: Grammar rules for ltcalc, with explanations. +* Ltcalc Lexer:: The lexical analyzer. @end menu -@node Ltcalc Decls +@node Ltcalc Declarations @subsection Declarations for @code{ltcalc} The C and Bison declarations for the location tracking calculator are @@ -2003,7 +2009,7 @@ the same as the declarations for the infix notation calculator. %left '-' '+' %left '*' '/' -%left NEG +%precedence NEG %right '^' %% /* The grammar follows. */ @@ -2218,12 +2224,12 @@ $ Note that multiple assignment and nested function calls are permitted. @menu -* Decl: Mfcalc Decl. Bison declarations for multi-function calculator. -* Rules: Mfcalc Rules. Grammar rules for the calculator. -* Symtab: Mfcalc Symtab. Symbol table management subroutines. +* Mfcalc Declarations:: Bison declarations for multi-function calculator. +* Mfcalc Rules:: Grammar rules for the calculator. 
+* Mfcalc Symbol Table:: Symbol table management subroutines. @end menu -@node Mfcalc Decl +@node Mfcalc Declarations @subsection Declarations for @code{mfcalc} Here are the C and Bison declarations for the multi-function calculator. @@ -2251,8 +2257,8 @@ Here are the C and Bison declarations for the multi-function calculator. %right '=' %left '-' '+' %left '*' '/' -%left NEG /* negation--unary minus */ -%right '^' /* exponentiation */ +%precedence NEG /* negation--unary minus */ +%right '^' /* exponentiation */ @end group %% /* The grammar follows. */ @end smallexample @@ -2319,7 +2325,7 @@ exp: NUM @{ $$ = $1; @} %% @end smallexample -@node Mfcalc Symtab +@node Mfcalc Symbol Table @subsection The @code{mfcalc} Symbol Table @cindex symbol table example @@ -2632,11 +2638,11 @@ As a @acronym{GNU} extension, @samp{//} introduces a comment that continues until end of line. @menu -* Prologue:: Syntax and usage of the prologue. +* Prologue:: Syntax and usage of the prologue. * Prologue Alternatives:: Syntax and usage of alternatives to the prologue. -* Bison Declarations:: Syntax and usage of the Bison declarations section. -* Grammar Rules:: Syntax and usage of the grammar rules section. -* Epilogue:: Syntax and usage of the epilogue. +* Bison Declarations:: Syntax and usage of the Bison declarations section. +* Grammar Rules:: Syntax and usage of the grammar rules section. +* Epilogue:: Syntax and usage of the epilogue. @end menu @node Prologue @@ -4019,7 +4025,8 @@ Bison will convert this into a @code{#define} directive in the parser, so that the function @code{yylex} (if it is in this file) can use the name @var{name} to stand for this token type's code. -Alternatively, you can use @code{%left}, @code{%right}, or +Alternatively, you can use @code{%left}, @code{%right}, +@code{%precedence}, or @code{%nonassoc} instead of @code{%token}, if you wish to specify associativity and precedence. @xref{Precedence Decl, ,Operator Precedence}. 
@@ -4095,7 +4102,8 @@ of ``$end'': @cindex declaring operator precedence @cindex operator precedence, declaring -Use the @code{%left}, @code{%right} or @code{%nonassoc} declaration to +Use the @code{%left}, @code{%right}, @code{%nonassoc}, or +@code{%precedence} declaration to declare a token and specify its precedence and associativity, all at once. These are called @dfn{precedence declarations}. @xref{Precedence, ,Operator Precedence}, for general information on @@ -4131,6 +4139,10 @@ left-associativity (grouping @var{x} with @var{y} first) and means that @samp{@var{x} @var{op} @var{y} @var{op} @var{z}} is considered a syntax error. +@code{%precedence} gives only precedence to the @var{symbols}, and +defines no associativity at all. Use this to define precedence only, +and leave any potential conflict due to associativity enabled. + @item The precedence of an operator determines how it nests with other operators. All the tokens declared in a single precedence declaration have equal @@ -4838,7 +4850,7 @@ already defined, so that the debugging facilities are compiled. Define a variable to adjust Bison's behavior. The possible choices for @var{variable}, as well as their meanings, depend on the selected target language and/or the parser skeleton (@pxref{Decl -Summary,,%language}). +Summary,,%language}, @pxref{Decl Summary,,%skeleton}). Bison will warn if a @var{variable} is defined multiple times. @@ -5051,6 +5063,9 @@ chosen as if the input file were named @file{@var{prefix}.y}. Specify the programming language for the generated parser. Currently supported languages include C, C++, and Java. @var{language} is case-insensitive. + +This directive is experimental and its effect may be modified in future +releases. @end deffn @deffn {Directive} %locations @@ -5111,10 +5126,10 @@ Require a Version of Bison}. @deffn {Directive} %skeleton "@var{file}" Specify the skeleton to use. -You probably don't need this option unless you are developing Bison. 
-You should use @code{%language} if you want to specify the skeleton for a -different language, because it is clearer and because it will always choose the -correct skeleton for non-deterministic or push parsers. +@c You probably don't need this option unless you are developing Bison. +@c You should use @code{%language} if you want to specify the skeleton for a +@c different language, because it is clearer and because it will always choose the +@c correct skeleton for non-deterministic or push parsers. If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton file in the Bison installation directory. @@ -5218,19 +5233,17 @@ identifier (aside from those in this manual) in an action or in epilogue in the grammar file, you are likely to run into trouble. @menu -* Parser Function:: How to call @code{yyparse} and what it returns. -* Push Parser Function:: How to call @code{yypush_parse} and what it returns. -* Pull Parser Function:: How to call @code{yypull_parse} and what it returns. -* Parser Create Function:: How to call @code{yypstate_new} and what it - returns. -* Parser Delete Function:: How to call @code{yypstate_delete} and what it - returns. -* Lexical:: You must supply a function @code{yylex} - which reads tokens. -* Error Reporting:: You must supply a function @code{yyerror}. -* Action Features:: Special features for use in actions. -* Internationalization:: How to let the parser speak in the user's - native language. +* Parser Function:: How to call @code{yyparse} and what it returns. +* Push Parser Function:: How to call @code{yypush_parse} and what it returns. +* Pull Parser Function:: How to call @code{yypull_parse} and what it returns. +* Parser Create Function:: How to call @code{yypstate_new} and what it returns. +* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns. +* Lexical:: You must supply a function @code{yylex} + which reads tokens. 
+* Error Reporting:: You must supply a function @code{yyerror}. +* Action Features:: Special features for use in actions. +* Internationalization:: How to let the parser speak in the user's + native language. @end menu @node Parser Function @@ -5397,13 +5410,13 @@ that need it. @xref{Invocation, ,Invoking Bison}. @menu * Calling Convention:: How @code{yyparse} calls @code{yylex}. -* Token Values:: How @code{yylex} must return the semantic value - of the token it has read. -* Token Locations:: How @code{yylex} must return the text location - (line number, etc.) of the token, if the - actions want that. -* Pure Calling:: How the calling convention differs - in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). +* Token Values:: How @code{yylex} must return the semantic value + of the token it has read. +* Token Locations:: How @code{yylex} must return the text location + (line number, etc.) of the token, if the + actions want that. +* Pure Calling:: How the calling convention differs in a pure parser + (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). @end menu @node Calling Convention @@ -6046,7 +6059,7 @@ This kind of parser is known in the literature as a bottom-up parser. * Contextual Precedence:: When an operator's precedence depends on context. * Parser States:: The parser is a finite-state-machine with stack. * Reduce/Reduce:: When two rules are applicable in the same situation. -* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. +* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. * Generalized LR Parsing:: Parsing arbitrary context-free grammars. * Memory Management:: What happens when memory is exhausted. How to avoid it. @end menu @@ -6218,7 +6231,8 @@ shift and when to reduce. @menu * Why Precedence:: An example showing why precedence is needed. -* Using Precedence:: How to specify precedence in Bison grammars. +* Using Precedence:: How to specify precedence and associativity. 
+* Precedence Only:: How to specify precedence only. * Precedence Examples:: How these features are used in the previous example. * How Precedence:: How they work. @end menu @@ -6273,8 +6287,9 @@ makes right-associativity. @node Using Precedence @subsection Specifying Operator Precedence @findex %left -@findex %right @findex %nonassoc +@findex %precedence +@findex %right Bison allows you to specify these choices with the operator precedence declarations @code{%left} and @code{%right}. Each such declaration @@ -6284,13 +6299,63 @@ those operators left-associative and the @code{%right} declaration makes them right-associative. A third alternative is @code{%nonassoc}, which declares that it is a syntax error to find the same operator twice ``in a row''. +The last alternative, @code{%precedence}, lets you define only +precedence and no associativity at all. As a result, any +associativity-related conflict that remains will be reported as a +compile-time error. The directive @code{%nonassoc} creates run-time +errors: using the operator in an associative way is a syntax error. The +directive @code{%precedence} creates compile-time errors: an operator +@emph{can} be involved in an associativity-related conflict, contrary to +what the grammar author expected. The relative precedence of different operators is controlled by the -order in which they are declared. The first @code{%left} or -@code{%right} declaration in the file declares the operators whose +order in which they are declared. The first precedence/associativity +declaration in the file declares the operators whose precedence is lowest, the next such declaration declares the operators whose precedence is a little higher, and so on. 
+@node Precedence Only +@subsection Specifying Precedence Only +@findex %precedence + +Since @acronym{POSIX} Yacc defines only @code{%left}, @code{%right}, and +@code{%nonassoc}, which all define precedence and associativity, little +attention is paid to the fact that precedence cannot be defined without +defining associativity. Yet, sometimes, when trying to solve a +conflict, precedence suffices. In such a case, using @code{%left}, +@code{%right}, or @code{%nonassoc} might hide future +(associativity-related) conflicts. + +The dangling @code{else} ambiguity (@pxref{Shift/Reduce, , Shift/Reduce +Conflicts}) can be solved explicitly. This shift/reduce conflict occurs +in the following situation, where the period denotes the current parsing +state: + +@example +if @var{e1} then if @var{e2} then @var{s1} . else @var{s2} +@end example + +The conflict involves the reduction of the rule @samp{IF expr THEN +stmt}, whose precedence is by default that of its last token +(@code{THEN}), and the shifting of the token @code{ELSE}. For the usual +disambiguation (attach the @code{else} to the closest @code{if}), +shifting must be preferred, i.e., the precedence of @code{ELSE} must be +higher than that of @code{THEN}. But neither is expected to be involved +in an associativity-related conflict, which can be specified as follows. + +@example +%precedence THEN +%precedence ELSE +@end example + +The unary minus is another typical example where associativity is +usually over-specified; see @ref{Infix Calc, , Infix Notation +Calculator: @code{calc}}. The @code{%left} directive is traditionally +used to declare the precedence of @code{NEG}, which is more than needed +since it also defines its associativity. 
While this is harmless in the +traditional example, who knows how @code{NEG} might be used in future +evolutions of the grammar@dots{} + @node Precedence Examples @subsection Precedence Examples @@ -6352,8 +6417,8 @@ outlandish at first, but it is really very common. For example, a minus sign typically has a very high precedence as a unary operator, and a somewhat lower precedence (lower than multiplication) as a binary operator. -The Bison precedence declarations, @code{%left}, @code{%right} and -@code{%nonassoc}, can only be used once for a given token; so a token has +The Bison precedence declarations +can only be used once for a given token; so a token has only one precedence declared in this way. For context-dependent precedence, you need to use an additional mechanism: the @code{%prec} modifier for rules. @@ -7650,7 +7715,7 @@ standard I/O stream, the numeric code for the token type, and the token value (from @code{yylval}). Here is an example of @code{YYPRINT} suitable for the multi-function -calculator (@pxref{Mfcalc Decl, ,Declarations for @code{mfcalc}}): +calculator (@pxref{Mfcalc Declarations, ,Declarations for @code{mfcalc}}): @smallexample %@{ @@ -7830,6 +7895,11 @@ In the parser file, define the macro @code{YYDEBUG} to 1 if it is not already defined, so that the debugging facilities are compiled. @xref{Tracing, ,Tracing Your Parser}. +@item -D @var{name}[=@var{value}] +@itemx --define=@var{name}[=@var{value}] +Same as running @samp{%define @var{name} "@var{value}"} (@pxref{Decl +Summary, ,%define}). + @item -L @var{language} @itemx --language=@var{language} Specify the programming language for the generated parser, as if @@ -7837,6 +7907,9 @@ Specify the programming language for the generated parser, as if Summary}). Currently supported languages include C, C++, and Java. @var{language} is case-insensitive. +This option is experimental and its effect may be modified in future +releases. 
+ @item --locations Pretend that @code{%locations} was specified. @xref{Decl Summary}. @@ -7858,10 +7931,10 @@ parser file, treating it as an independent source file in its own right. Specify the skeleton to use, similar to @code{%skeleton} (@pxref{Decl Summary, , Bison Declaration Summary}). -You probably don't need this option unless you are developing Bison. -You should use @option{--language} if you want to specify the skeleton for a -different language, because it is clearer and because it will always -choose the correct skeleton for non-deterministic or push parsers. +@c You probably don't need this option unless you are developing Bison. +@c You should use @option{--language} if you want to specify the skeleton for a +@c different language, because it is clearer and because it will always +@c choose the correct skeleton for non-deterministic or push parsers. If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton file in the Bison installation directory. @@ -7928,7 +8001,7 @@ Specify the @var{file} for the parser file. The other output files' names are constructed from @var{file} as described under the @samp{-v} and @samp{-d} options. -@item -g[@var{file}] +@item -g [@var{file}] @itemx --graph[=@var{file}] Output a graphical representation of the @acronym{LALR}(1) grammar automaton computed by Bison, in @uref{http://www.graphviz.org/, Graphviz} @@ -7937,7 +8010,7 @@ automaton computed by Bison, in @uref{http://www.graphviz.org/, Graphviz} If omitted and the grammar file is @file{foo.y}, the output file will be @file{foo.dot}. -@item -x[@var{file}] +@item -x [@var{file}] @itemx --xml[=@var{file}] Output an XML report of the @acronym{LALR}(1) automaton computed by Bison. @code{@var{file}} is optional. @@ -7950,12 +8023,11 @@ More user feedback will help to stabilize it.) @node Option Cross Key @section Option Cross Key -@c FIXME: How about putting the directives too? 
Here is a list of options, alphabetized by long option, to help you find the corresponding short option. -@multitable {@option{--defines=@var{defines-file}}} {@option{-b @var{file-prefix}XXX}} -@headitem Long Option @tab Short Option +@multitable {@option{--defines=@var{defines-file}}} {@option{-D @var{name}[=@var{value}]}} {@code{%nondeterministic-parser}} +@headitem Long Option @tab Short Option @tab Bison Directive @include cross-options.texi @end multitable @@ -8009,13 +8081,13 @@ int yyparse (void); @node C++ Bison Interface @subsection C++ Bison Interface -@c - %language "C++" +@c - %skeleton "lalr1.cc" @c - Always pure @c - initial action -The C++ @acronym{LALR}(1) parser is selected using the language directive, -@samp{%language "C++"}, or the synonymous command-line option -@option{--language=c++}. +The C++ @acronym{LALR}(1) parser is selected using the skeleton directive, +@samp{%skeleton "lalr1.cc"}, or the synonymous command-line option +@option{--skeleton=lalr1.cc}. @xref{Decl Summary}. When run, @command{bison} will create several entities in the @samp{yy} @@ -8409,7 +8481,7 @@ the grammar for. @comment file: calc++-parser.yy @example -%language "C++" /* -*- C++ -*- */ +%skeleton "lalr1.cc" /* -*- C++ -*- */ %require "@value{VERSION}" %defines %define parser_class_name "calcxx_parser" @@ -8549,6 +8621,7 @@ exp: exp '+' exp @{ $$ = $1 + $3; @} | exp '-' exp @{ $$ = $1 - $3; @} | exp '*' exp @{ $$ = $1 * $3; @} | exp '/' exp @{ $$ = $1 / $3; @} + | '(' exp ')' @{ $$ = $2; @} | "identifier" @{ $$ = driver.variables[*$1]; delete $1; @} | "number" @{ $$ = $1; @}; %% @@ -8653,7 +8726,7 @@ It is convenient to use a typedef to shorten typedef yy::calcxx_parser::token token; %@} /* Convert ints to the actual type of tokens. 
*/ -[-+*/] return yy::calcxx_parser::token_type (yytext[0]); +[-+*/()] return yy::calcxx_parser::token_type (yytext[0]); ":=" return token::ASSIGN; @{int@} @{ errno = 0; @@ -8707,6 +8780,7 @@ The top level file, @file{calc++.cc}, poses no problem. int main (int argc, char *argv[]) @{ + int res = 0; calcxx_driver driver; for (++argv; argv[0]; ++argv) if (*argv == std::string ("-p")) @@ -8715,6 +8789,9 @@ main (int argc, char *argv[]) driver.trace_scanning = true; else if (!driver.parse (*argv)) std::cout << driver.result << std::endl; + else + res = 1; + return res; @} @end example @@ -8722,14 +8799,14 @@ main (int argc, char *argv[]) @section Java Parsers @menu -* Java Bison Interface:: Asking for Java parser generation -* Java Semantic Values:: %type and %token vs. Java -* Java Location Values:: The position and location classes -* Java Parser Interface:: Instantiating and running the parser -* Java Scanner Interface:: Specifying the scanner for the parser -* Java Action Features:: Special features for use in actions. -* Java Differences:: Differences between C/C++ and Java Grammars -* Java Declarations Summary:: List of Bison declarations used with Java +* Java Bison Interface:: Asking for Java parser generation +* Java Semantic Values:: %type and %token vs. Java +* Java Location Values:: The position and location classes +* Java Parser Interface:: Instantiating and running the parser +* Java Scanner Interface:: Specifying the scanner for the parser +* Java Action Features:: Special features for use in actions +* Java Differences:: Differences between C/C++ and Java Grammars +* Java Declarations Summary:: List of Bison declarations used with Java @end menu @node Java Bison Interface @@ -8770,15 +8847,21 @@ No header file can be generated for Java parsers. Do not use the @code{%defines} directive or the @option{-d}/@option{--defines} options. @c FIXME: Possible code change. 
-Currently, support for debugging and verbose errors are always compiled +Currently, support for debugging is always compiled in. Thus the @code{%debug} and @code{%token-table} directives and the @option{-t}/@option{--debug} and @option{-k}/@option{--token-table} options have no effect. This may change in the future to eliminate -unused code in the generated parser, so use @code{%debug} and -@code{%verbose-error} explicitly if needed. Also, in the future the +unused code in the generated parser, so use @code{%debug} explicitly +if needed. Also, in the future the @code{%token-table} directive might enable a public interface to access the token names and codes. +Getting a ``code too large'' error from the Java compiler means the code +hit the 64KB bytecode-per-method limitation of the Java class file format. +Try reducing the amount of code in actions and static initializers; +otherwise, report a bug so that the parser skeleton can be improved. + + @node Java Semantic Values @subsection Java Semantic Values @c - No %union, specify type in %type/%token. @@ -8851,7 +8934,7 @@ The first, inclusive, position of the range, and the first beyond. @end deftypeivar @deftypeop {Constructor} {Location} {} Location (Position @var{loc}) -Create a @code{Location} denoting an empty range located at a given point. +Create a @code{Location} denoting an empty range located at a given point. @end deftypeop @deftypeop {Constructor} {Location} {} Location (Position @var{begin}, Position @var{end}) @@ -8885,6 +8968,8 @@ according to the Java language specification, the name of the @file{.java} file should match the name of the class in this case. Similarly, you can use @code{abstract}, @code{final} and @code{strictfp} with the @code{%define} declaration to add other modifiers to the parser class. +A single @code{%define annotations "@var{annotations}"} directive can +be used to add any number of annotations to the parser class. 
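As a hedged sketch of the class-level declarations described above, a Java grammar file might contain a fragment like the following. The package name and the second annotation are hypothetical placeholders; only the directive names come from this section.

```
/* Fragment of the declarations section of a Java grammar file.
   "org.example.calc" and "@org.example.calc.Internal" are
   hypothetical names used only for illustration.  */
%define package "org.example.calc"
%define final
%define annotations "@Deprecated @org.example.calc.Internal"
```

All annotations go into one string because a single @code{%define annotations} directive carries the whole list.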
 The Java package name of the parser class can be specified using the @code{%define package} directive. The superclass and the implemented @@ -8898,21 +8983,19 @@ these inner class/interface, and the members described in the interface below, all the other members and fields are preceded with a @code{yy} or @code{YY} prefix to avoid clashes with user code. -@c FIXME: The following constants and variables are still undocumented: -@c @code{bisonVersion}, @code{bisonSkeleton} and @code{errorVerbose}. - The parser class can be extended using the @code{%parse-param} directive. Each occurrence of the directive will add a @code{protected final} field to the parser class, and an argument to its constructor, which initialize them automatically. -Token names defined by @code{%token} and the predefined @code{EOF} token -name are added as constant fields to the parser class. - @deftypeop {Constructor} {YYParser} {} YYParser (@var{lex_param}, @dots{}, @var{parse_param}, @dots{}) Build a new parser object with embedded @code{%code lexer}. There are no parameters, unless @code{%parse-param}s and/or @code{%lex-param}s are used. + +Use @code{%code init} for code added to the start of the constructor +body. This is especially useful to initialize superclasses. Use +@code{%define init_throws} to specify any uncaught exceptions. @end deftypeop @deftypeop {Constructor} {YYParser} {} YYParser (Lexer @var{lexer}, @var{parse_param}, @dots{}) @@ -8922,6 +9005,10 @@ additional parameters unless @code{%parse-param}s are used. If the scanner is defined by @code{%code lexer}, this constructor is declared @code{protected} and is called automatically with a scanner created with the correct @code{%lex-param}s. + +Use @code{%code init} for code added to the start of the constructor +body. This is especially useful to initialize superclasses. Use +@code{%define init_throws} to specify any uncaught exceptions. 
@end deftypeop @deftypemethod {YYParser} {boolean} parse () @@ -8929,6 +9016,21 @@ Run the syntactic analysis, and return @code{true} on success, @code{false} otherwise. @end deftypemethod +@deftypemethod {YYParser} {boolean} getErrorVerbose () +@deftypemethodx {YYParser} {void} setErrorVerbose (boolean @var{verbose}) +Get or set the option to produce verbose error messages. These are only +available with the @code{%error-verbose} directive, which also turns on +verbose error messages. +@end deftypemethod + +@deftypemethod {YYParser} {void} yyerror (String @var{msg}) +@deftypemethodx {YYParser} {void} yyerror (Position @var{pos}, String @var{msg}) +@deftypemethodx {YYParser} {void} yyerror (Location @var{loc}, String @var{msg}) +Print an error message using the @code{yyerror} method of the scanner +instance in use. The @code{Location} and @code{Position} parameters are +available only if location tracking is active. +@end deftypemethod + @deftypemethod {YYParser} {boolean} recovering () During the syntactic analysis, return @code{true} if recovering from a syntax error. @@ -8947,6 +9049,11 @@ Get or set the tracing level. Currently its value is either 0, no trace, or nonzero, full tracing. @end deftypemethod +@deftypecv {Constant} {YYParser} {String} {bisonVersion} +@deftypecvx {Constant} {YYParser} {String} {bisonSkeleton} +Identify the Bison version and skeleton used to generate this parser. +@end deftypecv + @node Java Scanner Interface @subsection Java Scanner Interface @@ -8957,7 +9064,9 @@ or nonzero, full tracing. There are two possible ways to interface a Bison-generated Java parser with a scanner: the scanner may be defined by @code{%code lexer}, or defined elsewhere. In either case, the scanner has to implement the -@code{Lexer} inner interface of the parser class. +@code{Lexer} inner interface of the parser class. This interface also +contains constants for all user-defined token names and the predefined +@code{EOF} token. 
 In the first case, the body of the scanner class is placed in @code{%code lexer} blocks. If you want to pass parameters from the @@ -9065,12 +9174,12 @@ Return immediately from the parser, indicating success. @end deffn @deffn {Statement} {return YYERROR;} -Start error recovery without printing an error message. +Start error recovery without printing an error message. @xref{Error Recovery}. @end deffn @deffn {Statement} {return YYFAIL;} -Print an error message and start error recovery. +Print an error message and start error recovery. @xref{Error Recovery}. @end deffn @@ -9081,11 +9190,12 @@ operation. @xref{Error Recovery}. @end deftypefn -@deftypefn {Function} {protected void} yyerror (String msg) -@deftypefnx {Function} {protected void} yyerror (Position pos, String msg) -@deftypefnx {Function} {protected void} yyerror (Location loc, String msg) +@deftypefn {Function} {void} yyerror (String @var{msg}) +@deftypefnx {Function} {void} yyerror (Position @var{pos}, String @var{msg}) +@deftypefnx {Function} {void} yyerror (Location @var{loc}, String @var{msg}) Print an error message using the @code{yyerror} method of the scanner -instance in use. +instance in use. The @code{Location} and @code{Position} parameters are +available only if location tracking is active. @end deftypefn @@ -9201,6 +9311,11 @@ Code inserted just after the @code{package} declaration. @xref{Java Differences}. @end deffn +@deffn {Directive} {%code init} @{ @var{code} @dots{} @} +Code inserted at the beginning of the parser constructor body. +@xref{Java Parser Interface}. +@end deffn + @deffn {Directive} {%code lexer} @{ @var{code} @dots{} @} Code added to the body of an inner lexer class within the parser class. @xref{Java Scanner Interface}. @@ -9213,7 +9328,7 @@ Code (after the second @code{%%}) appended to the end of the file, @end deffn @deffn {Directive} %@{ @var{code} @dots{} %@} -Not supported. Use @code{%code import} instead. +Not supported. Use @code{%code imports} instead. 
@xref{Java Differences}. @end deffn @@ -9222,6 +9337,11 @@ Whether the parser class is declared @code{abstract}. Default is false. @xref{Java Bison Interface}. @end deffn +@deffn {Directive} {%define annotations} "@var{annotations}" +The Java annotations for the parser class. Default is none. +@xref{Java Bison Interface}. +@end deffn + @deffn {Directive} {%define extends} "@var{superclass}" The superclass of the parser class. Default is none. @xref{Java Bison Interface}. @@ -9238,6 +9358,12 @@ Default is none. @xref{Java Bison Interface}. @end deffn +@deffn {Directive} {%define init_throws} "@var{exceptions}" +The exceptions thrown by @code{%code init} from the parser class +constructor. Default is none. +@xref{Java Parser Interface}. +@end deffn + @deffn {Directive} {%define lex_throws} "@var{exceptions}" The exceptions thrown by the @code{yylex} method of the lexer, a comma-separated list. Default is @code{java.io.IOException}. @@ -9850,7 +9976,7 @@ Specify the programming language for the generated parser. @end deffn @deffn {Directive} %left -Bison declaration to assign left associativity to token(s). +Bison declaration to assign precedence and left associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. @end deffn @@ -9885,7 +10011,7 @@ parser file. @xref{Decl Summary}. @end deffn @deffn {Directive} %nonassoc -Bison declaration to assign nonassociativity to token(s). +Bison declaration to assign precedence and nonassociativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. @end deffn @@ -9905,6 +10031,11 @@ Bison declaration to assign a precedence to a specific rule. @xref{Contextual Precedence, ,Context-Dependent Precedence}. @end deffn +@deffn {Directive} %precedence +Bison declaration to assign precedence to token(s), but no associativity. +@xref{Precedence Decl, ,Operator Precedence}. 
+@end deffn + @deffn {Directive} %pure-parser Deprecated version of @code{%define api.pure} (@pxref{Decl Summary, ,%define}), for which Bison is more careful to warn about unreasonable usage. @@ -9916,7 +10047,7 @@ Require a Version of Bison}. @end deffn @deffn {Directive} %right -Bison declaration to assign right associativity to token(s). +Bison declaration to assign precedence and right associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. @end deffn @@ -10351,7 +10482,7 @@ grammatically indivisible. The piece of text it represents is a token. @c LocalWords: akim fn cp syncodeindex vr tp synindex dircategory direntry @c LocalWords: ifset vskip pt filll insertcopying sp ISBN Etienne Suvasa @c LocalWords: ifnottex yyparse detailmenu GLR RPN Calc var Decls Rpcalc -@c LocalWords: rpcalc Lexer Gen Comp Expr ltcalc mfcalc Decl Symtab yylex +@c LocalWords: rpcalc Lexer Expr ltcalc mfcalc yylex @c LocalWords: yyerror pxref LR yylval cindex dfn LALR samp gpl BNF xref @c LocalWords: const int paren ifnotinfo AC noindent emph expr stmt findex @c LocalWords: glr YYSTYPE TYPENAME prog dprec printf decl init stmtMerge