X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/e1145ad8bd1f96458bae65e307c1b55445699883..e3c52a63624c102a41f60bc741294cd3f23ceb89:/doc/bison.texinfo?ds=sidebyside diff --git a/doc/bison.texinfo b/doc/bison.texinfo index 5b023fa9..302bc4a1 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -89,76 +89,76 @@ Cover art by Etienne Suvasa. @menu * Introduction:: * Conditions:: -* Copying:: The @acronym{GNU} General Public License says - how you can copy and share Bison +* Copying:: The @acronym{GNU} General Public License says + how you can copy and share Bison. Tutorial sections: -* Concepts:: Basic concepts for understanding Bison. -* Examples:: Three simple explained examples of using Bison. +* Concepts:: Basic concepts for understanding Bison. +* Examples:: Three simple explained examples of using Bison. Reference sections: -* Grammar File:: Writing Bison declarations and rules. -* Interface:: C-language interface to the parser function @code{yyparse}. -* Algorithm:: How the Bison parser works at run-time. -* Error Recovery:: Writing rules for error recovery. +* Grammar File:: Writing Bison declarations and rules. +* Interface:: C-language interface to the parser function @code{yyparse}. +* Algorithm:: How the Bison parser works at run-time. +* Error Recovery:: Writing rules for error recovery. * Context Dependency:: What to do if your language syntax is too - messy for Bison to handle straightforwardly. -* Debugging:: Understanding or debugging Bison parsers. -* Invocation:: How to run Bison (to produce the parser source file). -* Other Languages:: Creating C++ and Java parsers. -* FAQ:: Frequently Asked Questions -* Table of Symbols:: All the keywords of the Bison language are explained. -* Glossary:: Basic concepts are explained. -* Copying This Manual:: License for copying this manual. -* Index:: Cross-references to the text. + messy for Bison to handle straightforwardly. +* Debugging:: Understanding or debugging Bison parsers. 
+* Invocation:: How to run Bison (to produce the parser source file). +* Other Languages:: Creating C++ and Java parsers. +* FAQ:: Frequently Asked Questions +* Table of Symbols:: All the keywords of the Bison language are explained. +* Glossary:: Basic concepts are explained. +* Copying This Manual:: License for copying this manual. +* Index:: Cross-references to the text. @detailmenu --- The Detailed Node Listing --- The Concepts of Bison -* Language and Grammar:: Languages and context-free grammars, - as mathematical ideas. -* Grammar in Bison:: How we represent grammars for Bison's sake. -* Semantic Values:: Each token or syntactic grouping can have - a semantic value (the value of an integer, - the name of an identifier, etc.). -* Semantic Actions:: Each rule can have an action containing C code. -* GLR Parsers:: Writing parsers for general context-free languages. -* Locations Overview:: Tracking Locations. -* Bison Parser:: What are Bison's input and output, - how is the output used? -* Stages:: Stages in writing and running Bison grammars. -* Grammar Layout:: Overall structure of a Bison grammar file. +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. +* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* GLR Parsers:: Writing parsers for general context-free languages. +* Locations Overview:: Tracking Locations. +* Bison Parser:: What are Bison's input and output, + how is the output used? +* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. Writing @acronym{GLR} Parsers -* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. -* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. 
-* GLR Semantic Actions:: Deferred semantic actions have special concerns. -* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. +* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. +* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. +* GLR Semantic Actions:: Deferred semantic actions have special concerns. +* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. Examples -* RPN Calc:: Reverse polish notation calculator; - a first example with no operator precedence. -* Infix Calc:: Infix (algebraic) notation calculator. - Operator precedence is introduced. +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. * Simple Error Recovery:: Continuing after syntax errors. * Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$. -* Multi-function Calc:: Calculator with memory and trig functions. - It uses multiple data-types for semantic values. -* Exercises:: Ideas for improving the multi-function calculator. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. +* Exercises:: Ideas for improving the multi-function calculator. Reverse Polish Notation Calculator -* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc. -* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. -* Lexer: Rpcalc Lexer. The lexical analyzer. -* Main: Rpcalc Main. The controlling function. -* Error: Rpcalc Error. The error reporting function. -* Gen: Rpcalc Gen. Running Bison on the grammar file. -* Comp: Rpcalc Compile. Run the C compiler on the output code. +* Rpcalc Declarations:: Prologue (declarations) for rpcalc. +* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation. +* Rpcalc Lexer:: The lexical analyzer. +* Rpcalc Main:: The controlling function. 
+* Rpcalc Error:: The error reporting function. +* Rpcalc Generate:: Running Bison on the grammar file. +* Rpcalc Compile:: Run the C compiler on the output code. Grammar Rules for @code{rpcalc} @@ -168,15 +168,15 @@ Grammar Rules for @code{rpcalc} Location Tracking Calculator: @code{ltcalc} -* Decls: Ltcalc Decls. Bison and C declarations for ltcalc. -* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations. -* Lexer: Ltcalc Lexer. The lexical analyzer. +* Ltcalc Declarations:: Bison and C declarations for ltcalc. +* Ltcalc Rules:: Grammar rules for ltcalc, with explanations. +* Ltcalc Lexer:: The lexical analyzer. Multi-Function Calculator: @code{mfcalc} -* Decl: Mfcalc Decl. Bison declarations for multi-function calculator. -* Rules: Mfcalc Rules. Grammar rules for the calculator. -* Symtab: Mfcalc Symtab. Symbol table management subroutines. +* Mfcalc Declarations:: Bison declarations for multi-function calculator. +* Mfcalc Rules:: Grammar rules for the calculator. +* Mfcalc Symbol Table:: Symbol table management subroutines. Bison Grammar Files @@ -191,11 +191,11 @@ Bison Grammar Files Outline of a Bison Grammar -* Prologue:: Syntax and usage of the prologue. +* Prologue:: Syntax and usage of the prologue. * Prologue Alternatives:: Syntax and usage of alternatives to the prologue. -* Bison Declarations:: Syntax and usage of the Bison declarations section. -* Grammar Rules:: Syntax and usage of the grammar rules section. -* Epilogue:: Syntax and usage of the epilogue. +* Bison Declarations:: Syntax and usage of the Bison declarations section. +* Grammar Rules:: Syntax and usage of the grammar rules section. +* Epilogue:: Syntax and usage of the epilogue. Defining Language Semantics @@ -230,24 +230,28 @@ Bison Declarations Parser C-Language Interface -* Parser Function:: How to call @code{yyparse} and what it returns. -* Lexical:: You must supply a function @code{yylex} - which reads tokens. 
-* Error Reporting:: You must supply a function @code{yyerror}. -* Action Features:: Special features for use in actions. -* Internationalization:: How to let the parser speak in the user's - native language. +* Parser Function:: How to call @code{yyparse} and what it returns. +* Push Parser Function:: How to call @code{yypush_parse} and what it returns. +* Pull Parser Function:: How to call @code{yypull_parse} and what it returns. +* Parser Create Function:: How to call @code{yypstate_new} and what it returns. +* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns. +* Lexical:: You must supply a function @code{yylex} + which reads tokens. +* Error Reporting:: You must supply a function @code{yyerror}. +* Action Features:: Special features for use in actions. +* Internationalization:: How to let the parser speak in the user's + native language. The Lexical Analyzer Function @code{yylex} * Calling Convention:: How @code{yyparse} calls @code{yylex}. -* Token Values:: How @code{yylex} must return the semantic value - of the token it has read. -* Token Locations:: How @code{yylex} must return the text location - (line number, etc.) of the token, if the - actions want that. -* Pure Calling:: How the calling convention differs - in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). +* Token Values:: How @code{yylex} must return the semantic value + of the token it has read. +* Token Locations:: How @code{yylex} must return the text location + (line number, etc.) of the token, if the + actions want that. +* Pure Calling:: How the calling convention differs in a pure parser + (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). The Bison Parser Algorithm @@ -257,14 +261,15 @@ The Bison Parser Algorithm * Contextual Precedence:: When an operator's precedence depends on context. * Parser States:: The parser is a finite-state-machine with stack. * Reduce/Reduce:: When two rules are applicable in the same situation. 
-* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. +* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. * Generalized LR Parsing:: Parsing arbitrary context-free grammars. * Memory Management:: What happens when memory is exhausted. How to avoid it. Operator Precedence * Why Precedence:: An example showing why precedence is needed. -* Using Precedence:: How to specify precedence in Bison grammars. +* Using Precedence:: How to specify precedence and associativity. +* Precedence Only:: How to specify precedence only. * Precedence Examples:: How these features are used in the previous example. * How Precedence:: How they work. @@ -311,31 +316,33 @@ A Complete C++ Example Java Parsers -* Java Bison Interface:: Asking for Java parser generation -* Java Semantic Values:: %type and %token vs. Java -* Java Location Values:: The position and location classes -* Java Parser Interface:: Instantiating and running the parser -* Java Scanner Interface:: Java scanners, and pure parsers -* Java Differences:: Differences between C/C++ and Java Grammars +* Java Bison Interface:: Asking for Java parser generation +* Java Semantic Values:: %type and %token vs. Java +* Java Location Values:: The position and location classes +* Java Parser Interface:: Instantiating and running the parser +* Java Scanner Interface:: Specifying the scanner for the parser +* Java Action Features:: Special features for use in actions +* Java Differences:: Differences between C/C++ and Java Grammars +* Java Declarations Summary:: List of Bison declarations used with Java Frequently Asked Questions -* Memory Exhausted:: Breaking the Stack Limits -* How Can I Reset the Parser:: @code{yyparse} Keeps some State -* Strings are Destroyed:: @code{yylval} Loses Track of Strings -* Implementing Gotos/Loops:: Control Flow in the Calculator -* Multiple start-symbols:: Factoring closely related grammars -* Secure? Conform?:: Is Bison @acronym{POSIX} safe? 
-* I can't build Bison:: Troubleshooting -* Where can I find help?:: Troubleshouting -* Bug Reports:: Troublereporting -* Other Languages:: Parsers in Java and others -* Beta Testing:: Experimenting development versions -* Mailing Lists:: Meeting other Bison users +* Memory Exhausted:: Breaking the Stack Limits +* How Can I Reset the Parser:: @code{yyparse} Keeps some State +* Strings are Destroyed:: @code{yylval} Loses Track of Strings +* Implementing Gotos/Loops:: Control Flow in the Calculator +* Multiple start-symbols:: Factoring closely related grammars +* Secure? Conform?:: Is Bison @acronym{POSIX} safe? +* I can't build Bison:: Troubleshooting +* Where can I find help?:: Troubleshouting +* Bug Reports:: Troublereporting +* More Languages:: Parsers in C++, Java, and so on +* Beta Testing:: Experimenting development versions +* Mailing Lists:: Meeting other Bison users Copying This Manual -* Copying This Manual:: License for copying this manual. +* Copying This Manual:: License for copying this manual. @end detailmenu @end menu @@ -415,19 +422,19 @@ details of Bison will not make sense. If you do not already know how to use Bison or Yacc, we suggest you start by reading this chapter carefully. @menu -* Language and Grammar:: Languages and context-free grammars, - as mathematical ideas. -* Grammar in Bison:: How we represent grammars for Bison's sake. -* Semantic Values:: Each token or syntactic grouping can have - a semantic value (the value of an integer, - the name of an identifier, etc.). -* Semantic Actions:: Each rule can have an action containing C code. -* GLR Parsers:: Writing parsers for general context-free languages. -* Locations Overview:: Tracking Locations. -* Bison Parser:: What are Bison's input and output, - how is the output used? -* Stages:: Stages in writing and running Bison grammars. -* Grammar Layout:: Overall structure of a Bison grammar file. +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. 
+* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* GLR Parsers:: Writing parsers for general context-free languages. +* Locations Overview:: Tracking Locations. +* Bison Parser:: What are Bison's input and output, + how is the output used? +* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. @end menu @node Language and Grammar @@ -743,10 +750,10 @@ user-defined function on the resulting values to produce an arbitrary merged result. @menu -* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. -* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. -* GLR Semantic Actions:: Deferred semantic actions have special concerns. -* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. +* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. +* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. +* GLR Semantic Actions:: Deferred semantic actions have special concerns. +* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. @end menu @node Simple GLR Parsers @@ -1374,15 +1381,15 @@ languages are written the same way. You can copy these examples into a source file to try them. @menu -* RPN Calc:: Reverse polish notation calculator; - a first example with no operator precedence. -* Infix Calc:: Infix (algebraic) notation calculator. - Operator precedence is introduced. +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. * Simple Error Recovery:: Continuing after syntax errors. 
* Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$. -* Multi-function Calc:: Calculator with memory and trig functions. - It uses multiple data-types for semantic values. -* Exercises:: Ideas for improving the multi-function calculator. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. +* Exercises:: Ideas for improving the multi-function calculator. @end menu @node RPN Calc @@ -1401,16 +1408,16 @@ The source code for this calculator is named @file{rpcalc.y}. The @samp{.y} extension is a convention used for Bison input files. @menu -* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc. -* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. -* Lexer: Rpcalc Lexer. The lexical analyzer. -* Main: Rpcalc Main. The controlling function. -* Error: Rpcalc Error. The error reporting function. -* Gen: Rpcalc Gen. Running Bison on the grammar file. -* Comp: Rpcalc Compile. Run the C compiler on the output code. +* Rpcalc Declarations:: Prologue (declarations) for rpcalc. +* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation. +* Rpcalc Lexer:: The lexical analyzer. +* Rpcalc Main:: The controlling function. +* Rpcalc Error:: The error reporting function. +* Rpcalc Generate:: Running Bison on the grammar file. +* Rpcalc Compile:: Run the C compiler on the output code. @end menu -@node Rpcalc Decls +@node Rpcalc Declarations @subsection Declarations for @code{rpcalc} Here are the C and Bison declarations for the reverse polish notation @@ -1660,7 +1667,7 @@ therefore, @code{NUM} becomes a macro for @code{yylex} to use. The semantic value of the token (if it has one) is stored into the global variable @code{yylval}, which is where the Bison parser will look for it. 
(The C data type of @code{yylval} is @code{YYSTYPE}, which was -defined at the beginning of the grammar; @pxref{Rpcalc Decls, +defined at the beginning of the grammar; @pxref{Rpcalc Declarations, ,Declarations for @code{rpcalc}}.) A token type code of zero is returned if the end-of-input is encountered. @@ -1756,7 +1763,7 @@ have not written any error rules in this example, so any invalid input will cause the calculator program to exit. This is not clean behavior for a real calculator, but it is adequate for the first example. -@node Rpcalc Gen +@node Rpcalc Generate @subsection Running Bison to Make the Parser @cindex running Bison (introduction) @@ -1856,8 +1863,8 @@ parentheses nested to arbitrary depth. Here is the Bison code for %token NUM %left '-' '+' %left '*' '/' -%left NEG /* negation--unary minus */ -%right '^' /* exponentiation */ +%precedence NEG /* negation--unary minus */ +%right '^' /* exponentiation */ %% /* The grammar follows. */ input: /* empty */ @@ -1890,15 +1897,16 @@ In the second section (Bison declarations), @code{%left} declares token types and says they are left-associative operators. The declarations @code{%left} and @code{%right} (right associativity) take the place of @code{%token} which is used to declare a token type name without -associativity. (These tokens are single-character literals, which +associativity/precedence. (These tokens are single-character literals, which ordinarily don't need to be declared. We declare them here to specify -the associativity.) +the associativity/precedence.) Operator precedence is determined by the line ordering of the declarations; the higher the line number of the declaration (lower on the page or screen), the higher the precedence. Hence, exponentiation has the highest precedence, unary minus (@code{NEG}) is next, followed -by @samp{*} and @samp{/}, and so on. @xref{Precedence, ,Operator +by @samp{*} and @samp{/}, and so on. 
Unary minus is not associative; +only its precedence matters (@code{%precedence}). @xref{Precedence, ,Operator Precedence}. The other important new feature is the @code{%prec} in the grammar @@ -1975,12 +1983,12 @@ most of the work needed to use locations will be done in the lexical analyzer. @menu -* Decls: Ltcalc Decls. Bison and C declarations for ltcalc. -* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations. -* Lexer: Ltcalc Lexer. The lexical analyzer. +* Ltcalc Declarations:: Bison and C declarations for ltcalc. +* Ltcalc Rules:: Grammar rules for ltcalc, with explanations. +* Ltcalc Lexer:: The lexical analyzer. @end menu -@node Ltcalc Decls +@node Ltcalc Declarations @subsection Declarations for @code{ltcalc} The C and Bison declarations for the location tracking calculator are @@ -2001,7 +2009,7 @@ the same as the declarations for the infix notation calculator. %left '-' '+' %left '*' '/' -%left NEG +%precedence NEG %right '^' %% /* The grammar follows. */ @@ -2216,12 +2224,12 @@ $ Note that multiple assignment and nested function calls are permitted. @menu -* Decl: Mfcalc Decl. Bison declarations for multi-function calculator. -* Rules: Mfcalc Rules. Grammar rules for the calculator. -* Symtab: Mfcalc Symtab. Symbol table management subroutines. +* Mfcalc Declarations:: Bison declarations for multi-function calculator. +* Mfcalc Rules:: Grammar rules for the calculator. +* Mfcalc Symbol Table:: Symbol table management subroutines. @end menu -@node Mfcalc Decl +@node Mfcalc Declarations @subsection Declarations for @code{mfcalc} Here are the C and Bison declarations for the multi-function calculator. @@ -2249,8 +2257,8 @@ Here are the C and Bison declarations for the multi-function calculator. %right '=' %left '-' '+' %left '*' '/' -%left NEG /* negation--unary minus */ -%right '^' /* exponentiation */ +%precedence NEG /* negation--unary minus */ +%right '^' /* exponentiation */ @end group %% /* The grammar follows. 
*/ @end smallexample @@ -2317,7 +2325,7 @@ exp: NUM @{ $$ = $1; @} %% @end smallexample -@node Mfcalc Symtab +@node Mfcalc Symbol Table @subsection The @code{mfcalc} Symbol Table @cindex symbol table example @@ -2630,11 +2638,11 @@ As a @acronym{GNU} extension, @samp{//} introduces a comment that continues until end of line. @menu -* Prologue:: Syntax and usage of the prologue. +* Prologue:: Syntax and usage of the prologue. * Prologue Alternatives:: Syntax and usage of alternatives to the prologue. -* Bison Declarations:: Syntax and usage of the Bison declarations section. -* Grammar Rules:: Syntax and usage of the grammar rules section. -* Epilogue:: Syntax and usage of the epilogue. +* Bison Declarations:: Syntax and usage of the Bison declarations section. +* Grammar Rules:: Syntax and usage of the grammar rules section. +* Epilogue:: Syntax and usage of the epilogue. @end menu @node Prologue @@ -4017,7 +4025,8 @@ Bison will convert this into a @code{#define} directive in the parser, so that the function @code{yylex} (if it is in this file) can use the name @var{name} to stand for this token type's code. -Alternatively, you can use @code{%left}, @code{%right}, or +Alternatively, you can use @code{%left}, @code{%right}, +@code{%precedence}, or @code{%nonassoc} instead of @code{%token}, if you wish to specify associativity and precedence. @xref{Precedence Decl, ,Operator Precedence}. @@ -4093,7 +4102,8 @@ of ``$end'': @cindex declaring operator precedence @cindex operator precedence, declaring -Use the @code{%left}, @code{%right} or @code{%nonassoc} declaration to +Use the @code{%left}, @code{%right}, @code{%nonassoc}, or +@code{%precedence} declaration to declare a token and specify its precedence and associativity, all at once. These are called @dfn{precedence declarations}. 
@xref{Precedence, ,Operator Precedence}, for general information on @@ -4129,6 +4139,10 @@ left-associativity (grouping @var{x} with @var{y} first) and means that @samp{@var{x} @var{op} @var{y} @var{op} @var{z}} is considered a syntax error. +@code{%precedence} gives only precedence to the @var{symbols}, and +defines no associativity at all. Use this to define precedence only, +and leave any potential conflict due to associativity enabled. + @item The precedence of an operator determines how it nests with other operators. All the tokens declared in a single precedence declaration have equal @@ -4836,7 +4850,7 @@ already defined, so that the debugging facilities are compiled. Define a variable to adjust Bison's behavior. The possible choices for @var{variable}, as well as their meanings, depend on the selected target language and/or the parser skeleton (@pxref{Decl -Summary,,%language}). +Summary,,%language}, @pxref{Decl Summary,,%skeleton}). Bison will warn if a @var{variable} is defined multiple times. @@ -5049,6 +5063,9 @@ chosen as if the input file were named @file{@var{prefix}.y}. Specify the programming language for the generated parser. Currently supported languages include C, C++, and Java. @var{language} is case-insensitive. + +This directive is experimental and its effect may be modified in future +releases. @end deffn @deffn {Directive} %locations @@ -5109,10 +5126,10 @@ Require a Version of Bison}. @deffn {Directive} %skeleton "@var{file}" Specify the skeleton to use. -You probably don't need this option unless you are developing Bison. -You should use @code{%language} if you want to specify the skeleton for a -different language, because it is clearer and because it will always choose the -correct skeleton for non-deterministic or push parsers. +@c You probably don't need this option unless you are developing Bison. 
+@c You should use @code{%language} if you want to specify the skeleton for a +@c different language, because it is clearer and because it will always choose the +@c correct skeleton for non-deterministic or push parsers. If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton file in the Bison installation directory. @@ -5216,19 +5233,17 @@ identifier (aside from those in this manual) in an action or in epilogue in the grammar file, you are likely to run into trouble. @menu -* Parser Function:: How to call @code{yyparse} and what it returns. -* Push Parser Function:: How to call @code{yypush_parse} and what it returns. -* Pull Parser Function:: How to call @code{yypull_parse} and what it returns. -* Parser Create Function:: How to call @code{yypstate_new} and what it - returns. -* Parser Delete Function:: How to call @code{yypstate_delete} and what it - returns. -* Lexical:: You must supply a function @code{yylex} - which reads tokens. -* Error Reporting:: You must supply a function @code{yyerror}. -* Action Features:: Special features for use in actions. -* Internationalization:: How to let the parser speak in the user's - native language. +* Parser Function:: How to call @code{yyparse} and what it returns. +* Push Parser Function:: How to call @code{yypush_parse} and what it returns. +* Pull Parser Function:: How to call @code{yypull_parse} and what it returns. +* Parser Create Function:: How to call @code{yypstate_new} and what it returns. +* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns. +* Lexical:: You must supply a function @code{yylex} + which reads tokens. +* Error Reporting:: You must supply a function @code{yyerror}. +* Action Features:: Special features for use in actions. +* Internationalization:: How to let the parser speak in the user's + native language. @end menu @node Parser Function @@ -5395,13 +5410,13 @@ that need it. @xref{Invocation, ,Invoking Bison}. 
@menu * Calling Convention:: How @code{yyparse} calls @code{yylex}. -* Token Values:: How @code{yylex} must return the semantic value - of the token it has read. -* Token Locations:: How @code{yylex} must return the text location - (line number, etc.) of the token, if the - actions want that. -* Pure Calling:: How the calling convention differs - in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). +* Token Values:: How @code{yylex} must return the semantic value + of the token it has read. +* Token Locations:: How @code{yylex} must return the text location + (line number, etc.) of the token, if the + actions want that. +* Pure Calling:: How the calling convention differs in a pure parser + (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). @end menu @node Calling Convention @@ -6044,7 +6059,7 @@ This kind of parser is known in the literature as a bottom-up parser. * Contextual Precedence:: When an operator's precedence depends on context. * Parser States:: The parser is a finite-state-machine with stack. * Reduce/Reduce:: When two rules are applicable in the same situation. -* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. +* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. * Generalized LR Parsing:: Parsing arbitrary context-free grammars. * Memory Management:: What happens when memory is exhausted. How to avoid it. @end menu @@ -6216,7 +6231,8 @@ shift and when to reduce. @menu * Why Precedence:: An example showing why precedence is needed. -* Using Precedence:: How to specify precedence in Bison grammars. +* Using Precedence:: How to specify precedence and associativity. +* Precedence Only:: How to specify precedence only. * Precedence Examples:: How these features are used in the previous example. * How Precedence:: How they work. @end menu @@ -6271,8 +6287,9 @@ makes right-associativity. 
@node Using Precedence @subsection Specifying Operator Precedence @findex %left -@findex %right @findex %nonassoc +@findex %precedence +@findex %right Bison allows you to specify these choices with the operator precedence declarations @code{%left} and @code{%right}. Each such declaration @@ -6282,13 +6299,63 @@ those operators left-associative and the @code{%right} declaration makes them right-associative. A third alternative is @code{%nonassoc}, which declares that it is a syntax error to find the same operator twice ``in a row''. +The last alternative, @code{%precedence}, lets you define only +precedence and no associativity at all. As a result, any +associativity-related conflict that remains will be reported as a +compile-time error. The directive @code{%nonassoc} creates run-time +errors: using the operator in an associative way is a syntax error. The +directive @code{%precedence} creates compile-time errors: an operator +@emph{can} be involved in an associativity-related conflict, contrary to +what the grammar author expected. The relative precedence of different operators is controlled by the -order in which they are declared. The first @code{%left} or -@code{%right} declaration in the file declares the operators whose +order in which they are declared. The first precedence/associativity +declaration in the file declares the operators whose precedence is lowest, the next such declaration declares the operators whose precedence is a little higher, and so on. +@node Precedence Only +@subsection Specifying Precedence Only +@findex %precedence + +Since @acronym{POSIX} Yacc defines only @code{%left}, @code{%right}, and +@code{%nonassoc}, which all define both precedence and associativity, little +attention is paid to the fact that precedence cannot be defined without +defining associativity. Yet, sometimes, when trying to solve a +conflict, precedence suffices. 
In such a case, using @code{%left}, +@code{%right}, or @code{%nonassoc} might hide future +(associativity-related) conflicts that would otherwise be reported. + +The dangling @code{else} ambiguity (@pxref{Shift/Reduce, , Shift/Reduce +Conflicts}) can be solved explicitly. This shift/reduce conflict occurs +in the following situation, where the period denotes the current parsing +state: + +@example +if @var{e1} then if @var{e2} then @var{s1} . else @var{s2} +@end example + +The conflict involves the reduction of the rule @samp{IF expr THEN +stmt}, whose precedence is by default that of its last token +(@code{THEN}), and the shifting of the token @code{ELSE}. For the usual +disambiguation (attach the @code{else} to the closest @code{if}), +shifting must be preferred, i.e., the precedence of @code{ELSE} must be +higher than that of @code{THEN}. But neither is expected to be involved +in an associativity-related conflict, which can be specified as follows. + +@example +%precedence THEN +%precedence ELSE +@end example + +The unary-minus is another typical example where associativity is +usually over-specified, see @ref{Infix Calc, , Infix Notation +Calculator: @code{calc}}. The @code{%left} directive is traditionally +used to declare the precedence of @code{NEG}, which is more than needed +since it also defines its associativity. While this is harmless in the +traditional example, who knows how @code{NEG} might be used in future +evolutions of the grammar@dots{} + @node Precedence Examples @subsection Precedence Examples @@ -6350,8 +6417,8 @@ outlandish at first, but it is really very common. For example, a minus sign typically has a very high precedence as a unary operator, and a somewhat lower precedence (lower than multiplication) as a binary operator. 
-The Bison precedence declarations, @code{%left}, @code{%right} and -@code{%nonassoc}, can only be used once for a given token; so a token has +The Bison precedence declarations +can only be used once for a given token; so a token has only one precedence declared in this way. For context-dependent precedence, you need to use an additional mechanism: the @code{%prec} modifier for rules. @@ -7648,7 +7715,7 @@ standard I/O stream, the numeric code for the token type, and the token value (from @code{yylval}). Here is an example of @code{YYPRINT} suitable for the multi-function -calculator (@pxref{Mfcalc Decl, ,Declarations for @code{mfcalc}}): +calculator (@pxref{Mfcalc Declarations, ,Declarations for @code{mfcalc}}): @smallexample %@{ @@ -7828,6 +7895,11 @@ In the parser file, define the macro @code{YYDEBUG} to 1 if it is not already defined, so that the debugging facilities are compiled. @xref{Tracing, ,Tracing Your Parser}. +@item -D @var{name}[=@var{value}] +@itemx --define=@var{name}[=@var{value}] +Same as running @samp{%define @var{name} "@var{value}"} (@pxref{Decl +Summary, ,%define}). + @item -L @var{language} @itemx --language=@var{language} Specify the programming language for the generated parser, as if @@ -7835,6 +7907,9 @@ Specify the programming language for the generated parser, as if Summary}). Currently supported languages include C, C++, and Java. @var{language} is case-insensitive. +This option is experimental and its effect may be modified in future +releases. + @item --locations Pretend that @code{%locations} was specified. @xref{Decl Summary}. @@ -7856,10 +7931,10 @@ parser file, treating it as an independent source file in its own right. Specify the skeleton to use, similar to @code{%skeleton} (@pxref{Decl Summary, , Bison Declaration Summary}). -You probably don't need this option unless you are developing Bison. 
-You should use @option{--language} if you want to specify the skeleton for a -different language, because it is clearer and because it will always -choose the correct skeleton for non-deterministic or push parsers. +@c You probably don't need this option unless you are developing Bison. +@c You should use @option{--language} if you want to specify the skeleton for a +@c different language, because it is clearer and because it will always +@c choose the correct skeleton for non-deterministic or push parsers. If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton file in the Bison installation directory. @@ -7926,7 +8001,7 @@ Specify the @var{file} for the parser file. The other output files' names are constructed from @var{file} as described under the @samp{-v} and @samp{-d} options. -@item -g[@var{file}] +@item -g [@var{file}] @itemx --graph[=@var{file}] Output a graphical representation of the @acronym{LALR}(1) grammar automaton computed by Bison, in @uref{http://www.graphviz.org/, Graphviz} @@ -7935,7 +8010,7 @@ automaton computed by Bison, in @uref{http://www.graphviz.org/, Graphviz} If omitted and the grammar file is @file{foo.y}, the output file will be @file{foo.dot}. -@item -x[@var{file}] +@item -x [@var{file}] @itemx --xml[=@var{file}] Output an XML report of the @acronym{LALR}(1) automaton computed by Bison. @code{@var{file}} is optional. @@ -7948,12 +8023,11 @@ More user feedback will help to stabilize it.) @node Option Cross Key @section Option Cross Key -@c FIXME: How about putting the directives too? Here is a list of options, alphabetized by long option, to help you find the corresponding short option. 
-@multitable {@option{--defines=@var{defines-file}}} {@option{-b @var{file-prefix}XXX}} -@headitem Long Option @tab Short Option +@multitable {@option{--defines=@var{defines-file}}} {@option{-D @var{name}[=@var{value}]}} {@code{%nondeterministic-parser}} +@headitem Long Option @tab Short Option @tab Bison Directive @include cross-options.texi @end multitable @@ -8007,13 +8081,13 @@ int yyparse (void); @node C++ Bison Interface @subsection C++ Bison Interface -@c - %language "C++" +@c - %skeleton "lalr1.cc" @c - Always pure @c - initial action -The C++ @acronym{LALR}(1) parser is selected using the language directive, -@samp{%language "C++"}, or the synonymous command-line option -@option{--language=c++}. +The C++ @acronym{LALR}(1) parser is selected using the skeleton directive, +@samp{%skeleton "lalr1.cc"}, or the synonymous command-line option +@option{--skeleton=lalr1.cc}. @xref{Decl Summary}. When run, @command{bison} will create several entities in the @samp{yy} @@ -8407,7 +8481,7 @@ the grammar for. @comment file: calc++-parser.yy @example -%language "C++" /* -*- C++ -*- */ +%skeleton "lalr1.cc" /* -*- C++ -*- */ %require "@value{VERSION}" %defines %define parser_class_name "calcxx_parser" @@ -8547,6 +8621,7 @@ exp: exp '+' exp @{ $$ = $1 + $3; @} | exp '-' exp @{ $$ = $1 - $3; @} | exp '*' exp @{ $$ = $1 * $3; @} | exp '/' exp @{ $$ = $1 / $3; @} + | '(' exp ')' @{ $$ = $2; @} | "identifier" @{ $$ = driver.variables[*$1]; delete $1; @} | "number" @{ $$ = $1; @}; %% @@ -8651,7 +8726,7 @@ It is convenient to use a typedef to shorten typedef yy::calcxx_parser::token token; %@} /* Convert ints to the actual type of tokens. */ -[-+*/] return yy::calcxx_parser::token_type (yytext[0]); +[-+*/()] return yy::calcxx_parser::token_type (yytext[0]); ":=" return token::ASSIGN; @{int@} @{ errno = 0; @@ -8705,6 +8780,7 @@ The top level file, @file{calc++.cc}, poses no problem.
int main (int argc, char *argv[]) @{ + int res = 0; calcxx_driver driver; for (++argv; argv[0]; ++argv) if (*argv == std::string ("-p")) @@ -8713,6 +8789,9 @@ main (int argc, char *argv[]) driver.trace_scanning = true; else if (!driver.parse (*argv)) std::cout << driver.result << std::endl; + else + res = 1; + return res; @} @end example @@ -8720,53 +8799,68 @@ main (int argc, char *argv[]) @section Java Parsers @menu -* Java Bison Interface:: Asking for Java parser generation -* Java Semantic Values:: %type and %token vs. Java -* Java Location Values:: The position and location classes -* Java Parser Interface:: Instantiating and running the parser -* Java Scanner Interface:: Java scanners, and pure parsers -* Java Differences:: Differences between C/C++ and Java Grammars +* Java Bison Interface:: Asking for Java parser generation +* Java Semantic Values:: %type and %token vs. Java +* Java Location Values:: The position and location classes +* Java Parser Interface:: Instantiating and running the parser +* Java Scanner Interface:: Specifying the scanner for the parser +* Java Action Features:: Special features for use in actions +* Java Differences:: Differences between C/C++ and Java Grammars +* Java Declarations Summary:: List of Bison declarations used with Java @end menu @node Java Bison Interface @subsection Java Bison Interface @c - %language "Java" -@c - initial action (The current Java interface is experimental and may evolve. More user feedback will help to stabilize it.) -The Java parser skeletons are selected using a language directive, -@samp{%language "Java"}, or the synonymous command-line option -@option{--language=java}. +The Java parser skeletons are selected using the @code{%language "Java"} +directive or the @option{-L java}/@option{--language=java} option. -When run, @command{bison} will create several entities whose name -starts with @samp{YY}. 
Use the @samp{%name-prefix} directive to -change the prefix, see @ref{Decl Summary}; classes can be placed -in an arbitrary Java package using a @samp{%define package} section. +@c FIXME: Documented bug. +When generating a Java parser, @code{bison @var{basename}.y} will create +a single Java source file named @file{@var{basename}.java}. Using an +input file without a @file{.y} suffix is currently broken. The basename +of the output file can be changed by the @code{%file-prefix} directive +or the @option{-b}/@option{--file-prefix} option. The entire output file +name can be changed by the @code{%output} directive or the +@option{-o}/@option{--output} option. The output file contains a single +class for the parser. -The parser class defines an inner class, @code{Location}, that is used -for location tracking. If the parser is pure, it also defines an -inner interface, @code{Lexer}; see @ref{Java Scanner Interface} for the -meaning of pure parsers when the Java language is chosen. Other than -these inner class/interface, and the members described in @ref{Java -Parser Interface}, all the other members and fields are preceded -with a @code{yy} prefix to avoid clashes with user code. - -No header file can be generated for Java parsers; you must not pass -@option{-d}/@option{--defines} to @command{bison}, nor use the -@samp{%defines} directive. +You can create documentation for generated parsers using Javadoc. -By default, the @samp{YYParser} class has package visibility. A -declaration @samp{%define "public"} will change to public visibility. -Remember that, according to the Java language specification, the name -of the @file{.java} file should match the name of the class in this -case. +Contrary to C parsers, Java parsers do not use global variables; the +state of the parser is always local to an instance of the parser class.
+Therefore, all Java parsers are ``pure'', and the @code{%pure-parser} +and @code{%define api.pure} directives do nothing when used in +Java. -Similarly, a declaration @samp{%define "abstract"} will make your -class abstract. +Push parsers are currently unsupported in Java and @code{%define +api.push_pull} has no effect. + +@acronym{GLR} parsers are currently unsupported in Java. Do not use the +@code{%glr-parser} directive. + +No header file can be generated for Java parsers. Do not use the +@code{%defines} directive or the @option{-d}/@option{--defines} option. + +@c FIXME: Possible code change. +Currently, support for debugging is always compiled +in. Thus the @code{%debug} and @code{%token-table} directives and the +@option{-t}/@option{--debug} and @option{-k}/@option{--token-table} +options have no effect. This may change in the future to eliminate +unused code in the generated parser, so use @code{%debug} explicitly +if needed. Also, in the future the +@code{%token-table} directive might enable a public interface to +access the token names and codes. + +Getting a ``code too large'' error from the Java compiler means the code +hit the 64KB bytecode-per-method limitation of the Java class file. +Try reducing the amount of code in actions and static initializers; +otherwise, report a bug so that the parser skeleton will be improved. -You can create documentation for generated parsers using Javadoc. @node Java Semantic Values @subsection Java Semantic Values @@ -8786,20 +8880,23 @@ semantic values' types (class names) should be specified in the By default, the semantic stack is declared to have @code{Object} members, which means that the class types you specify can be of any class. To improve the type safety of the parser, you can declare the common -superclass of all the semantic values using the @samp{%define} directive. -For example, after the following declaration: +superclass of all the semantic values using the @code{%define stype} +directive.
For example, after the following declaration: @example -%define "stype" "ASTNode" +%define stype "ASTNode" @end example @noindent any @code{%type} or @code{%token} specifying a semantic type which is not a subclass of ASTNode, will cause a compile-time error. +@c FIXME: Documented bug. Types used in the directives may be qualified with a package name. Primitive data types are accepted for Java version 1.5 or later. Note that in this case the autoboxing feature of Java 1.5 will be used. +Generic types may not be used; this is due to a limitation in the +implementation of Bison, and may change in future releases. Java parsers do not support @code{%destructor}, since the language adopts garbage collection. The parser will try to hold references @@ -8822,20 +8919,29 @@ An auxiliary user-defined class defines a @dfn{position}, a single point in a file; Bison itself defines a class representing a @dfn{location}, a range composed of a pair of positions (possibly spanning several files). The location class is an inner class of the parser; the name -is @code{Location} by default, may also be renamed using @code{%define -"location_type" "@var{class-name}}. +is @code{Location} by default, and may also be renamed using +@code{%define location_type "@var{class-name}"}. The location class treats the position as a completely opaque value. By default, the class name is @code{Position}, but this can be changed -with @code{%define "position_type" "@var{class-name}"}. +with @code{%define position_type "@var{class-name}"}. This class must +be supplied by the user. -@deftypemethod {Location} {Position} begin -@deftypemethodx {Location} {Position} end +@deftypeivar {Location} {Position} begin +@deftypeivarx {Location} {Position} end The first, inclusive, position of the range, and the first beyond. -@end deftypemethod +@end deftypeivar + +@deftypeop {Constructor} {Location} {} Location (Position @var{loc}) +Create a @code{Location} denoting an empty range located at a given point.
+@end deftypeop + +@deftypeop {Constructor} {Location} {} Location (Position @var{begin}, Position @var{end}) +Create a @code{Location} from the endpoints of the range. +@end deftypeop -@deftypemethod {Location} {void} toString () +@deftypemethod {Location} {String} toString () Prints the range represented by the location. For this to work properly, the position class should override the @code{equals} and @code{toString} methods appropriately. @@ -8850,28 +8956,85 @@ properly, the position class should override the @code{equals} and @c debug_stream. @c - Reporting errors -The output file defines the parser class in the package optionally -indicated in the @code{%define package} section. The class name defaults -to @code{YYParser}. The @code{YY} prefix may be changed using -@samp{%name-prefix}; alternatively, you can use @samp{%define -"parser_class_name" "@var{name}"} to give a custom name to the class. -The interface of this class is detailed below. It can be extended using -the @code{%parse-param} directive; each occurrence of the directive will -add a field to the parser class, and an argument to its constructor. +The name of the generated parser class defaults to @code{YYParser}. The +@code{YY} prefix may be changed using the @code{%name-prefix} directive +or the @option{-p}/@option{--name-prefix} option. Alternatively, use +@code{%define parser_class_name "@var{name}"} to give a custom name to +the class. The interface of this class is detailed below. + +By default, the parser class has package visibility. A declaration +@code{%define public} will change to public visibility. Remember that, +according to the Java language specification, the name of the @file{.java} +file should match the name of the class in this case. Similarly, you can +use @code{abstract}, @code{final} and @code{strictfp} with the +@code{%define} declaration to add other modifiers to the parser class. 
+A single @code{%define annotations "@var{annotations}"} directive can +be used to add any number of annotations to the parser class. + +The Java package name of the parser class can be specified using the +@code{%define package} directive. The superclass and the implemented +interfaces of the parser class can be specified with the @code{%define +extends} and @code{%define implements} directives. -@deftypemethod {YYParser} {} YYParser (@var{type1} @var{arg1}, ...) -Build a new parser object. There are no arguments by default, unless -@samp{%parse-param @{@var{type1} @var{arg1}@}} was used. -@end deftypemethod +The parser class defines an inner class, @code{Location}, that is used +for location tracking (see @ref{Java Location Values}), and an inner +interface, @code{Lexer} (see @ref{Java Scanner Interface}). Other than +this inner class and interface, and the members described in the interface +below, all the other members and fields are preceded with a @code{yy} or +@code{YY} prefix to avoid clashes with user code. + +The parser class can be extended using the @code{%parse-param} +directive. Each occurrence of the directive will add a @code{protected +final} field to the parser class, and an argument to its constructor, +which initializes the field automatically. + +@deftypeop {Constructor} {YYParser} {} YYParser (@var{lex_param}, @dots{}, @var{parse_param}, @dots{}) +Build a new parser object with embedded @code{%code lexer}. There are +no parameters, unless @code{%parse-param}s and/or @code{%lex-param}s are +used. + +Use @code{%code init} for code added to the start of the constructor +body. This is especially useful to initialize superclasses. Use +@code{%define init_throws} to specify any uncaught exceptions. +@end deftypeop + +@deftypeop {Constructor} {YYParser} {} YYParser (Lexer @var{lexer}, @var{parse_param}, @dots{}) +Build a new parser object using the specified scanner. There are no +additional parameters unless @code{%parse-param}s are used.
+ +If the scanner is defined by @code{%code lexer}, this constructor is +declared @code{protected} and is called automatically with a scanner +created with the correct @code{%lex-param}s. + +Use @code{%code init} for code added to the start of the constructor +body. This is especially useful to initialize superclasses. Use +@code{%define init_throws} to specify any uncaught exceptions. +@end deftypeop @deftypemethod {YYParser} {boolean} parse () Run the syntactic analysis, and return @code{true} on success, @code{false} otherwise. @end deftypemethod +@deftypemethod {YYParser} {boolean} getErrorVerbose () +@deftypemethodx {YYParser} {void} setErrorVerbose (boolean @var{verbose}) +Get or set the option to produce verbose error messages. These are only +available with the @code{%error-verbose} directive, which also turns on +verbose error messages. +@end deftypemethod + +@deftypemethod {YYParser} {void} yyerror (String @var{msg}) +@deftypemethodx {YYParser} {void} yyerror (Position @var{pos}, String @var{msg}) +@deftypemethodx {YYParser} {void} yyerror (Location @var{loc}, String @var{msg}) +Print an error message using the @code{yyerror} method of the scanner +instance in use. The @code{Location} and @code{Position} parameters are +available only if location tracking is active. +@end deftypemethod + @deftypemethod {YYParser} {boolean} recovering () During the syntactic analysis, return @code{true} if recovering -from a syntax error. @xref{Error Recovery}. +from a syntax error. +@xref{Error Recovery}. @end deftypemethod @deftypemethod {YYParser} {java.io.PrintStream} getDebugStream () @@ -8886,12 +9049,10 @@ Get or set the tracing level. Currently its value is either 0, no trace, or nonzero, full tracing.
@end deftypemethod -@deftypemethod {YYParser} {void} error (Location @var{l}, String @var{m}) -The definition for this member function must be supplied by the user -in the same way as the scanner interface (@pxref{Java Scanner -Interface}); the parser uses it to report a parser error occurring at -@var{l}, described by @var{m}. -@end deftypemethod +@deftypecv {Constant} {YYParser} {String} {bisonVersion} +@deftypecvx {Constant} {YYParser} {String} {bisonSkeleton} +Identify the Bison version and skeleton used to generate this parser. +@end deftypecv @node Java Scanner Interface @@ -8900,27 +9061,18 @@ Interface}); the parser uses it to report a parser error occurring at @c - %lex-param @c - Lexer interface -Contrary to C parsers, Java parsers do not use global variables; the -state of the parser is always local to an instance of the parser class. -Therefore, all Java parsers are ``pure'', and the @code{%pure-parser} -directive does not do anything when used in Java. -@c FIXME: But a bit farther it is stated that -@c If @code{%pure-parser} is not specified, the lexer interface -@c resides in the same class (@code{YYParser}) as the Bison-generated -@c parser. The fields and methods that are provided to -@c this end are as follows. - -The scanner always resides in a separate class than the parser. -Still, there are two possible ways to interface a Bison-generated Java -parser with a scanner, that is, the scanner may reside in a separate file -than the Bison grammar, or in the same file. The interface -to the scanner is similar in the two cases. - -In the first case, where the scanner in the same file as the grammar, the -scanner code has to be placed in @code{%code lexer} blocks. If you want -to pass parameters from the parser constructor to the scanner constructor, -specify them with @code{%lex-param}; they are passed before -@code{%parse-param}s to the constructor. 
+There are two possible ways to interface a Bison-generated Java parser +with a scanner: the scanner may be defined by @code{%code lexer}, or +defined elsewhere. In either case, the scanner has to implement the +@code{Lexer} inner interface of the parser class. This interface also +contains constants for all user-defined token names and the predefined +@code{EOF} token. + +In the first case, the body of the scanner class is placed in +@code{%code lexer} blocks. If you want to pass parameters from the +parser constructor to the scanner constructor, specify them with +@code{%lex-param}; they are passed before @code{%parse-param}s to the +constructor. In the second case, the scanner has to implement the @code{Lexer} interface, which is defined within the parser class (e.g., @code{YYParser.Lexer}). @@ -8930,18 +9082,19 @@ case. In both cases, the scanner has to implement the following methods. -@deftypemethod {Lexer} {void} yyerror (Location @var{l}, String @var{m}) -As explained in @pxref{Java Parser Interface}, this method is defined -by the user to emit an error message. The first parameter is omitted -if location tracking is not active. Its type can be changed using -@samp{%define "location_type" "@var{class-name}".} +@deftypemethod {Lexer} {void} yyerror (Location @var{loc}, String @var{msg}) +This method is defined by the user to emit an error message. The first +parameter is omitted if location tracking is not active. Its type can be +changed using @code{%define location_type "@var{class-name}"}. @end deftypemethod -@deftypemethod {Lexer} {int} yylex (@var{type1} @var{arg1}, ...) +@deftypemethod {Lexer} {int} yylex () Return the next token. Its type is the return value, its semantic value and location are saved and returned by the ther methods in the -interface. Invocations of @samp{%lex-param @{@var{type1} -@var{arg1}@}} yield additional arguments. +interface. + +Use @code{%define lex_throws} to specify any uncaught exceptions.
+Default is @code{java.io.IOException}. @end deftypemethod @deftypemethod {Lexer} {Position} getStartPos () @@ -8950,53 +9103,101 @@ Return respectively the first position of the last token that @code{yylex} returned, and the first position beyond it. These methods are not needed unless location tracking is active. -The return type can be changed using @samp{%define "position_type" +The return type can be changed using @code{%define position_type "@var{class-name}".} @end deftypemethod @deftypemethod {Lexer} {Object} getLVal () Return the semantical value of the last token that yylex returned. -The return type can be changed using @samp{%define "stype" +The return type can be changed using @code{%define stype "@var{class-name}".} @end deftypemethod -The lexer interface resides in the same class (@code{YYParser}) as the -Bison-generated parser. -The fields and methods that are provided to this end are as follows. +@node Java Action Features +@subsection Special Features for Use in Java Actions -@deftypemethod {YYParser} {void} error (Location @var{l}, String @var{m}) -As already explained (@pxref{Java Parser Interface}), this method is defined -by the user to emit an error message. The first parameter is not used -unless location tracking is active. Its type can be changed using -@samp{%define "location_type" "@var{class-name}".} -@end deftypemethod +The following special constructs can be used in Java actions. +Other analogous C action features are currently unavailable for Java. -@deftypemethod {YYParser} {int} yylex (@var{type1} @var{arg1}, ...) -Return the next token. Its type is the return value, its semantic -value and location are saved into @code{yylval}, @code{yystartpos}, -@code{yyendpos}. Invocations of @samp{%lex-param @{@var{type1} -@var{arg1}@}} yield additional arguments. -@end deftypemethod +Use @code{%define throws} to specify any uncaught exceptions from parser +actions, and initial actions specified by @code{%initial-action}.
-@deftypecv {Field} {YYParser} Position yystartpos -@deftypecvx {Field} {YYParser} Position yyendpos -Contain respectively the first position of the last token that yylex -returned, and the first position beyond it. These methods are not -needed unless location tracking is active. +@defvar $@var{n} +The semantic value for the @var{n}th component of the current rule. +This may not be assigned to. +@xref{Java Semantic Values}. +@end defvar -The field's type can be changed using @samp{%define "position_type" -"@var{class-name}".} -@end deftypecv +@defvar $<@var{typealt}>@var{n} +Like @code{$@var{n}} but specifies a alternative type @var{typealt}. +@xref{Java Semantic Values}. +@end defvar -@deftypecv {Field} {YYParser} Object yylval -Return respectively the first position of the last token that yylex -returned, and the first position beyond it. +@defvar $$ +The semantic value for the grouping made by the current rule. As a +value, this is in the base type (@code{Object} or as specified by +@code{%define stype}) as in not cast to the declared subtype because +casts are not allowed on the left-hand side of Java assignments. +Use an explicit Java cast if the correct subtype is needed. +@xref{Java Semantic Values}. +@end defvar + +@defvar $<@var{typealt}>$ +Same as @code{$$} since Java always allow assigning to the base type. +Perhaps we should use this and @code{$<>$} for the value and @code{$$} +for setting the value but there is currently no easy way to distinguish +these constructs. +@xref{Java Semantic Values}. +@end defvar + +@defvar @@@var{n} +The location information of the @var{n}th component of the current rule. +This may not be assigned to. +@xref{Java Location Values}. +@end defvar + +@defvar @@$ +The location information of the grouping made by the current rule. +@xref{Java Location Values}. +@end defvar + +@deffn {Statement} {return YYABORT;} +Return immediately from the parser, indicating failure. +@xref{Java Parser Interface}. 
+@end deffn + +@deffn {Statement} {return YYACCEPT;} +Return immediately from the parser, indicating success. +@xref{Java Parser Interface}. +@end deffn + +@deffn {Statement} {return YYERROR;} +Start error recovery without printing an error message. +@xref{Error Recovery}. +@end deffn + +@deffn {Statement} {return YYFAIL;} +Print an error message and start error recovery. +@xref{Error Recovery}. +@end deffn + +@deftypefn {Function} {boolean} recovering () +Return whether error recovery is being done. In this state, the parser +reads token until it reaches a known state, and then restarts normal +operation. +@xref{Error Recovery}. +@end deftypefn + +@deftypefn {Function} {void} yyerror (String @var{msg}) +@deftypefnx {Function} {void} yyerror (Position @var{loc}, String @var{msg}) +@deftypefnx {Function} {void} yyerror (Location @var{loc}, String @var{msg}) +Print an error message using the @code{yyerror} method of the scanner +instance in use. The @code{Location} and @code{Position} parameters are +available only if location tracking is active. +@end deftypefn -The field's type can be changed using @samp{%define "stype" -"@var{class-name}".} -@end deftypecv @node Java Differences @subsection Differences between C/C++ and Java Grammars @@ -9013,6 +9214,7 @@ macros. Instead, they should be preceded by @code{return} when they appear in an action. The actual definition of these symbols is opaque to the Bison grammar, and it might change in the future. The only meaningful operation that you can do, is to return them. +See @pxref{Java Action Features}. Note that of these three symbols, only @code{YYACCEPT} and @code{YYABORT} will cause a return from the @code{yyparse} @@ -9020,6 +9222,17 @@ method@footnote{Java parsers include the actions in a separate method than @code{yyparse} in order to have an intuitive syntax that corresponds to these C macros.}. +@item +Java lacks unions, so @code{%union} has no effect. 
Instead, semantic +values have a common base type: @code{Object} or as specified by +@code{%define stype}. Angle brackets on @code{%token}, @code{%type}, +@code{$@var{n}} and @code{$$} specify subtypes rather than fields of +a union. The type of @code{$$}, even with angle brackets, is the base +type since Java casts are not allowed on the left-hand side of assignments. +Also, @code{$@var{n}} and @code{@@@var{n}} are not allowed on the +left-hand side of assignments. See @ref{Java Semantic Values} and +@ref{Java Action Features}. + @item The prolog declarations have a different meaning than in C/C++ code. @table @asis @@ -9039,10 +9252,170 @@ Interface}). @end table Other @code{%code} blocks are not supported in Java parsers. +In particular, @code{%@{ @dots{} %@}} blocks should not be used +and may give an error in future versions of Bison. + The epilogue has the same meaning as in C/C++ code and it can -be used to define other classes used by the parser. +be used to define other classes used by the parser @emph{outside} +the parser class. @end itemize + +@node Java Declarations Summary +@subsection Java Declarations Summary + +This summary only includes declarations that are specific to Java or that have special +meaning when used in a Java parser. + +@deffn {Directive} {%language "Java"} +Generate a Java class for the parser. +@end deffn + +@deffn {Directive} %lex-param @{@var{type} @var{name}@} +A parameter for the lexer class defined by @code{%code lexer} +@emph{only}, added as parameters to the lexer constructor and the parser +constructor that @emph{creates} a lexer. Default is none. +@xref{Java Scanner Interface}. +@end deffn + +@deffn {Directive} %name-prefix "@var{prefix}" +The prefix of the parser class name @code{@var{prefix}Parser} if +@code{%define parser_class_name} is not used. Default is @code{YY}. +@xref{Java Bison Interface}.
+@end deffn + +@deffn {Directive} %parse-param @{@var{type} @var{name}@} +A parameter for the parser class added as parameters to constructor(s) +and as fields initialized by the constructor(s). Default is none. +@xref{Java Parser Interface}. +@end deffn + +@deffn {Directive} %token <@var{type}> @var{token} @dots{} +Declare tokens. Note that the angle brackets enclose a Java @emph{type}. +@xref{Java Semantic Values}. +@end deffn + +@deffn {Directive} %type <@var{type}> @var{nonterminal} @dots{} +Declare the type of nonterminals. Note that the angle brackets enclose +a Java @emph{type}. +@xref{Java Semantic Values}. +@end deffn + +@deffn {Directive} %code @{ @var{code} @dots{} @} +Code appended to the inside of the parser class. +@xref{Java Differences}. +@end deffn + +@deffn {Directive} {%code imports} @{ @var{code} @dots{} @} +Code inserted just after the @code{package} declaration. +@xref{Java Differences}. +@end deffn + +@deffn {Directive} {%code init} @{ @var{code} @dots{} @} +Code inserted at the beginning of the parser constructor body. +@xref{Java Parser Interface}. +@end deffn + +@deffn {Directive} {%code lexer} @{ @var{code} @dots{} @} +Code added to the body of a inner lexer class within the parser class. +@xref{Java Scanner Interface}. +@end deffn + +@deffn {Directive} %% @var{code} @dots{} +Code (after the second @code{%%}) appended to the end of the file, +@emph{outside} the parser class. +@xref{Java Differences}. +@end deffn + +@deffn {Directive} %@{ @var{code} @dots{} %@} +Not supported. Use @code{%code imports} instead. +@xref{Java Differences}. +@end deffn + +@deffn {Directive} {%define abstract} +Whether the parser class is declared @code{abstract}. Default is false. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define annotations} "@var{annotations}" +The Java annotations for the parser class. Default is none. +@xref{Java Bison Interface}. 
+@end deffn + +@deffn {Directive} {%define extends} "@var{superclass}" +The superclass of the parser class. Default is none. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define final} +Whether the parser class is declared @code{final}. Default is false. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define implements} "@var{interfaces}" +The implemented interfaces of the parser class, a comma-separated list. +Default is none. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define init_throws} "@var{exceptions}" +The exceptions thrown by @code{%code init} from the parser class +constructor. Default is none. +@xref{Java Parser Interface}. +@end deffn + +@deffn {Directive} {%define lex_throws} "@var{exceptions}" +The exceptions thrown by the @code{yylex} method of the lexer, a +comma-separated list. Default is @code{java.io.IOException}. +@xref{Java Scanner Interface}. +@end deffn + +@deffn {Directive} {%define location_type} "@var{class}" +The name of the class used for locations (a range between two +positions). This class is generated as an inner class of the parser +class by @command{bison}. Default is @code{Location}. +@xref{Java Location Values}. +@end deffn + +@deffn {Directive} {%define package} "@var{package}" +The package to put the parser class in. Default is none. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define parser_class_name} "@var{name}" +The name of the parser class. Default is @code{YYParser} or +@code{@var{name-prefix}Parser}. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define position_type} "@var{class}" +The name of the class used for positions. This class must be supplied by +the user. Default is @code{Position}. +@xref{Java Location Values}. +@end deffn + +@deffn {Directive} {%define public} +Whether the parser class is declared @code{public}. Default is false. +@xref{Java Bison Interface}. 
+@end deffn + +@deffn {Directive} {%define stype} "@var{class}" +The base type of semantic values. Default is @code{Object}. +@xref{Java Semantic Values}. +@end deffn + +@deffn {Directive} {%define strictfp} +Whether the parser class is declared @code{strictfp}. Default is false. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define throws} "@var{exceptions}" +The exceptions thrown by user-supplied parser actions and +@code{%initial-action}, a comma-separated list. Default is none. +@xref{Java Parser Interface}. +@end deffn + + @c ================================================= FAQ @node FAQ @@ -9603,7 +9976,7 @@ Specify the programming language for the generated parser. @end deffn @deffn {Directive} %left -Bison declaration to assign left associativity to token(s). +Bison declaration to assign precedence and left associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. @end deffn @@ -9638,7 +10011,7 @@ parser file. @xref{Decl Summary}. @end deffn @deffn {Directive} %nonassoc -Bison declaration to assign nonassociativity to token(s). +Bison declaration to assign precedence and nonassociativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. @end deffn @@ -9658,6 +10031,11 @@ Bison declaration to assign a precedence to a specific rule. @xref{Contextual Precedence, ,Context-Dependent Precedence}. @end deffn +@deffn {Directive} %precedence +Bison declaration to assign precedence to token(s), but no associativity. +@xref{Precedence Decl, ,Operator Precedence}. +@end deffn + @deffn {Directive} %pure-parser Deprecated version of @code{%define api.pure} (@pxref{Decl Summary, ,%define}), for which Bison is more careful to warn about unreasonable usage. @@ -9669,7 +10047,7 @@ Require a Version of Bison}. @end deffn @deffn {Directive} %right -Bison declaration to assign right associativity to token(s). +Bison declaration to assign precedence and right associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}.
@end deffn @@ -10104,7 +10482,7 @@ grammatically indivisible. The piece of text it represents is a token. @c LocalWords: akim fn cp syncodeindex vr tp synindex dircategory direntry @c LocalWords: ifset vskip pt filll insertcopying sp ISBN Etienne Suvasa @c LocalWords: ifnottex yyparse detailmenu GLR RPN Calc var Decls Rpcalc -@c LocalWords: rpcalc Lexer Gen Comp Expr ltcalc mfcalc Decl Symtab yylex +@c LocalWords: rpcalc Lexer Expr ltcalc mfcalc yylex @c LocalWords: yyerror pxref LR yylval cindex dfn LALR samp gpl BNF xref @c LocalWords: const int paren ifnotinfo AC noindent emph expr stmt findex @c LocalWords: glr YYSTYPE TYPENAME prog dprec printf decl init stmtMerge