X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/b2a0b7ca70f9490693d517d10c9c4a64c9fc1af0..31f8b787c0f6c29b2684a400eaa6300b7a7b7448:/doc/bison.texinfo diff --git a/doc/bison.texinfo b/doc/bison.texinfo index 94a384e0..4027388d 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -30,25 +30,26 @@ @copying -This manual is for @acronym{GNU} Bison (version @value{VERSION}, -@value{UPDATED}), the @acronym{GNU} parser generator. +This manual (@value{UPDATED}) is for @acronym{GNU} Bison (version +@value{VERSION}), the @acronym{GNU} parser generator. -Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, -1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006 Free Software Foundation, Inc. +Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999, +2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free +Software Foundation, Inc. @quotation Permission is granted to copy, distribute and/or modify this document under the terms of the @acronym{GNU} Free Documentation License, -Version 1.2 or any later version published by the Free Software +Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with the Front-Cover texts being ``A @acronym{GNU} Manual,'' and with the Back-Cover Texts as in (a) below. A copy of the license is included in the section entitled ``@acronym{GNU} Free Documentation License.'' -(a) The @acronym{FSF}'s Back-Cover Text is: ``You have freedom to copy -and modify this @acronym{GNU} Manual, like @acronym{GNU} software. -Copies published by the Free Software Foundation raise funds for -@acronym{GNU} development.'' +(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and +modify this @acronym{GNU} manual. Buying copies from the @acronym{FSF} +supports it in developing @acronym{GNU} and promoting software +freedom.'' @end quotation @end copying @@ -88,76 +89,76 @@ Cover art by Etienne Suvasa. 
@menu * Introduction:: * Conditions:: -* Copying:: The @acronym{GNU} General Public License says - how you can copy and share Bison +* Copying:: The @acronym{GNU} General Public License says + how you can copy and share Bison. Tutorial sections: -* Concepts:: Basic concepts for understanding Bison. -* Examples:: Three simple explained examples of using Bison. +* Concepts:: Basic concepts for understanding Bison. +* Examples:: Three simple explained examples of using Bison. Reference sections: -* Grammar File:: Writing Bison declarations and rules. -* Interface:: C-language interface to the parser function @code{yyparse}. -* Algorithm:: How the Bison parser works at run-time. -* Error Recovery:: Writing rules for error recovery. +* Grammar File:: Writing Bison declarations and rules. +* Interface:: C-language interface to the parser function @code{yyparse}. +* Algorithm:: How the Bison parser works at run-time. +* Error Recovery:: Writing rules for error recovery. * Context Dependency:: What to do if your language syntax is too - messy for Bison to handle straightforwardly. -* Debugging:: Understanding or debugging Bison parsers. -* Invocation:: How to run Bison (to produce the parser source file). -* C++ Language Interface:: Creating C++ parser objects. -* FAQ:: Frequently Asked Questions -* Table of Symbols:: All the keywords of the Bison language are explained. -* Glossary:: Basic concepts are explained. -* Copying This Manual:: License for copying this manual. -* Index:: Cross-references to the text. + messy for Bison to handle straightforwardly. +* Debugging:: Understanding or debugging Bison parsers. +* Invocation:: How to run Bison (to produce the parser source file). +* Other Languages:: Creating C++ and Java parsers. +* FAQ:: Frequently Asked Questions +* Table of Symbols:: All the keywords of the Bison language are explained. +* Glossary:: Basic concepts are explained. +* Copying This Manual:: License for copying this manual. 
+* Index:: Cross-references to the text. @detailmenu --- The Detailed Node Listing --- The Concepts of Bison -* Language and Grammar:: Languages and context-free grammars, - as mathematical ideas. -* Grammar in Bison:: How we represent grammars for Bison's sake. -* Semantic Values:: Each token or syntactic grouping can have - a semantic value (the value of an integer, - the name of an identifier, etc.). -* Semantic Actions:: Each rule can have an action containing C code. -* GLR Parsers:: Writing parsers for general context-free languages. -* Locations Overview:: Tracking Locations. -* Bison Parser:: What are Bison's input and output, - how is the output used? -* Stages:: Stages in writing and running Bison grammars. -* Grammar Layout:: Overall structure of a Bison grammar file. +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. +* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* GLR Parsers:: Writing parsers for general context-free languages. +* Locations Overview:: Tracking Locations. +* Bison Parser:: What are Bison's input and output, + how is the output used? +* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. Writing @acronym{GLR} Parsers -* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. -* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. -* GLR Semantic Actions:: Deferred semantic actions have special concerns. -* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. +* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. +* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. 
+* GLR Semantic Actions:: Deferred semantic actions have special concerns. +* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. Examples -* RPN Calc:: Reverse polish notation calculator; - a first example with no operator precedence. -* Infix Calc:: Infix (algebraic) notation calculator. - Operator precedence is introduced. +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. * Simple Error Recovery:: Continuing after syntax errors. * Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$. -* Multi-function Calc:: Calculator with memory and trig functions. - It uses multiple data-types for semantic values. -* Exercises:: Ideas for improving the multi-function calculator. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. +* Exercises:: Ideas for improving the multi-function calculator. Reverse Polish Notation Calculator -* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc. -* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. -* Lexer: Rpcalc Lexer. The lexical analyzer. -* Main: Rpcalc Main. The controlling function. -* Error: Rpcalc Error. The error reporting function. -* Gen: Rpcalc Gen. Running Bison on the grammar file. -* Comp: Rpcalc Compile. Run the C compiler on the output code. +* Rpcalc Declarations:: Prologue (declarations) for rpcalc. +* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation. +* Rpcalc Lexer:: The lexical analyzer. +* Rpcalc Main:: The controlling function. +* Rpcalc Error:: The error reporting function. +* Rpcalc Generate:: Running Bison on the grammar file. +* Rpcalc Compile:: Run the C compiler on the output code. Grammar Rules for @code{rpcalc} @@ -167,15 +168,15 @@ Grammar Rules for @code{rpcalc} Location Tracking Calculator: @code{ltcalc} -* Decls: Ltcalc Decls. 
Bison and C declarations for ltcalc. -* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations. -* Lexer: Ltcalc Lexer. The lexical analyzer. +* Ltcalc Declarations:: Bison and C declarations for ltcalc. +* Ltcalc Rules:: Grammar rules for ltcalc, with explanations. +* Ltcalc Lexer:: The lexical analyzer. Multi-Function Calculator: @code{mfcalc} -* Decl: Mfcalc Decl. Bison declarations for multi-function calculator. -* Rules: Mfcalc Rules. Grammar rules for the calculator. -* Symtab: Mfcalc Symtab. Symbol table management subroutines. +* Mfcalc Declarations:: Bison declarations for multi-function calculator. +* Mfcalc Rules:: Grammar rules for the calculator. +* Mfcalc Symbol Table:: Symbol table management subroutines. Bison Grammar Files @@ -190,10 +191,11 @@ Bison Grammar Files Outline of a Bison Grammar -* Prologue:: Syntax and usage of the prologue. -* Bison Declarations:: Syntax and usage of the Bison declarations section. -* Grammar Rules:: Syntax and usage of the grammar rules section. -* Epilogue:: Syntax and usage of the epilogue. +* Prologue:: Syntax and usage of the prologue. +* Prologue Alternatives:: Syntax and usage of alternatives to the prologue. +* Bison Declarations:: Syntax and usage of the Bison declarations section. +* Grammar Rules:: Syntax and usage of the grammar rules section. +* Epilogue:: Syntax and usage of the epilogue. Defining Language Semantics @@ -223,28 +225,33 @@ Bison Declarations * Expect Decl:: Suppressing warnings about parsing conflicts. * Start Decl:: Specifying the start symbol. * Pure Decl:: Requesting a reentrant parser. +* Push Decl:: Requesting a push parser. * Decl Summary:: Table of all Bison declarations. Parser C-Language Interface -* Parser Function:: How to call @code{yyparse} and what it returns. -* Lexical:: You must supply a function @code{yylex} - which reads tokens. -* Error Reporting:: You must supply a function @code{yyerror}. -* Action Features:: Special features for use in actions. 
-* Internationalization:: How to let the parser speak in the user's - native language. +* Parser Function:: How to call @code{yyparse} and what it returns. +* Push Parser Function:: How to call @code{yypush_parse} and what it returns. +* Pull Parser Function:: How to call @code{yypull_parse} and what it returns. +* Parser Create Function:: How to call @code{yypstate_new} and what it returns. +* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns. +* Lexical:: You must supply a function @code{yylex} + which reads tokens. +* Error Reporting:: You must supply a function @code{yyerror}. +* Action Features:: Special features for use in actions. +* Internationalization:: How to let the parser speak in the user's + native language. The Lexical Analyzer Function @code{yylex} * Calling Convention:: How @code{yyparse} calls @code{yylex}. -* Token Values:: How @code{yylex} must return the semantic value - of the token it has read. -* Token Locations:: How @code{yylex} must return the text location - (line number, etc.) of the token, if the - actions want that. -* Pure Calling:: How the calling convention differs - in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). +* Token Values:: How @code{yylex} must return the semantic value + of the token it has read. +* Token Locations:: How @code{yylex} must return the text location + (line number, etc.) of the token, if the + actions want that. +* Pure Calling:: How the calling convention differs in a pure parser + (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). The Bison Parser Algorithm @@ -254,7 +261,7 @@ The Bison Parser Algorithm * Contextual Precedence:: When an operator's precedence depends on context. * Parser States:: The parser is a finite-state-machine with stack. * Reduce/Reduce:: When two rules are applicable in the same situation. -* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. +* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. 
* Generalized LR Parsing:: Parsing arbitrary context-free grammars. * Memory Management:: What happens when memory is exhausted. How to avoid it. @@ -284,10 +291,10 @@ Invoking Bison * Option Cross Key:: Alphabetical list of long options. * Yacc Library:: Yacc-compatible @code{yylex} and @code{main}. -C++ Language Interface +Parsers Written In Other Languages * C++ Parsers:: The interface to generate C++ parser classes -* A Complete C++ Example:: Demonstrating their use +* Java Parsers:: The interface to generate Java parser classes C++ Parsers @@ -296,6 +303,7 @@ C++ Parsers * C++ Location Values:: The position and location classes * C++ Parser Interface:: Instantiating and running the parser * C++ Scanner Interface:: Exchanges between yylex and parse +* A Complete C++ Example:: Demonstrating their use A Complete C++ Example @@ -305,24 +313,35 @@ A Complete C++ Example * Calc++ Scanner:: A pure C++ Flex scanner * Calc++ Top Level:: Conducting the band +Java Parsers + +* Java Bison Interface:: Asking for Java parser generation +* Java Semantic Values:: %type and %token vs. Java +* Java Location Values:: The position and location classes +* Java Parser Interface:: Instantiating and running the parser +* Java Scanner Interface:: Specifying the scanner for the parser +* Java Action Features:: Special features for use in actions +* Java Differences:: Differences between C/C++ and Java Grammars +* Java Declarations Summary:: List of Bison declarations used with Java + Frequently Asked Questions -* Memory Exhausted:: Breaking the Stack Limits -* How Can I Reset the Parser:: @code{yyparse} Keeps some State -* Strings are Destroyed:: @code{yylval} Loses Track of Strings -* Implementing Gotos/Loops:: Control Flow in the Calculator -* Multiple start-symbols:: Factoring closely related grammars -* Secure? Conform?:: Is Bison @acronym{POSIX} safe? 
-* I can't build Bison:: Troubleshooting -* Where can I find help?:: Troubleshouting -* Bug Reports:: Troublereporting -* Other Languages:: Parsers in Java and others -* Beta Testing:: Experimenting development versions -* Mailing Lists:: Meeting other Bison users +* Memory Exhausted:: Breaking the Stack Limits +* How Can I Reset the Parser:: @code{yyparse} Keeps some State +* Strings are Destroyed:: @code{yylval} Loses Track of Strings +* Implementing Gotos/Loops:: Control Flow in the Calculator +* Multiple start-symbols:: Factoring closely related grammars +* Secure? Conform?:: Is Bison @acronym{POSIX} safe? +* I can't build Bison:: Troubleshooting +* Where can I find help?:: Troubleshouting +* Bug Reports:: Troublereporting +* More Languages:: Parsers in C++, Java, and so on +* Beta Testing:: Experimenting development versions +* Mailing Lists:: Meeting other Bison users Copying This Manual -* GNU Free Documentation License:: License for copying this manual. +* Copying This Manual:: License for copying this manual. @end detailmenu @end menu @@ -390,7 +409,9 @@ inspecting the file for text beginning with ``As a special exception@dots{}''. The text spells out the exact terms of the exception. -@include gpl.texi +@node Copying +@unnumbered GNU GENERAL PUBLIC LICENSE +@include gpl-3.0.texi @node Concepts @chapter The Concepts of Bison @@ -400,19 +421,19 @@ details of Bison will not make sense. If you do not already know how to use Bison or Yacc, we suggest you start by reading this chapter carefully. @menu -* Language and Grammar:: Languages and context-free grammars, - as mathematical ideas. -* Grammar in Bison:: How we represent grammars for Bison's sake. -* Semantic Values:: Each token or syntactic grouping can have - a semantic value (the value of an integer, - the name of an identifier, etc.). -* Semantic Actions:: Each rule can have an action containing C code. -* GLR Parsers:: Writing parsers for general context-free languages. 
-* Locations Overview:: Tracking Locations. -* Bison Parser:: What are Bison's input and output, - how is the output used? -* Stages:: Stages in writing and running Bison grammars. -* Grammar Layout:: Overall structure of a Bison grammar file. +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. +* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* GLR Parsers:: Writing parsers for general context-free languages. +* Locations Overview:: Tracking Locations. +* Bison Parser:: What are Bison's input and output, + how is the output used? +* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. @end menu @node Language and Grammar @@ -728,10 +749,10 @@ user-defined function on the resulting values to produce an arbitrary merged result. @menu -* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. -* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. -* GLR Semantic Actions:: Deferred semantic actions have special concerns. -* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. +* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars. +* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities. +* GLR Semantic Actions:: Deferred semantic actions have special concerns. +* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler. @end menu @node Simple GLR Parsers @@ -1359,15 +1380,15 @@ languages are written the same way. You can copy these examples into a source file to try them. @menu -* RPN Calc:: Reverse polish notation calculator; - a first example with no operator precedence. -* Infix Calc:: Infix (algebraic) notation calculator. 
- Operator precedence is introduced. +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. * Simple Error Recovery:: Continuing after syntax errors. * Location Tracking Calc:: Demonstrating the use of @@@var{n} and @@$. -* Multi-function Calc:: Calculator with memory and trig functions. - It uses multiple data-types for semantic values. -* Exercises:: Ideas for improving the multi-function calculator. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. +* Exercises:: Ideas for improving the multi-function calculator. @end menu @node RPN Calc @@ -1386,16 +1407,16 @@ The source code for this calculator is named @file{rpcalc.y}. The @samp{.y} extension is a convention used for Bison input files. @menu -* Decls: Rpcalc Decls. Prologue (declarations) for rpcalc. -* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. -* Lexer: Rpcalc Lexer. The lexical analyzer. -* Main: Rpcalc Main. The controlling function. -* Error: Rpcalc Error. The error reporting function. -* Gen: Rpcalc Gen. Running Bison on the grammar file. -* Comp: Rpcalc Compile. Run the C compiler on the output code. +* Rpcalc Declarations:: Prologue (declarations) for rpcalc. +* Rpcalc Rules:: Grammar Rules for rpcalc, with explanation. +* Rpcalc Lexer:: The lexical analyzer. +* Rpcalc Main:: The controlling function. +* Rpcalc Error:: The error reporting function. +* Rpcalc Generate:: Running Bison on the grammar file. +* Rpcalc Compile:: Run the C compiler on the output code. @end menu -@node Rpcalc Decls +@node Rpcalc Declarations @subsection Declarations for @code{rpcalc} Here are the C and Bison declarations for the reverse polish notation @@ -1645,7 +1666,7 @@ therefore, @code{NUM} becomes a macro for @code{yylex} to use. 
The semantic value of the token (if it has one) is stored into the global variable @code{yylval}, which is where the Bison parser will look for it. (The C data type of @code{yylval} is @code{YYSTYPE}, which was -defined at the beginning of the grammar; @pxref{Rpcalc Decls, +defined at the beginning of the grammar; @pxref{Rpcalc Declarations, ,Declarations for @code{rpcalc}}.) A token type code of zero is returned if the end-of-input is encountered. @@ -1741,7 +1762,7 @@ have not written any error rules in this example, so any invalid input will cause the calculator program to exit. This is not clean behavior for a real calculator, but it is adequate for the first example. -@node Rpcalc Gen +@node Rpcalc Generate @subsection Running Bison to Make the Parser @cindex running Bison (introduction) @@ -1960,12 +1981,12 @@ most of the work needed to use locations will be done in the lexical analyzer. @menu -* Decls: Ltcalc Decls. Bison and C declarations for ltcalc. -* Rules: Ltcalc Rules. Grammar rules for ltcalc, with explanations. -* Lexer: Ltcalc Lexer. The lexical analyzer. +* Ltcalc Declarations:: Bison and C declarations for ltcalc. +* Ltcalc Rules:: Grammar rules for ltcalc, with explanations. +* Ltcalc Lexer:: The lexical analyzer. @end menu -@node Ltcalc Decls +@node Ltcalc Declarations @subsection Declarations for @code{ltcalc} The C and Bison declarations for the location tracking calculator are @@ -2047,7 +2068,7 @@ exp : NUM @{ $$ = $1; @} @} @end group @group - | '-' exp %preg NEG @{ $$ = -$2; @} + | '-' exp %prec NEG @{ $$ = -$2; @} | exp '^' exp @{ $$ = pow ($1, $3); @} | '(' exp ')' @{ $$ = $2; @} @end group @@ -2201,12 +2222,12 @@ $ Note that multiple assignment and nested function calls are permitted. @menu -* Decl: Mfcalc Decl. Bison declarations for multi-function calculator. -* Rules: Mfcalc Rules. Grammar rules for the calculator. -* Symtab: Mfcalc Symtab. Symbol table management subroutines. 
+* Mfcalc Declarations:: Bison declarations for multi-function calculator. +* Mfcalc Rules:: Grammar rules for the calculator. +* Mfcalc Symbol Table:: Symbol table management subroutines. @end menu -@node Mfcalc Decl +@node Mfcalc Declarations @subsection Declarations for @code{mfcalc} Here are the C and Bison declarations for the multi-function calculator. @@ -2302,7 +2323,7 @@ exp: NUM @{ $$ = $1; @} %% @end smallexample -@node Mfcalc Symtab +@node Mfcalc Symbol Table @subsection The @code{mfcalc} Symbol Table @cindex symbol table example @@ -2615,10 +2636,11 @@ As a @acronym{GNU} extension, @samp{//} introduces a comment that continues until end of line. @menu -* Prologue:: Syntax and usage of the prologue. -* Bison Declarations:: Syntax and usage of the Bison declarations section. -* Grammar Rules:: Syntax and usage of the grammar rules section. -* Epilogue:: Syntax and usage of the epilogue. +* Prologue:: Syntax and usage of the prologue. +* Prologue Alternatives:: Syntax and usage of alternatives to the prologue. +* Bison Declarations:: Syntax and usage of the Bison declarations section. +* Grammar Rules:: Syntax and usage of the grammar rules section. +* Epilogue:: Syntax and usage of the epilogue. @end menu @node Prologue @@ -2649,6 +2671,54 @@ can be done with two @var{Prologue} blocks, one before and one after the @smallexample %@{ + #define _GNU_SOURCE + #include <stdio.h> + #include "ptypes.h" +%@} + +%union @{ + long int n; + tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */ +@} + +%@{ + static void print_token_value (FILE *, int, YYSTYPE); + #define YYPRINT(F, N, L) print_token_value (F, N, L) +%@} + +@dots{} +@end smallexample + +When in doubt, it is usually safer to put prologue code before all +Bison declarations, rather than after. 
For example, any definitions +of feature test macros like @code{_GNU_SOURCE} or +@code{_POSIX_C_SOURCE} should appear before all Bison declarations, as +feature test macros can affect the behavior of Bison-generated +@code{#include} directives. + +@node Prologue Alternatives +@subsection Prologue Alternatives +@cindex Prologue Alternatives + +@findex %code +@findex %code requires +@findex %code provides +@findex %code top + +The functionality of @var{Prologue} sections can often be subtle and +inflexible. +As an alternative, Bison provides a %code directive with an explicit qualifier +field, which identifies the purpose of the code and thus the location(s) where +Bison should generate it. +For C/C++, the qualifier can be omitted for the default location, or it can be +one of @code{requires}, @code{provides}, @code{top}. +@xref{Decl Summary,,%code}. + +Look again at the example of the previous section: + +@smallexample +%@{ + #define _GNU_SOURCE #include <stdio.h> #include "ptypes.h" %@} @@ -2666,22 +2736,163 @@ can be done with two @var{Prologue} blocks, one before and one after the @dots{} @end smallexample -@findex %before-header -@findex %start-header -@findex %after-header -If you've instructed Bison to generate a header file (@pxref{Table of Symbols, -,%defines}), you probably want @code{#include "ptypes.h"} to appear -in that header file as well. -In that case, use @code{%before-header}, @code{%start-header}, and -@code{%after-header} instead of @var{Prologue} sections -(@pxref{Table of Symbols, ,%start-header}): +@noindent +Notice that there are two @var{Prologue} sections here, but there's a subtle +distinction between their functionality. +For example, if you decide to override Bison's default definition for +@code{YYLTYPE}, in which @var{Prologue} section should you write your new +definition? +You should write it in the first since Bison will insert that code into the +parser source code file @emph{before} the default @code{YYLTYPE} definition. 
+In which @var{Prologue} section should you prototype an internal function, +@code{trace_token}, that accepts @code{YYLTYPE} and @code{yytokentype} as +arguments? +You should prototype it in the second since Bison will insert that code +@emph{after} the @code{YYLTYPE} and @code{yytokentype} definitions. + +This distinction in functionality between the two @var{Prologue} sections is +established by the appearance of the @code{%union} between them. +This behavior raises a few questions. +First, why should the position of a @code{%union} affect definitions related to +@code{YYLTYPE} and @code{yytokentype}? +Second, what if there is no @code{%union}? +In that case, the second kind of @var{Prologue} section is not available. +This behavior is not intuitive. + +To avoid this subtle @code{%union} dependency, rewrite the example using a +@code{%code top} and an unqualified @code{%code}. +Let's go ahead and add the new @code{YYLTYPE} definition and the +@code{trace_token} prototype at the same time: + +@smallexample +%code top @{ + #define _GNU_SOURCE + #include <stdio.h> + + /* WARNING: The following code really belongs + * in a `%code requires'; see below. */ + + #include "ptypes.h" + #define YYLTYPE YYLTYPE + typedef struct YYLTYPE + @{ + int first_line; + int first_column; + int last_line; + int last_column; + char *filename; + @} YYLTYPE; +@} + +%union @{ + long int n; + tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */ +@} + +%code @{ + static void print_token_value (FILE *, int, YYSTYPE); + #define YYPRINT(F, N, L) print_token_value (F, N, L) + static void trace_token (enum yytokentype token, YYLTYPE loc); +@} + +@dots{} +@end smallexample + +@noindent +In this way, @code{%code top} and the unqualified @code{%code} achieve the same +functionality as the two kinds of @var{Prologue} sections, but it's always +explicit which kind you intend. +Moreover, both kinds are always available even in the absence of @code{%union}. 
+ +The @code{%code top} block above logically contains two parts. +The first two lines before the warning need to appear near the top of the +parser source code file. +The first line after the warning is required by @code{YYSTYPE} and thus also +needs to appear in the parser source code file. +However, if you've instructed Bison to generate a parser header file +(@pxref{Decl Summary, ,%defines}), you probably want that line to appear before +the @code{YYSTYPE} definition in that header file as well. +The @code{YYLTYPE} definition should also appear in the parser header file to +override the default @code{YYLTYPE} definition there. + +In other words, in the @code{%code top} block above, all but the first two +lines are dependency code required by the @code{YYSTYPE} and @code{YYLTYPE} +definitions. +Thus, they belong in one or more @code{%code requires}: + +@smallexample +%code top @{ + #define _GNU_SOURCE + #include <stdio.h> +@} + +%code requires @{ + #include "ptypes.h" +@} +%union @{ + long int n; + tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */ +@} + +%code requires @{ + #define YYLTYPE YYLTYPE + typedef struct YYLTYPE + @{ + int first_line; + int first_column; + int last_line; + int last_column; + char *filename; + @} YYLTYPE; +@} + +%code @{ + static void print_token_value (FILE *, int, YYSTYPE); + #define YYPRINT(F, N, L) print_token_value (F, N, L) + static void trace_token (enum yytokentype token, YYLTYPE loc); +@} + +@dots{} +@end smallexample + +@noindent +Now Bison will insert @code{#include "ptypes.h"} and the new @code{YYLTYPE} +definition before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE} +definitions in both the parser source code file and the parser header file. +(By the same reasoning, @code{%code requires} would also be the appropriate +place to write your own definition for @code{YYSTYPE}.) 
+ +When you are writing dependency code for @code{YYSTYPE} and @code{YYLTYPE}, you +should prefer @code{%code requires} over @code{%code top} regardless of whether +you instruct Bison to generate a parser header file. +When you are writing code that you need Bison to insert only into the parser +source code file and that has no special need to appear at the top of that +file, you should prefer the unqualified @code{%code} over @code{%code top}. +These practices will make the purpose of each block of your code explicit to +Bison and to other developers reading your grammar file. +Following these practices, we expect the unqualified @code{%code} and +@code{%code requires} to be the most important of the four @var{Prologue} +alternatives. + +At some point while developing your parser, you might decide to provide +@code{trace_token} to modules that are external to your parser. +Thus, you might wish for Bison to insert the prototype into both the parser +header file and the parser source code file. +Since this function is not a dependency required by @code{YYSTYPE} or +@code{YYLTYPE}, it doesn't make sense to move its prototype to a +@code{%code requires}. +More importantly, since it depends upon @code{YYLTYPE} and @code{yytokentype}, +@code{%code requires} is not sufficient. 
+Instead, move its prototype from the unqualified @code{%code} to a +@code{%code provides}: @smallexample -%before-header @{ +%code top @{ + #define _GNU_SOURCE #include <stdio.h> @} -%start-header @{ +%code requires @{ #include "ptypes.h" @} %union @{ @@ -2689,7 +2900,23 @@ In that case, use @code{%before-header}, @code{%start-header}, and tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */ @} -%after-header @{ +%code requires @{ + #define YYLTYPE YYLTYPE + typedef struct YYLTYPE + @{ + int first_line; + int first_column; + int last_line; + int last_column; + char *filename; + @} YYLTYPE; +@} + +%code provides @{ + void trace_token (enum yytokentype token, YYLTYPE loc); +@} + +%code @{ static void print_token_value (FILE *, int, YYSTYPE); #define YYPRINT(F, N, L) print_token_value (F, N, L) @} @@ -2697,6 +2924,62 @@ In that case, use @code{%before-header}, @code{%start-header}, and @dots{} @end smallexample +@noindent +Bison will insert the @code{trace_token} prototype into both the parser header +file and the parser source code file after the definitions for +@code{yytokentype}, @code{YYLTYPE}, and @code{YYSTYPE}. + +The above examples are careful to write directives in an order that reflects +the layout of the generated parser source code and header files: +@code{%code top}, @code{%code requires}, @code{%code provides}, and then +@code{%code}. +While your grammar files may generally be easier to read if you also follow +this order, Bison does not require it. +Instead, Bison lets you choose an organization that makes sense to you. + +You may declare any of these directives multiple times in the grammar file. +In that case, Bison concatenates the contained code in declaration order. +This is the only way in which the position of one of these directives within +the grammar file affects its functionality. + +The result of the previous two properties is greater flexibility in how you may +organize your grammar file. 
+For example, you may organize semantic-type-related directives by semantic +type: + +@smallexample +%code requires @{ #include "type1.h" @} +%union @{ type1 field1; @} +%destructor @{ type1_free ($$); @} +%printer @{ type1_print ($$); @} + +%code requires @{ #include "type2.h" @} +%union @{ type2 field2; @} +%destructor @{ type2_free ($$); @} +%printer @{ type2_print ($$); @} +@end smallexample + +@noindent +You could even place each of the above directive groups in the rules section of +the grammar file next to the set of rules that uses the associated semantic +type. +(In the rules section, you must terminate each of those directives with a +semicolon.) +And you don't have to worry that some directive (like a @code{%union}) in the +definitions section is going to adversely affect their functionality in some +counter-intuitive manner just because it comes first. +Such an organization is not possible using @var{Prologue} sections. + +This section has been concerned with explaining the advantages of the four +@var{Prologue} alternatives over the original Yacc @var{Prologue}. +However, in most cases when using these directives, you shouldn't need to +think about all the low-level ordering issues discussed here. +Instead, you should simply use these directives to label each block of your +code according to its purpose and let Bison handle the ordering. +@code{%code} is the most generic label. +Move code to @code{%code requires}, @code{%code provides}, or @code{%code top} +as needed. + @node Bison Declarations @subsection The Bison Declarations Section @cindex Bison declarations (introduction) @@ -3517,8 +3800,11 @@ typedef struct YYLTYPE @} YYLTYPE; @end example -At the beginning of the parsing, Bison initializes all these fields to 1 -for @code{yylloc}. +When @code{YYLTYPE} is not defined, at the beginning of the parsing, Bison +initializes all these fields to 1 for @code{yylloc}. 
To initialize +@code{yylloc} with a custom location type (or to choose a different +initialization), use the @code{%initial-action} directive. @xref{Initial +Action Decl, , Performing Actions before Parsing}. @node Actions and Locations @subsection Actions and Locations @@ -3702,6 +3988,7 @@ Grammars}). * Expect Decl:: Suppressing warnings about parsing conflicts. * Start Decl:: Specifying the start symbol. * Pure Decl:: Requesting a reentrant parser. +* Push Decl:: Requesting a push parser. * Decl Summary:: Table of all Bison declarations. @end menu @@ -3742,7 +4029,7 @@ associativity and precedence. @xref{Precedence Decl, ,Operator Precedence}. You can explicitly specify the numeric code for a token type by appending -a decimal or hexadecimal integer value in the field immediately +a nonnegative decimal or hexadecimal integer value in the field immediately following the token name: @example @@ -3795,6 +4082,16 @@ Once you equate the literal string and the token name, you can use them interchangeably in further declarations or the grammar rules. The @code{yylex} function can use the token name or the literal string to obtain the token type code number (@pxref{Calling Convention}). +Syntax error messages passed to @code{yyerror} from the parser will reference +the literal string instead of the token name. + +The token numbered as 0 corresponds to end of file; the following line +allows for nicer error messages referring to ``end of file'' instead +of ``$end'': + +@example +%token END 0 "end of file" +@end example @node Precedence Decl @subsection Operator Precedence @@ -3808,7 +4105,7 @@ once. These are called @dfn{precedence declarations}. @xref{Precedence, ,Operator Precedence}, for general information on operator precedence.
-The syntax of a precedence declaration is the same as that of +The syntax of a precedence declaration is nearly the same as that of @code{%token}: either @example @@ -3846,6 +4143,18 @@ When two tokens declared in different precedence declarations associate, the one declared later has the higher precedence and is grouped first. @end itemize +For backward compatibility, there is a confusing difference between the +argument lists of @code{%token} and precedence declarations. +Only a @code{%token} can associate a literal string with a token type name. +A precedence declaration always interprets a literal string as a reference to a +separate token. +For example: + +@example +%left OR "<=" // Does not declare an alias. +%left OR 134 "<=" 135 // Declares 134 for OR and 135 for "<=". +@end example + @node Union Decl @subsection The Collection of Value Types @cindex declaring value types @@ -3986,8 +4295,8 @@ For instance, if your locations use a file name, you may use @subsection Freeing Discarded Symbols @cindex freeing discarded symbols @findex %destructor -@findex %symbol-default - +@findex <*> +@findex <> During error recovery (@pxref{Error Recovery}), symbols already pushed on the stack and tokens coming from the rest of the file are discarded until the parser falls on its feet. If the parser runs out of memory, @@ -4015,21 +4324,29 @@ The Parser Function @code{yyparse}}). When a symbol is listed among @var{symbols}, its @code{%destructor} is called a per-symbol @code{%destructor}. You may also define a per-type @code{%destructor} by listing a semantic type -among @var{symbols}. +tag among @var{symbols}. In that case, the parser will invoke this @var{code} whenever it discards any -grammar symbol that has that semantic type unless that symbol has its own +grammar symbol that has that semantic type tag unless that symbol has its own per-symbol @code{%destructor}. 
-Finally, you may define a default @code{%destructor} by placing -@code{%symbol-default} in the @var{symbols} list of exactly one -@code{%destructor} declaration in your grammar file. -In that case, the parser will invoke the associated @var{code} whenever it -discards any user-defined grammar symbol for which there is no per-type or -per-symbol @code{%destructor}. +Finally, you can define two different kinds of default @code{%destructor}s. +(These default forms are experimental. +More user feedback will help to determine whether they should become permanent +features.) +You can place each of @code{<*>} and @code{<>} in the @var{symbols} list of +exactly one @code{%destructor} declaration in your grammar file. +The parser will invoke the @var{code} associated with one of these whenever it +discards any user-defined grammar symbol that has no per-symbol and no per-type +@code{%destructor}. +The parser uses the @var{code} for @code{<*>} in the case of such a grammar +symbol for which you have formally declared a semantic type tag (@code{%type} +counts as such a declaration, but @code{$$} does not). +The parser uses the @var{code} for @code{<>} in the case of such a grammar +symbol that has no declared semantic type tag. @end deffn @noindent -For instance: +For example: @smallexample %union @{ char *string; @} @@ -4040,35 +4357,52 @@ For instance: %union @{ char character; @} %token CHR %type chr -%destructor @{ free ($$); @} %symbol-default -%destructor @{ free ($$); printf ("%d", @@$.first_line); @} STRING1 string1 +%token TAGLESS + %destructor @{ @} +%destructor @{ free ($$); @} <*> +%destructor @{ free ($$); printf ("%d", @@$.first_line); @} STRING1 string1 +%destructor @{ printf ("Discarding tagless symbol.\n"); @} <> @end smallexample @noindent guarantees that, when the parser discards any user-defined symbol that has a semantic type tag other than @code{}, it passes its semantic value -to @code{free}. +to @code{free} by default. 
However, when the parser discards a @code{STRING1} or a @code{string1}, it also prints its line number to @code{stdout}. It performs only the second @code{%destructor} in this case, so it invokes @code{free} only once. - -Notice that a Bison-generated parser invokes the default @code{%destructor} -only for user-defined as opposed to Bison-defined symbols. -For example, the parser will not invoke it for the special Bison-defined -symbols @code{$accept}, @code{$undefined}, or @code{$end} (@pxref{Table of -Symbols, ,Bison Symbols}), none of which you can reference in your grammar. -It also will not invoke it for the @code{error} token (@pxref{Table of Symbols, -,error}), which is always defined by Bison regardless of whether you reference -it in your grammar. -However, it will invoke it for the end token (token 0) if you redefine it from -@code{$end} to, for example, @code{END}: +Finally, the parser merely prints a message whenever it discards any symbol, +such as @code{TAGLESS}, that has no semantic type tag. + +A Bison-generated parser invokes the default @code{%destructor}s only for +user-defined as opposed to Bison-defined symbols. +For example, the parser will not invoke either kind of default +@code{%destructor} for the special Bison-defined symbols @code{$accept}, +@code{$undefined}, or @code{$end} (@pxref{Table of Symbols, ,Bison Symbols}), +none of which you can reference in your grammar. +It also will not invoke either for the @code{error} token (@pxref{Table of +Symbols, ,error}), which is always defined by Bison regardless of whether you +reference it in your grammar. +However, it may invoke one of them for the end token (token 0) if you +redefine it from @code{$end} to, for example, @code{END}: @smallexample %token END 0 @end smallexample +@cindex actions in mid-rule +@cindex mid-rule actions +Finally, Bison will never invoke a @code{%destructor} for an unreferenced +mid-rule semantic value (@pxref{Mid-Rule Actions,,Actions in Mid-Rule}). 
+That is, Bison does not consider a mid-rule to have a semantic value if you do +not reference @code{$$} in the mid-rule's action or @code{$@var{n}} (where +@var{n} is the RHS symbol position of the mid-rule) in any later action in that +rule. +However, if you do reference either, the Bison-generated parser will invoke the +@code{<>} @code{%destructor} whenever it discards the mid-rule symbol. + @ignore @noindent In the future, it may be possible to redefine the @code{error} token as a @@ -4097,7 +4431,7 @@ The parser can @dfn{return immediately} because of an explicit call to @code{YYABORT} or @code{YYACCEPT}, or failed error recovery, or memory exhaustion. -Right-hand size symbols of a rule that explicitly triggers a syntax +Right-hand side symbols of a rule that explicitly triggers a syntax error via @code{YYERROR} are not discarded automatically. As a rule of thumb, destructors are invoked only when user actions cannot manage the memory. @@ -4182,7 +4516,7 @@ may override this restriction with the @code{%start} declaration as follows: @subsection A Pure (Reentrant) Parser @cindex reentrant parser @cindex pure parser -@findex %pure-parser +@findex %define api.pure A @dfn{reentrant} program is one which does not alter in the course of execution; in other words, it consists entirely of @dfn{pure} (read-only) @@ -4198,19 +4532,20 @@ statically allocated variables for communication with @code{yylex}, including @code{yylval} and @code{yylloc}.) Alternatively, you can generate a pure, reentrant parser. The Bison -declaration @code{%pure-parser} says that you want the parser to be +declaration @code{%define api.pure} says that you want the parser to be reentrant. It looks like this: @example -%pure-parser +%define api.pure @end example The result is that the communication variables @code{yylval} and @code{yylloc} become local variables in @code{yyparse}, and a different calling convention is used for the lexical analyzer function @code{yylex}. 
@xref{Pure Calling, ,Calling Conventions for Pure -Parsers}, for the details of this. The variable @code{yynerrs} also -becomes local in @code{yyparse} (@pxref{Error Reporting, ,The Error +Parsers}, for the details of this. The variable @code{yynerrs} +becomes local in @code{yyparse} in pull mode but it becomes a member +of yypstate in push mode. (@pxref{Error Reporting, ,The Error Reporting Function @code{yyerror}}). The convention for calling @code{yyparse} itself is unchanged. @@ -4218,6 +4553,116 @@ Whether the parser is pure has nothing to do with the grammar rules. You can generate either a pure parser or a nonreentrant parser from any valid grammar. +@node Push Decl +@subsection A Push Parser +@cindex push parser +@cindex push parser +@findex %define api.push_pull + +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) + +A pull parser is called once and it takes control until all its input +is completely parsed. A push parser, on the other hand, is called +each time a new token is made available. + +A push parser is typically useful when the parser is part of a +main event loop in the client's application. This is typically +a requirement of a GUI, when the main event loop needs to be triggered +within a certain time period. + +Normally, Bison generates a pull parser. +The following Bison declaration says that you want the parser to be a push +parser (@pxref{Decl Summary,,%define api.push_pull}): + +@example +%define api.push_pull "push" +@end example + +In almost all cases, you want to ensure that your push parser is also +a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). The only +time you should create an impure push parser is to have backwards +compatibility with the impure Yacc pull mode interface. 
Unless you know +what you are doing, your declarations should look like this: + +@example +%define api.pure +%define api.push_pull "push" +@end example + +There is a major notable functional difference between the pure push parser +and the impure push parser. It is acceptable for a pure push parser to have +many parser instances, of the same type of parser, in memory at the same time. +An impure push parser should only use one parser at a time. + +When a push parser is selected, Bison will generate some new symbols in +the generated parser. @code{yypstate} is a structure that the generated +parser uses to store the parser's state. @code{yypstate_new} is the +function that will create a new parser instance. @code{yypstate_delete} +will free the resources associated with the corresponding parser instance. +Finally, @code{yypush_parse} is the function that should be called whenever a +token is available to provide the parser. A trivial example +of using a pure push parser would look like this: + +@example +int status; +yypstate *ps = yypstate_new (); +do @{ + status = yypush_parse (ps, yylex (), NULL); +@} while (status == YYPUSH_MORE); +yypstate_delete (ps); +@end example + +If the user decided to use an impure push parser, a few things about +the generated parser will change. The @code{yychar} variable becomes +a global variable instead of a variable in the @code{yypush_parse} function. +For this reason, the signature of the @code{yypush_parse} function is +changed to remove the token as a parameter. A nonreentrant push parser +example would thus look like this: + +@example +extern int yychar; +int status; +yypstate *ps = yypstate_new (); +do @{ + yychar = yylex (); + status = yypush_parse (ps); +@} while (status == YYPUSH_MORE); +yypstate_delete (ps); +@end example + +That's it. Notice the next token is put into the global variable @code{yychar} +for use by the next invocation of the @code{yypush_parse} function. 
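The control flow of the pure push loop shown above can be tried out without a generated parser. The following is only a minimal sketch under stated assumptions: @code{toy_pstate}, @code{toy_pstate_new}, @code{toy_pstate_delete}, @code{toy_push_parse}, @code{toy_run}, and the @code{TOY_} constants are hypothetical stand-ins that mimic the shape of @code{yypstate}, @code{yypstate_new}, @code{yypstate_delete}, @code{yypush_parse}, and @code{YYPUSH_MORE}. Bison generates none of them. The sketch illustrates that the caller owns the loop and hands over one token per call, and that all parser state lives in the instance, so several instances could coexist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the generated push interface.  These names
   and values are assumptions made for illustration only.  */
enum { TOY_EOF = 0, TOY_NUM = 258 };
enum { TOY_DONE = 0, TOY_ERROR = 1, TOY_PUSH_MORE = 4 };

/* All parser state lives here, as it does in a pure push parser.  */
typedef struct toy_pstate
{
  int count; /* Number of TOY_NUM tokens consumed so far.  */
  long sum;  /* Sum of their semantic values.  */
} toy_pstate;

static toy_pstate *
toy_pstate_new (void)
{
  return calloc (1, sizeof (toy_pstate));
}

static void
toy_pstate_delete (toy_pstate *ps)
{
  free (ps);
}

/* Accept the toy "grammar" NUM* EOF, consuming one token per call.  */
static int
toy_push_parse (toy_pstate *ps, int token, long value)
{
  assert (ps != NULL);
  switch (token)
    {
    case TOY_NUM:
      ps->count += 1;
      ps->sum += value;
      return TOY_PUSH_MORE; /* Ask the caller for another token.  */
    case TOY_EOF:
      return TOY_DONE;
    default:
      return TOY_ERROR;
    }
}

/* Drive the parser the way the caller of a push parser would:
   fetch a token, push it, and repeat while more input is wanted.  */
int
toy_run (const long *input, size_t n, long *sum)
{
  size_t i = 0;
  int status;
  toy_pstate *ps = toy_pstate_new ();
  if (!ps)
    return TOY_ERROR;
  do
    {
      int token = i < n ? TOY_NUM : TOY_EOF;
      long value = i < n ? input[i] : 0;
      i += 1;
      status = toy_push_parse (ps, token, value);
    }
  while (status == TOY_PUSH_MORE);
  *sum = ps->sum;
  toy_pstate_delete (ps);
  return status;
}
```

In an event-driven program, the body of the @code{do} loop would typically run once per event rather than in a tight loop. That is the case a push parser is meant to serve.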
+ +Bison also supports both the push parser interface along with the pull parser +interface in the same generated parser. In order to get this functionality, +you should replace the @code{%define api.push_pull "push"} declaration with the +@code{%define api.push_pull "both"} declaration. Doing this will create all of +the symbols mentioned earlier along with the two extra symbols, @code{yyparse} +and @code{yypull_parse}. @code{yyparse} can be used exactly as it normally +would be used. However, the user should note that it is implemented in the +generated parser by calling @code{yypull_parse}. +This makes the @code{yyparse} function that is generated with the +@code{%define api.push_pull "both"} declaration slower than the normal +@code{yyparse} function. If the user +calls the @code{yypull_parse} function it will parse the rest of the input +stream. It is possible to @code{yypush_parse} tokens to select a subgrammar +and then @code{yypull_parse} the rest of the input stream. If you would like +to switch back and forth between parsing styles, you would have to +write your own @code{yypull_parse} function that knows when to quit looking +for input. An example of using the @code{yypull_parse} function would look +like this: + +@example +yypstate *ps = yypstate_new (); +yypull_parse (ps); /* Will call the lexer */ +yypstate_delete (ps); +@end example + +Adding the @code{%define api.pure} declaration does exactly the same thing to +the generated parser with @code{%define api.push_pull "both"} as it did for +@code{%define api.push_pull "push"}. + @node Decl Summary @subsection Bison Declaration Summary @cindex Bison declaration summary @@ -4280,11 +4725,269 @@ Declare the expected number of shift-reduce conflicts In order to change the behavior of @command{bison}, use the following directives: +@deffn {Directive} %code @{@var{code}@} +@findex %code +This is the unqualified form of the @code{%code} directive.
+It inserts @var{code} verbatim at a language-dependent default location in the +output@footnote{The default location is actually skeleton-dependent; + writers of non-standard skeletons however should choose the default location + consistently with the behavior of the standard Bison skeletons.}. + +@cindex Prologue +For C/C++, the default location is the parser source code +file after the usual contents of the parser header file. +Thus, @code{%code} replaces the traditional Yacc prologue, +@code{%@{@var{code}%@}}, for most purposes. +For a detailed discussion, see @ref{Prologue Alternatives}. + +For Java, the default location is inside the parser class. +@end deffn + +@deffn {Directive} %code @var{qualifier} @{@var{code}@} +This is the qualified form of the @code{%code} directive. +If you need to specify location-sensitive verbatim @var{code} that does not +belong at the default location selected by the unqualified @code{%code} form, +use this form instead. + +@var{qualifier} identifies the purpose of @var{code} and thus the location(s) +where Bison should generate it. +Not all values of @var{qualifier} are available for all target languages: + +@itemize @bullet +@item requires +@findex %code requires + +@itemize @bullet +@item Language(s): C, C++ + +@item Purpose: This is the best place to write dependency code required for +@code{YYSTYPE} and @code{YYLTYPE}. +In other words, it's the best place to define types referenced in @code{%union} +directives, and it's the best place to override Bison's default @code{YYSTYPE} +and @code{YYLTYPE} definitions. + +@item Location(s): The parser header file and the parser source code file +before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE} definitions. +@end itemize + +@item provides +@findex %code provides + +@itemize @bullet +@item Language(s): C, C++ + +@item Purpose: This is the best place to write additional definitions and +declarations that should be provided to other modules. 
+ +@item Location(s): The parser header file and the parser source code file after +the Bison-generated @code{YYSTYPE}, @code{YYLTYPE}, and token definitions. +@end itemize + +@item top +@findex %code top + +@itemize @bullet +@item Language(s): C, C++ + +@item Purpose: The unqualified @code{%code} or @code{%code requires} should +usually be more appropriate than @code{%code top}. +However, occasionally it is necessary to insert code much nearer the top of the +parser source code file. +For example: + +@smallexample +%code top @{ + #define _GNU_SOURCE + #include +@} +@end smallexample + +@item Location(s): Near the top of the parser source code file. +@end itemize + +@item imports +@findex %code imports + +@itemize @bullet +@item Language(s): Java + +@item Purpose: This is the best place to write Java import directives. + +@item Location(s): The parser Java file after any Java package directive and +before any class definitions. +@end itemize +@end itemize + +@cindex Prologue +For a detailed discussion of how to use @code{%code} in place of the +traditional Yacc prologue for C/C++, see @ref{Prologue Alternatives}. +@end deffn + @deffn {Directive} %debug In the parser file, define the macro @code{YYDEBUG} to 1 if it is not already defined, so that the debugging facilities are compiled. -@end deffn @xref{Tracing, ,Tracing Your Parser}. +@end deffn + +@deffn {Directive} %define @var{variable} +@deffnx {Directive} %define @var{variable} "@var{value}" +Define a variable to adjust Bison's behavior. +The possible choices for @var{variable}, as well as their meanings, depend on +the selected target language and/or the parser skeleton (@pxref{Decl +Summary,,%language}, @pxref{Decl Summary,,%skeleton}). + +Bison will warn if a @var{variable} is defined multiple times. + +Omitting @code{"@var{value}"} is always equivalent to specifying it as +@code{""}. + +Some @var{variable}s may be used as Booleans. 
+In this case, Bison will complain if the variable definition does not meet one +of the following four conditions: + +@enumerate +@item @code{"@var{value}"} is @code{"true"} + +@item @code{"@var{value}"} is omitted (or is @code{""}). +This is equivalent to @code{"true"}. + +@item @code{"@var{value}"} is @code{"false"}. + +@item @var{variable} is never defined. +In this case, Bison selects a default value, which may depend on the selected +target language and/or parser skeleton. +@end enumerate + +Some of the accepted @var{variable}s are: + +@itemize @bullet +@item api.pure +@findex %define api.pure + +@itemize @bullet +@item Language(s): C + +@item Purpose: Request a pure (reentrant) parser program. +@xref{Pure Decl, ,A Pure (Reentrant) Parser}. + +@item Accepted Values: Boolean + +@item Default Value: @code{"false"} +@end itemize + +@item api.push_pull +@findex %define api.push_pull + +@itemize @bullet +@item Language(s): C (LALR(1) only) + +@item Purpose: Requests a pull parser, a push parser, or both. +@xref{Push Decl, ,A Push Parser}. +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) + +@item Accepted Values: @code{"pull"}, @code{"push"}, @code{"both"} + +@item Default Value: @code{"pull"} +@end itemize + +@item lr.keep_unreachable_states +@findex %define lr.keep_unreachable_states + +@itemize @bullet +@item Language(s): all + +@item Purpose: Requests that Bison allow unreachable parser states to remain in +the parser tables. +Bison considers a state to be unreachable if there exists no sequence of +transitions from the start state to that state. +A state can become unreachable during conflict resolution if Bison disables a +shift action leading to it from a predecessor state. +Keeping unreachable states is sometimes useful for analysis purposes, but they +are useless in the generated parser. 
+ +@item Accepted Values: Boolean + +@item Default Value: @code{"false"} + +@item Caveats: + +@itemize @bullet + +@item Unreachable states may contain conflicts and may use rules not used in +any other state. +Thus, keeping unreachable states may induce warnings that are irrelevant to +your parser's behavior, and it may eliminate warnings that are relevant. +Of course, the change in warnings may actually be relevant to a parser table +analysis that wants to keep unreachable states, so this behavior will likely +remain in future Bison releases. + +@item While Bison is able to remove unreachable states, it is not guaranteed to +remove other kinds of useless states. +Specifically, when Bison disables reduce actions during conflict resolution, +some goto actions may become useless, and thus some additional states may +become useless. +If Bison were to compute which goto actions were useless and then disable those +actions, it could identify such states as unreachable and then remove those +states. +However, Bison does not compute which goto actions are useless. +@end itemize +@end itemize + +@item namespace +@findex %define namespace + +@itemize +@item Language(s): C++ + +@item Purpose: Specifies the namespace for the parser class. +For example, if you specify: + +@smallexample +%define namespace "foo::bar" +@end smallexample + +Bison uses @code{foo::bar} verbatim in references such as: + +@smallexample +foo::bar::parser::semantic_type +@end smallexample + +However, to open a namespace, Bison removes any leading @code{::} and then +splits on any remaining occurrences: + +@smallexample +namespace foo @{ namespace bar @{ + class position; + class location; +@} @} +@end smallexample + +@item Accepted Values: Any absolute or relative C++ namespace reference without +a trailing @code{"::"}. +For example, @code{"foo"} or @code{"::foo::bar"}. + +@item Default Value: The value specified by @code{%name-prefix}, which defaults +to @code{yy}.
+This usage of @code{%name-prefix} is for backward compatibility and can be +confusing since @code{%name-prefix} also specifies the textual prefix for the +lexical analyzer function. +Thus, if you specify @code{%name-prefix}, it is best to also specify +@code{%define namespace} so that @code{%name-prefix} @emph{only} affects the +lexical analyzer function. +For example, if you specify: + +@smallexample +%define namespace "foo" +%name-prefix "bar::" +@end smallexample + +The parser namespace is @code{foo} and @code{yylex} is referenced as +@code{bar::lex}. +@end itemize +@end itemize + +@end deffn @deffn {Directive} %defines Write a header file containing macro definitions for the token type @@ -4319,11 +5022,15 @@ typically needs to be able to refer to the above-mentioned declarations and to the token type codes. @xref{Token Values, ,Semantic Values of Tokens}. -@findex %start-header -@findex %end-header -If you have declared @code{%start-header} or @code{%end-header}, the output +@findex %code requires +@findex %code provides +If you have declared @code{%code requires} or @code{%code provides}, the output header also contains their code. -@xref{Table of Symbols, ,%start-header}. +@xref{Decl Summary, ,%code}. +@end deffn + +@deffn {Directive} %defines @var{defines-file} +Same as above, but save in the file @var{defines-file}. @end deffn @deffn {Directive} %destructor @@ -4331,11 +5038,20 @@ Specify how the parser should reclaim the memory associated to discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}. @end deffn -@deffn {Directive} %file-prefix="@var{prefix}" +@deffn {Directive} %file-prefix "@var{prefix}" Specify a prefix to use for all Bison output file names. The names are chosen as if the input file were named @file{@var{prefix}.y}. @end deffn +@deffn {Directive} %language "@var{language}" +Specify the programming language for the generated parser. Currently +supported languages include C, C++, and Java. 
+@var{language} is case-insensitive. + +This directive is experimental and its effect may be modified in future +releases. +@end deffn + @deffn {Directive} %locations Generate the code processing the locations (@pxref{Action Features, ,Special Features for Use in Actions}). This mode is enabled as soon as @@ -4344,16 +5060,19 @@ grammar does not use it, using @samp{%locations} allows for more accurate syntax error messages. @end deffn -@deffn {Directive} %name-prefix="@var{prefix}" +@deffn {Directive} %name-prefix "@var{prefix}" Rename the external symbols used in the parser so that they start with @var{prefix} instead of @samp{yy}. The precise list of symbols renamed in C parsers is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yychar}, @code{yydebug}, and -(if locations are used) @code{yylloc}. For example, if you use -@samp{%name-prefix="c_"}, the names become @code{c_parse}, @code{c_lex}, -and so on. In C++ parsers, it is only the surrounding namespace which is -named @var{prefix} instead of @samp{yy}. +(if locations are used) @code{yylloc}. If you use a push parser, +@code{yypush_parse}, @code{yypull_parse}, @code{yypstate}, +@code{yypstate_new} and @code{yypstate_delete} will +also be renamed. For example, if you use @samp{%name-prefix "c_"}, the +names become @code{c_parse}, @code{c_lex}, and so on. +For C++ parsers, see the @code{%define namespace} documentation in this +section. @xref{Multiple Parsers, ,Multiple Parsers in the Same Program}. @end deffn @@ -4365,16 +5084,6 @@ Precedence}). @end deffn @end ifset -@deffn {Directive} %no-parser -Do not include any C code in the parser file; generate tables only. The -parser file contains just @code{#define} directives and static variable -declarations. - -This option also tells Bison to write the C code for the grammar actions -into a file named @file{@var{file}.act}, in the form of a -brace-surrounded body fit for a @code{switch} statement. 
-@end deffn - @deffn {Directive} %no-lines Don't generate any @code{#line} preprocessor commands in the parser file. Ordinarily Bison writes these commands in the parser file so that @@ -4384,13 +5093,13 @@ associate errors with the parser file, treating it an independent source file in its own right. @end deffn -@deffn {Directive} %output="@var{file}" +@deffn {Directive} %output "@var{file}" Specify @var{file} for the parser file. @end deffn @deffn {Directive} %pure-parser -Request a pure (reentrant) parser program (@pxref{Pure Decl, ,A Pure -(Reentrant) Parser}). +Deprecated version of @code{%define api.pure} (@pxref{Decl Summary, ,%define}), +for which Bison is more careful to warn about unreasonable usage. @end deffn @deffn {Directive} %require "@var{version}" @@ -4398,6 +5107,21 @@ Require version @var{version} or higher of Bison. @xref{Require Decl, , Require a Version of Bison}. @end deffn +@deffn {Directive} %skeleton "@var{file}" +Specify the skeleton to use. + +@c You probably don't need this option unless you are developing Bison. +@c You should use @code{%language} if you want to specify the skeleton for a +@c different language, because it is clearer and because it will always choose the +@c correct skeleton for non-deterministic or push parsers. + +If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton +file in the Bison installation directory. +If it does, @var{file} is an absolute file name or a file name relative to the +directory of the grammar file. +This is similar to how most shells resolve commands. +@end deffn + @deffn {Directive} %token-table Generate an array of token names in the parser file. The name of the array is @code{yytname}; @code{yytname[@var{i}]} is the name of the @@ -4461,8 +5185,11 @@ names that do not conflict. The precise list of symbols renamed is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yylloc}, -@code{yychar} and @code{yydebug}. 
For example, if you use @samp{-p c}, -the names become @code{cparse}, @code{clex}, and so on. +@code{yychar} and @code{yydebug}. If you use a push parser, +@code{yypush_parse}, @code{yypull_parse}, @code{yypstate}, +@code{yypstate_new} and @code{yypstate_delete} will also be renamed. +For example, if you use @samp{-p c}, the names become @code{cparse}, +@code{clex}, and so on. @strong{All the other variables and macros associated with Bison are not renamed.} These others are not global; there is no conflict if the same @@ -4490,13 +5217,17 @@ identifier (aside from those in this manual) in an action or in epilogue in the grammar file, you are likely to run into trouble. @menu -* Parser Function:: How to call @code{yyparse} and what it returns. -* Lexical:: You must supply a function @code{yylex} - which reads tokens. -* Error Reporting:: You must supply a function @code{yyerror}. -* Action Features:: Special features for use in actions. -* Internationalization:: How to let the parser speak in the user's - native language. +* Parser Function:: How to call @code{yyparse} and what it returns. +* Push Parser Function:: How to call @code{yypush_parse} and what it returns. +* Pull Parser Function:: How to call @code{yypull_parse} and what it returns. +* Parser Create Function:: How to call @code{yypstate_new} and what it returns. +* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns. +* Lexical:: You must supply a function @code{yylex} + which reads tokens. +* Error Reporting:: You must supply a function @code{yyerror}. +* Action Features:: Special features for use in actions. +* Internationalization:: How to let the parser speak in the user's + native language. 
@end menu @node Parser Function @@ -4566,13 +5297,82 @@ Then call the parser like this: @} @end example -@noindent -In the grammar actions, use expressions like this to refer to the data: +@noindent +In the grammar actions, use expressions like this to refer to the data: + +@example +exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @} +@end example + +@node Push Parser Function +@section The Push Parser Function @code{yypush_parse} +@findex yypush_parse + +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) + +You call the function @code{yypush_parse} to parse a single token. This +function is available if either the @code{%define api.push_pull "push"} or +@code{%define api.push_pull "both"} declaration is used. +@xref{Push Decl, ,A Push Parser}. + +@deftypefun int yypush_parse (yypstate *yyps) +The value returned by @code{yypush_parse} is the same as for @code{yyparse}, with +one exception: @code{yypush_parse} returns @code{YYPUSH_MORE} if more input +is required to finish parsing the grammar. +@end deftypefun + +@node Pull Parser Function +@section The Pull Parser Function @code{yypull_parse} +@findex yypull_parse + +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) + +You call the function @code{yypull_parse} to parse the rest of the input +stream. This function is available if the @code{%define api.push_pull "both"} +declaration is used. +@xref{Push Decl, ,A Push Parser}. + +@deftypefun int yypull_parse (yypstate *yyps) +The value returned by @code{yypull_parse} is the same as for @code{yyparse}. +@end deftypefun + +@node Parser Create Function +@section The Parser Create Function @code{yypstate_new} +@findex yypstate_new + +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) + +You call the function @code{yypstate_new} to create a new parser instance.
+This function is available if either the @code{%define api.push_pull "push"} or +@code{%define api.push_pull "both"} declaration is used. +@xref{Push Decl, ,A Push Parser}. -@example -exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @} -@end example +@deftypefun yypstate *yypstate_new (void) +The function will return a valid parser instance if there was memory available +or 0 if no memory was available. +In impure mode, it will also return 0 if a parser instance is currently +allocated. +@end deftypefun + +@node Parser Delete Function +@section The Parser Delete Function @code{yypstate_delete} +@findex yypstate_delete + +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) +You call the function @code{yypstate_delete} to delete a parser instance. +This function is available if either the @code{%define api.push_pull "push"} or +@code{%define api.push_pull "both"} declaration is used. +@xref{Push Decl, ,A Push Parser}. + +@deftypefun void yypstate_delete (yypstate *yyps) +This function will reclaim the memory associated with a parser instance. +After this call, you should no longer attempt to use the parser instance. +@end deftypefun @node Lexical @section The Lexical Analyzer Function @code{yylex} @@ -4594,13 +5394,13 @@ that need it. @xref{Invocation, ,Invoking Bison}. @menu * Calling Convention:: How @code{yyparse} calls @code{yylex}. -* Token Values:: How @code{yylex} must return the semantic value - of the token it has read. -* Token Locations:: How @code{yylex} must return the text location - (line number, etc.) of the token, if the - actions want that. -* Pure Calling:: How the calling convention differs - in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). +* Token Values:: How @code{yylex} must return the semantic value + of the token it has read. +* Token Locations:: How @code{yylex} must return the text location + (line number, etc.) of the token, if the + actions want that.
+* Pure Calling:: How the calling convention differs in a pure parser + (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). @end menu @node Calling Convention @@ -4754,7 +5554,7 @@ The data type of @code{yylloc} has the name @code{YYLTYPE}. @node Pure Calling @subsection Calling Conventions for Pure Parsers -When you use the Bison declaration @code{%pure-parser} to request a +When you use the Bison declaration @code{%define api.pure} to request a pure, reentrant parser, the global communication variables @code{yylval} and @code{yylloc} cannot be used. (@xref{Pure Decl, ,A Pure (Reentrant) Parser}.) In such parsers the two global variables are replaced by @@ -4805,7 +5605,7 @@ int yylex (int *nastiness); int yyparse (int *nastiness, int *randomness); @end example -If @code{%pure-parser} is added: +If @code{%define api.pure} is added: @example int yylex (YYSTYPE *lvalp, int *nastiness); @@ -4813,7 +5613,7 @@ int yyparse (int *nastiness, int *randomness); @end example @noindent -and finally, if both @code{%pure-parser} and @code{%locations} are used: +and finally, if both @code{%define api.pure} and @code{%locations} are used: @example int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness); @@ -4879,7 +5679,7 @@ Obviously, in location tracking pure parsers, @code{yyerror} should have an access to the current location. This is indeed the case for the @acronym{GLR} parsers, but not for the Yacc parser, for historical reasons. I.e., if -@samp{%locations %pure-parser} is passed then the prototypes for +@samp{%locations %define api.pure} is passed then the prototypes for @code{yyerror} are: @example @@ -4897,13 +5697,14 @@ void yyerror (int *nastiness, char const *msg); /* GLR parsers. */ Finally, @acronym{GLR} and Yacc parsers share the same @code{yyerror} calling convention for absolutely pure parsers, i.e., when the calling convention of @code{yylex} @emph{and} the calling convention of -@code{%pure-parser} are pure. I.e.: +@code{%define api.pure} are pure. 
+I.e.: @example /* Location tracking. */ %locations /* Pure yylex. */ -%pure-parser +%define api.pure %lex-param @{int *nastiness@} /* Pure yyparse. */ %parse-param @{int *nastiness@} @@ -5242,7 +6043,7 @@ This kind of parser is known in the literature as a bottom-up parser. * Contextual Precedence:: When an operator's precedence depends on context. * Parser States:: The parser is a finite-state-machine with stack. * Reduce/Reduce:: When two rules are applicable in the same situation. -* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. +* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. * Generalized LR Parsing:: Parsing arbitrary context-free grammars. * Memory Management:: What happens when memory is exhausted. How to avoid it. @end menu @@ -6007,7 +6808,7 @@ that allows variable-length arrays. The default is 200. Do not allow @code{YYINITDEPTH} to be greater than @code{YYMAXDEPTH}. @c FIXME: C++ output. -Because of semantical differences between C and C++, the +Because of semantic differences between C and C++, the @acronym{LALR}(1) parsers in C produced by Bison cannot grow when compiled by C++ compilers. In this precise case (compiling a C parser as C++) you are suggested to grow @code{YYINITDEPTH}. The Bison maintainers hope to fix @@ -6366,8 +7167,7 @@ As documented elsewhere (@pxref{Algorithm, ,The Bison Parser Algorithm}) Bison parsers are @dfn{shift/reduce automata}. In some cases (much more frequent than one would hope), looking at this automaton is required to tune or simply fix a parser. Bison provides two different -representation of it, either textually or graphically (as a @acronym{VCG} -file). +representation of it, either textually or graphically (as a DOT file). 
The textual file is generated when the options @option{--report} or @option{--verbose} are specified, see @xref{Invocation, , Invoking @@ -6397,9 +7197,9 @@ useless: STR; @command{bison} reports: @example -calc.y: warning: 1 useless nonterminal and 1 useless rule -calc.y:11.1-7: warning: useless nonterminal: useless -calc.y:11.10-12: warning: useless rule: useless: STR +calc.y: warning: 1 nonterminal and 1 rule useless in grammar +calc.y:11.1-7: warning: nonterminal useless in grammar: useless +calc.y:11.10-12: warning: rule useless in grammar: useless: STR calc.y: conflicts: 7 shift/reduce @end example @@ -6438,17 +7238,17 @@ State 11 conflicts: 4 shift/reduce The next section reports useless tokens, nonterminal and rules. Useless nonterminals and rules are removed in order to produce a smaller parser, but useless tokens are preserved, since they might be used by the -scanner (note the difference between ``useless'' and ``not used'' +scanner (note the difference between ``useless'' and ``unused'' below): @example -Useless nonterminals: +Nonterminals useless in grammar: useless -Terminals which are not used: +Terminals unused in grammar: STR -Useless rules: +Rules useless in grammar: #6 useless: STR; @end example @@ -6686,7 +7486,7 @@ with some set of possible lookahead tokens. When run with @example state 8 - exp -> exp . '+' exp [$, '+', '-', '/'] (rule 1) + exp -> exp . '+' exp (rule 1) exp -> exp '+' exp . [$, '+', '-', '/'] (rule 1) exp -> exp . '-' exp (rule 2) exp -> exp . '*' exp (rule 3) @@ -6795,7 +7595,7 @@ always possible. The trace facility outputs messages with macro calls of the form @code{YYFPRINTF (stderr, @var{format}, @var{args})} where -@var{format} and @var{args} are the usual @code{printf} format and +@var{format} and @var{args} are the usual @code{printf} format and variadic arguments. 
If you define @code{YYDEBUG} to a nonzero value but do not define @code{YYFPRINTF}, @code{<stdio.h>} is automatically included and @code{YYFPRINTF} is defined to @code{fprintf}. @@ -6847,7 +7647,7 @@ standard I/O stream, the numeric code for the token type, and the token value (from @code{yylval}). Here is an example of @code{YYPRINT} suitable for the multi-function -calculator (@pxref{Mfcalc Decl, ,Declarations for @code{mfcalc}}): +calculator (@pxref{Mfcalc Declarations, ,Declarations for @code{mfcalc}}): @smallexample %@{ @@ -6948,6 +7748,9 @@ Print the version number of Bison and exit. @item --print-localedir Print the name of the directory containing locale-dependent data. +@item --print-datadir +Print the name of the directory containing skeletons and XSLT. + @item -y @itemx --yacc Act more like the traditional Yacc command. This can cause @@ -6972,29 +7775,74 @@ traditional Yacc grammars. If your grammar uses a Bison extension like @samp{%glr-parser}, Bison might not be Yacc-compatible even if this option is specified. +@item -W [@var{category}] +@itemx --warnings[=@var{category}] +Output warnings falling in @var{category}. @var{category} can be one +of: +@table @code +@item midrule-values +Warn about mid-rule values that are set but not used within any of the actions +of the parent rule. +For example, warn about unused @code{$2} in: + +@example +exp: '1' @{ $$ = 1; @} '+' exp @{ $$ = $1 + $4; @}; +@end example + +Also warn about mid-rule values that are used but not set. +For example, warn about unset @code{$$} in the mid-rule action in: + +@example + exp: '1' @{ $1 = 1; @} '+' exp @{ $$ = $2 + $4; @}; +@end example + +These warnings are not enabled by default since they sometimes prove to +be false alarms in existing grammars employing the Yacc constructs +@code{$0} or @code{$-@var{n}} (where @var{n} is some positive integer). + + +@item yacc +Incompatibilities with @acronym{POSIX} Yacc. + +@item all +All the warnings. +@item none +Turn off all the warnings.
+@item error +Treat warnings as errors. +@end table + +A category can be turned off by prefixing its name with @samp{no-}. For +instance, @option{-Wno-syntax} will hide the warnings about unused +variables. @end table @noindent Tuning the parser: @table @option -@item -S @var{file} -@itemx --skeleton=@var{file} -Specify the skeleton to use. You probably don't need this option unless -you are developing Bison. - @item -t @itemx --debug In the parser file, define the macro @code{YYDEBUG} to 1 if it is not already defined, so that the debugging facilities are compiled. @xref{Tracing, ,Tracing Your Parser}. +@item -L @var{language} +@itemx --language=@var{language} +Specify the programming language for the generated parser, as if +@code{%language} was specified (@pxref{Decl Summary, , Bison Declaration +Summary}). Currently supported languages include C, C++, and Java. +@var{language} is case-insensitive. + +This option is experimental and its effect may be modified in future +releases. + @item --locations Pretend that @code{%locations} was specified. @xref{Decl Summary}. @item -p @var{prefix} @itemx --name-prefix=@var{prefix} -Pretend that @code{%name-prefix="@var{prefix}"} was specified. +Pretend that @code{%name-prefix "@var{prefix}"} was specified. @xref{Decl Summary}. @item -l @@ -7005,9 +7853,21 @@ and debuggers will associate errors with your source file, the grammar file. This option causes them to associate errors with the parser file, treating it as an independent source file in its own right. -@item -n -@itemx --no-parser -Pretend that @code{%no-parser} was specified. @xref{Decl Summary}. +@item -S @var{file} +@itemx --skeleton=@var{file} +Specify the skeleton to use, similar to @code{%skeleton} +(@pxref{Decl Summary, , Bison Declaration Summary}). + +@c You probably don't need this option unless you are developing Bison. 
+@c You should use @option{--language} if you want to specify the skeleton for a +@c different language, because it is clearer and because it will always +@c choose the correct skeleton for non-deterministic or push parsers. + +If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton +file in the Bison installation directory. +If it does, @var{file} is an absolute file name or a file name relative to the +current working directory. +This is similar to how most shells resolve commands. @item -k @itemx --token-table @@ -7018,14 +7878,15 @@ Pretend that @code{%token-table} was specified. @xref{Decl Summary}. Adjust the output: @table @option -@item -d -@itemx --defines +@item --defines[=@var{file}] Pretend that @code{%defines} was specified, i.e., write an extra output file containing macro definitions for the token type names defined in the grammar, as well as a few other declarations. @xref{Decl Summary}. -@item --defines=@var{defines-file} -Same as above, but save in the file @var{defines-file}. +@item -d +This is the same as @code{--defines} except @code{-d} does not accept a +@var{file} argument since POSIX Yacc requires that @code{-d} can be bundled +with other short options. @item -b @var{file-prefix} @itemx --file-prefix=@var{prefix} @@ -7051,6 +7912,9 @@ Implies @code{state} and augments the description of the automaton with the full set of items for each state, instead of its core only. @end table +@item --report-file=@var{file} +Specify the @var{file} for the verbose description. + @item -v @itemx --verbose Pretend that @code{%verbose} was specified, i.e., write an extra output @@ -7064,16 +7928,23 @@ Specify the @var{file} for the parser file. The other output files' names are constructed from @var{file} as described under the @samp{-v} and @samp{-d} options. -@item -g -Output a @acronym{VCG} definition of the @acronym{LALR}(1) grammar -automaton computed by Bison. 
If the grammar file is @file{foo.y}, the -@acronym{VCG} output file will -be @file{foo.vcg}. - -@item --graph=@var{graph-file} -The behavior of @var{--graph} is the same than @samp{-g}. The only -difference is that it has an optional argument which is the name of -the output graph file. +@item -g[@var{file}] +@itemx --graph[=@var{file}] +Output a graphical representation of the @acronym{LALR}(1) grammar +automaton computed by Bison, in @uref{http://www.graphviz.org/, Graphviz} +@uref{http://www.graphviz.org/doc/info/lang.html, @acronym{DOT}} format. +@code{@var{file}} is optional. +If omitted and the grammar file is @file{foo.y}, the output file will be +@file{foo.dot}. + +@item -x[@var{file}] +@itemx --xml[=@var{file}] +Output an XML report of the @acronym{LALR}(1) automaton computed by Bison. +@code{@var{file}} is optional. +If omitted and the grammar file is @file{foo.y}, the output file will be +@file{foo.xml}. +(The current XML schema is experimental and may evolve. +More user feedback will help to stabilize it.) @end table @node Option Cross Key @@ -7085,20 +7956,7 @@ the corresponding short option. 
@multitable {@option{--defines=@var{defines-file}}} {@option{-b @var{file-prefix}XXX}} @headitem Long Option @tab Short Option -@item @option{--debug} @tab @option{-t} -@item @option{--defines=@var{defines-file}} @tab @option{-d} -@item @option{--file-prefix=@var{prefix}} @tab @option{-b @var{file-prefix}} -@item @option{--graph=@var{graph-file}} @tab @option{-d} -@item @option{--help} @tab @option{-h} -@item @option{--name-prefix=@var{prefix}} @tab @option{-p @var{name-prefix}} -@item @option{--no-lines} @tab @option{-l} -@item @option{--no-parser} @tab @option{-n} -@item @option{--output=@var{outfile}} @tab @option{-o @var{outfile}} -@item @option{--print-localedir} @tab -@item @option{--token-table} @tab @option{-k} -@item @option{--verbose} @tab @option{-v} -@item @option{--version} @tab @option{-V} -@item @option{--yacc} @tab @option{-y} +@include cross-options.texi @end multitable @node Yacc Library @@ -7129,12 +7987,12 @@ int yyparse (void); @c ================================================= C++ Bison -@node C++ Language Interface -@chapter C++ Language Interface +@node Other Languages +@chapter Parsers Written In Other Languages @menu * C++ Parsers:: The interface to generate C++ parser classes -* A Complete C++ Example:: Demonstrating their use +* Java Parsers:: The interface to generate Java parser classes @end menu @node C++ Parsers @@ -7146,6 +8004,7 @@ int yyparse (void); * C++ Location Values:: The position and location classes * C++ Parser Interface:: Instantiating and running the parser * C++ Scanner Interface:: Exchanges between yylex and parse +* A Complete C++ Example:: Demonstrating their use @end menu @node C++ Bison Interface @@ -7154,13 +8013,17 @@ int yyparse (void); @c - Always pure @c - initial action -The C++ parser @acronym{LALR}(1) skeleton is named @file{lalr1.cc}. To -select it, you may either pass the option @option{--skeleton=lalr1.cc} -to Bison, or include the directive @samp{%skeleton "lalr1.cc"} in the -grammar preamble. 
When run, @command{bison} will create several -entities in the @samp{yy} namespace. Use the @samp{%name-prefix} -directive to change the namespace name, see @ref{Decl Summary}. The -various classes are generated in the following files: +The C++ @acronym{LALR}(1) parser is selected using the skeleton directive, +@samp{%skeleton "lalr1.cc"}, or the synonymous command-line option +@option{--skeleton=lalr1.cc}. +@xref{Decl Summary}. + +When run, @command{bison} will create several entities in the @samp{yy} +namespace. +@findex %define namespace +Use the @samp{%define namespace} directive to change the namespace name, see +@ref{Decl Summary}. +The various classes are generated in the following files: @table @file @item position.hh @@ -7189,7 +8052,7 @@ for a complete and accurate documentation. @node C++ Semantic Values @subsection C++ Semantic Values @c - No objects in unions -@c - YSTYPE +@c - YYSTYPE @c - Printer and destructor The @code{%union} directive works as for C, see @ref{Union Decl, ,The @@ -7219,7 +8082,7 @@ Symbols}. @c - %locations @c - class Position @c - class Location -@c - %define "filename_type" "const symbol::Symbol" +@c - %define filename_type "const symbol::Symbol" When the directive @code{%locations} is used, the C++ parser supports location tracking, see @ref{Locations, , Locations Overview}. Two @@ -7231,7 +8094,7 @@ and a @code{location}, a range composed of a pair of The name of the file. It will always be handled as a pointer, the parser will never duplicate nor deallocate it. As an experimental feature you may change it to @samp{@var{type}*} using @samp{%define -"filename_type" "@var{type}"}. +filename_type "@var{type}"}. @end deftypemethod @deftypemethod {position} {unsigned int} line @@ -7295,17 +8158,25 @@ Move @code{begin} onto @code{end}. The output files @file{@var{output}.hh} and @file{@var{output}.cc} declare and define the parser class in the namespace @code{yy}. 
The class name defaults to @code{parser}, but may be changed using -@samp{%define "parser_class_name" "@var{name}"}. The interface of +@samp{%define parser_class_name "@var{name}"}. The interface of this class is detailed below. It can be extended using the @code{%parse-param} feature: its semantics is slightly changed since it describes an additional member of the parser class, and an additional argument for its constructor. -@defcv {Type} {parser} {semantic_value_type} -@defcvx {Type} {parser} {location_value_type} +@defcv {Type} {parser} {semantic_type} +@defcvx {Type} {parser} {location_type} The types for semantics value and locations. @end defcv +@defcv {Type} {parser} {token} +A structure that contains (only) the definition of the tokens as the +@code{yytokentype} enumeration. To refer to the token @code{FOO}, the +scanner should use @code{yy::parser::token::FOO}. The scanner can use +@samp{typedef yy::parser::token token;} to ``import'' the token enumeration +(@pxref{Calc++ Scanner}). +@end defcv + @deftypemethod {parser} {} parser (@var{type1} @var{arg1}, ...) Build a new parser object. There are no arguments by default, unless @samp{%parse-param @{@var{type1} @var{arg1}@}} was used. @@ -7342,9 +8213,9 @@ described by @var{m}. The parser invokes the scanner by calling @code{yylex}. Contrary to C parsers, C++ parsers are always pure: there is no point in using the -@code{%pure-parser} directive. Therefore the interface is as follows. +@code{%define api.pure} directive. Therefore the interface is as follows. -@deftypemethod {parser} {int} yylex (semantic_value_type& @var{yylval}, location_type& @var{yylloc}, @var{type1} @var{arg1}, ...) +@deftypemethod {parser} {int} yylex (semantic_type* @var{yylval}, location_type* @var{yylloc}, @var{type1} @var{arg1}, ...) Return the next token. Its type is the return value, its semantic value and location being @var{yylval} and @var{yylloc}. 
Invocations of @samp{%lex-param @{@var{type1} @var{arg1}@}} yield additional arguments. @@ -7352,7 +8223,7 @@ value and location being @var{yylval} and @var{yylloc}. Invocations of @node A Complete C++ Example -@section A Complete C++ Example +@subsection A Complete C++ Example This section demonstrates the use of a C++ parser with a simple but complete example. This example should be available on your system, @@ -7372,7 +8243,7 @@ actually easier to interface with. @end menu @node Calc++ --- C++ Calculator -@subsection Calc++ --- C++ Calculator +@subsubsection Calc++ --- C++ Calculator Of course the grammar is dedicated to arithmetics, a single expression, possibly preceded by variable assignments. An @@ -7387,7 +8258,7 @@ seven * seven @end example @node Calc++ Parsing Driver -@subsection Calc++ Parsing Driver +@subsubsection Calc++ Parsing Driver @c - An env @c - A place to store error messages @c - A place for the result @@ -7423,8 +8294,8 @@ factor both as follows. @comment file: calc++-driver.hh @example -// Announce to Flex the prototype we want for lexing function, ... -# define YY_DECL \ +// Tell Flex the lexer's prototype ... +# define YY_DECL \ yy::calcxx_parser::token_type \ yylex (yy::calcxx_parser::semantic_type* yylval, \ yy::calcxx_parser::location_type* yylloc, \ @@ -7468,8 +8339,8 @@ Similarly for the parser itself. @comment file: calc++-driver.hh @example - // Handling the parser. - void parse (const std::string& f); + // Run the parser. Return 0 on success. 
+ int parse (const std::string& f); std::string file; bool trace_parsing; @end example @@ -7510,15 +8381,16 @@ calcxx_driver::~calcxx_driver () @{ @} -void +int calcxx_driver::parse (const std::string &f) @{ file = f; scan_begin (); yy::calcxx_parser parser (*this); parser.set_debug_level (trace_parsing); - parser.parse (); + int res = parser.parse (); scan_end (); + return res; @} void @@ -7535,7 +8407,7 @@ calcxx_driver::error (const std::string& m) @end example @node Calc++ Parser -@subsection Calc++ Parser +@subsubsection Calc++ Parser The parser definition file @file{calc++-parser.yy} starts by asking for the C++ LALR(1) skeleton, the creation of the parser header file, and @@ -7546,24 +8418,24 @@ the grammar for. @comment file: calc++-parser.yy @example %skeleton "lalr1.cc" /* -*- C++ -*- */ -%require "2.1a" +%require "@value{VERSION}" %defines -%define "parser_class_name" "calcxx_parser" +%define parser_class_name "calcxx_parser" @end example @noindent -@findex %start-header +@findex %code requires Then come the declarations/inclusions needed to define the @code{%union}. Because the parser uses the parsing driver and reciprocally, both cannot include the header of the other. Because the driver's header needs detailed knowledge about the parser class (in particular its inner types), it is the parser's header which will simply use a forward declaration of the driver. -@xref{Table of Symbols, ,%start-header}. +@xref{Decl Summary, ,%code}. @comment file: calc++-parser.yy @example -%start-header @{ +%code requires @{ # include <string> class calcxx_driver; @} @@ -7583,7 +8455,7 @@ global variables. @noindent Then we request the location tracking feature, and initialize the -first location's file name. Afterwards new locations are computed +first location's file name. Afterward new locations are computed relatively to the previous locations: the file name will be automatically propagated. @@ -7622,13 +8494,13 @@ them.
@end example @noindent -@findex %after-header -The code between @samp{%after-header @{} and @samp{@}} is output in the +@findex %code +The code between @samp{%code @{} and @samp{@}} is output in the @file{*.cc} file; it needs detailed knowledge about the driver. @comment file: calc++-parser.yy @example -%after-header @{ +%code @{ # include "calc++-driver.hh" @} @end example @@ -7647,7 +8519,7 @@ avoid name clashes. %token ASSIGN ":=" %token <sval> IDENTIFIER "identifier" %token <ival> NUMBER "number" -%type <ival> exp "expression" +%type <ival> exp @end example @noindent @@ -7660,7 +8532,7 @@ To enable memory deallocation during error recovery, use %printer @{ debug_stream () << *$$; @} "identifier" %destructor @{ delete $$; @} "identifier" -%printer @{ debug_stream () << $$; @} "number" "expression" +%printer @{ debug_stream () << $$; @} <ival> @end example @noindent @@ -7675,7 +8547,9 @@ unit: assignments exp @{ driver.result = $2; @}; assignments: assignments assignment @{@} | /* Nothing. */ @{@}; -assignment: "identifier" ":=" exp @{ driver.variables[*$1] = $3; @}; +assignment: + "identifier" ":=" exp + @{ driver.variables[*$1] = $3; delete $1; @}; %left '+' '-'; %left '*' '/'; @@ -7683,7 +8557,7 @@ exp: exp '+' exp @{ $$ = $1 + $3; @} | exp '-' exp @{ $$ = $1 - $3; @} | exp '*' exp @{ $$ = $1 * $3; @} | exp '/' exp @{ $$ = $1 / $3; @} - | "identifier" @{ $$ = driver.variables[*$1]; @} + | "identifier" @{ $$ = driver.variables[*$1]; delete $1; @} | "number" @{ $$ = $1; @}; %% @end example @@ -7703,7 +8577,7 @@ yy::calcxx_parser::error (const yy::calcxx_parser::location_type& l, @end example @node Calc++ Scanner -@subsection Calc++ Scanner +@subsubsection Calc++ Scanner The Flex scanner first includes the driver declaration, then the parser's to get the set of defined tokens.
@example %@{ /* -*- C++ -*- */ # include -# include -# include +# include +# include # include # include "calc++-driver.hh" # include "calc++-parser.hh" @@ -7812,8 +8686,13 @@ void calcxx_driver::scan_begin () @{ yy_flex_debug = trace_scanning; - if (!(yyin = fopen (file.c_str (), "r"))) - error (std::string ("cannot open ") + file); + if (file == "-") + yyin = stdin; + else if (!(yyin = fopen (file.c_str (), "r"))) + @{ + error (std::string ("cannot open ") + file); + exit (1); + @} @} void @@ -7824,7 +8703,7 @@ calcxx_driver::scan_end () @end example @node Calc++ Top Level -@subsection Calc++ Top Level +@subsubsection Calc++ Top Level The top level file, @file{calc++.cc}, poses no problem. @@ -7842,14 +8721,578 @@ main (int argc, char *argv[]) driver.trace_parsing = true; else if (*argv == std::string ("-s")) driver.trace_scanning = true; - else - @{ - driver.parse (*argv); - std::cout << driver.result << std::endl; - @} + else if (!driver.parse (*argv)) + std::cout << driver.result << std::endl; @} @end example +@node Java Parsers +@section Java Parsers + +@menu +* Java Bison Interface:: Asking for Java parser generation +* Java Semantic Values:: %type and %token vs. Java +* Java Location Values:: The position and location classes +* Java Parser Interface:: Instantiating and running the parser +* Java Scanner Interface:: Specifying the scanner for the parser +* Java Action Features:: Special features for use in actions +* Java Differences:: Differences between C/C++ and Java Grammars +* Java Declarations Summary:: List of Bison declarations used with Java +@end menu + +@node Java Bison Interface +@subsection Java Bison Interface +@c - %language "Java" + +(The current Java interface is experimental and may evolve. +More user feedback will help to stabilize it.) + +The Java parser skeletons are selected using the @code{%language "Java"} +directive or the @option{-L java}/@option{--language=java} option. + +@c FIXME: Documented bug. 
+When generating a Java parser, @code{bison @var{basename}.y} will create +a single Java source file named @file{@var{basename}.java}. Using an +input file without a @file{.y} suffix is currently broken. The basename +of the output file can be changed by the @code{%file-prefix} directive +or the @option{-b}/@option{--file-prefix} option. The entire output file +name can be changed by the @code{%output} directive or the +@option{-o}/@option{--output} option. The output file contains a single +class for the parser. + +You can create documentation for generated parsers using Javadoc. + +Contrary to C parsers, Java parsers do not use global variables; the +state of the parser is always local to an instance of the parser class. +Therefore, all Java parsers are ``pure'', and the @code{%pure-parser} +and @code{%define api.pure} directives have no effect when used in +Java. + +Push parsers are currently unsupported in Java and @code{%define +api.push_pull} has no effect. + +@acronym{GLR} parsers are currently unsupported in Java. Do not use the +@code{%glr-parser} directive. + +No header file can be generated for Java parsers. Do not use the +@code{%defines} directive or the @option{-d}/@option{--defines} options. + +@c FIXME: Possible code change. +Currently, support for debugging and verbose errors is always compiled +in. Thus the @code{%debug} and @code{%token-table} directives and the +@option{-t}/@option{--debug} and @option{-k}/@option{--token-table} +options have no effect. This may change in the future to eliminate +unused code in the generated parser, so use @code{%debug} and +@code{%error-verbose} explicitly if needed. Also, in the future the +@code{%token-table} directive might enable a public interface to +access the token names and codes. + +@node Java Semantic Values +@subsection Java Semantic Values +@c - No %union, specify type in %type/%token. +@c - YYSTYPE +@c - Printer and destructor + +There is no @code{%union} directive in Java parsers.
Instead, the +semantic values' types (class names) should be specified in the +@code{%type} or @code{%token} directive: + +@example +%type <Expression> expr assignment_expr term factor +%type <Integer> number +@end example + +By default, the semantic stack is declared to have @code{Object} members, +which means that the class types you specify can be of any class. +To improve the type safety of the parser, you can declare the common +superclass of all the semantic values using the @code{%define stype} +directive. For example, after the following declaration: + +@example +%define stype "ASTNode" +@end example + +@noindent +any @code{%type} or @code{%token} specifying a semantic type that +is not a subclass of @code{ASTNode} will cause a compile-time error. + +@c FIXME: Documented bug. +Types used in the directives may be qualified with a package name. +Primitive data types are accepted for Java version 1.5 or later. Note +that in this case the autoboxing feature of Java 1.5 will be used. +Generic types may not be used; this is due to a limitation in the +implementation of Bison, and may change in future releases. + +Java parsers do not support @code{%destructor}, since the language +adopts garbage collection. The parser will try to hold references +to semantic values for as little time as needed. + +Java parsers do not support @code{%printer}, as @code{toString()} +can be used to print the semantic values. This however may change +(in a backwards-compatible way) in future versions of Bison. + + +@node Java Location Values +@subsection Java Location Values +@c - %locations +@c - class Position +@c - class Location + +When the directive @code{%locations} is used, the Java parser +supports location tracking, see @ref{Locations, , Locations Overview}. +An auxiliary user-defined class defines a @dfn{position}, a single point +in a file; Bison itself defines a class representing a @dfn{location}, +a range composed of a pair of positions (possibly spanning several +files).
The location class is an inner class of the parser; the name +is @code{Location} by default, and may be renamed using +@code{%define location_type "@var{class-name}"}. + +The location class treats the position as a completely opaque value. +By default, the class name is @code{Position}, but this can be changed +with @code{%define position_type "@var{class-name}"}. This class must +be supplied by the user. + + +@deftypeivar {Location} {Position} begin +@deftypeivarx {Location} {Position} end +The first (inclusive) position of the range, and the first position beyond it. +@end deftypeivar + +@deftypeop {Constructor} {Location} {} Location (Position @var{loc}) +Create a @code{Location} denoting an empty range located at a given point. +@end deftypeop + +@deftypeop {Constructor} {Location} {} Location (Position @var{begin}, Position @var{end}) +Create a @code{Location} from the endpoints of the range. +@end deftypeop + +@deftypemethod {Location} {String} toString () +Print the range represented by the location. For this to work +properly, the position class should override the @code{equals} and +@code{toString} methods appropriately. +@end deftypemethod + + +@node Java Parser Interface +@subsection Java Parser Interface +@c - define parser_class_name +@c - Ctor +@c - parse, error, set_debug_level, debug_level, set_debug_stream, +@c debug_stream. +@c - Reporting errors + +The name of the generated parser class defaults to @code{YYParser}. The +@code{YY} prefix may be changed using the @code{%name-prefix} directive +or the @option{-p}/@option{--name-prefix} option. Alternatively, use +@code{%define parser_class_name "@var{name}"} to give a custom name to +the class. The interface of this class is detailed below. + +By default, the parser class has package visibility. A declaration +@code{%define public} changes it to public visibility.
Remember that, +according to the Java language specification, the name of the @file{.java} +file should match the name of the class in this case. Similarly, you can +use @code{abstract}, @code{final} and @code{strictfp} with the +@code{%define} declaration to add other modifiers to the parser class. + +The Java package name of the parser class can be specified using the +@code{%define package} directive. The superclass and the implemented +interfaces of the parser class can be specified with the @code{%define +extends} and @code{%define implements} directives. + +The parser class defines an inner class, @code{Location}, that is used +for location tracking (see @ref{Java Location Values}), and an inner +interface, @code{Lexer} (see @ref{Java Scanner Interface}). Other than +this inner class and interface, and the members described in the interface +below, all the other members and fields are preceded with a @code{yy} or +@code{YY} prefix to avoid clashes with user code. + +@c FIXME: The following constants and variables are still undocumented: +@c @code{bisonVersion}, @code{bisonSkeleton} and @code{errorVerbose}. + +The parser class can be extended using the @code{%parse-param} +directive. Each occurrence of the directive adds a @code{protected +final} field to the parser class and an argument to its constructor, +which initializes the field automatically. + +Token names defined by @code{%token} and the predefined @code{EOF} token +name are added as constant fields to the parser class. + +@deftypeop {Constructor} {YYParser} {} YYParser (@var{lex_param}, @dots{}, @var{parse_param}, @dots{}) +Build a new parser object with embedded @code{%code lexer}. There are +no parameters, unless @code{%parse-param}s and/or @code{%lex-param}s are +used. +@end deftypeop + +@deftypeop {Constructor} {YYParser} {} YYParser (Lexer @var{lexer}, @var{parse_param}, @dots{}) +Build a new parser object using the specified scanner.
There are no +additional parameters unless @code{%parse-param}s are used. + +If the scanner is defined by @code{%code lexer}, this constructor is +declared @code{protected} and is called automatically with a scanner +created with the correct @code{%lex-param}s. +@end deftypeop + +@deftypemethod {YYParser} {boolean} parse () +Run the syntactic analysis, and return @code{true} on success, +@code{false} otherwise. +@end deftypemethod + +@deftypemethod {YYParser} {boolean} recovering () +During the syntactic analysis, return @code{true} if recovering +from a syntax error. +@xref{Error Recovery}. +@end deftypemethod + +@deftypemethod {YYParser} {java.io.PrintStream} getDebugStream () +@deftypemethodx {YYParser} {void} setDebugStream (java.io.PrintStream @var{o}) +Get or set the stream used for tracing the parsing. It defaults to +@code{System.err}. +@end deftypemethod + +@deftypemethod {YYParser} {int} getDebugLevel () +@deftypemethodx {YYParser} {void} setDebugLevel (int @var{l}) +Get or set the tracing level. Currently its value is either 0 (no trace) +or nonzero (full tracing). +@end deftypemethod + + +@node Java Scanner Interface +@subsection Java Scanner Interface +@c - %code lexer +@c - %lex-param +@c - Lexer interface + +There are two possible ways to interface a Bison-generated Java parser +with a scanner: the scanner may be defined by @code{%code lexer}, or +defined elsewhere. In either case, the scanner has to implement the +@code{Lexer} inner interface of the parser class. + +In the first case, the body of the scanner class is placed in +@code{%code lexer} blocks. If you want to pass parameters from the +parser constructor to the scanner constructor, specify them with +@code{%lex-param}; they are passed before @code{%parse-param}s to the +constructor. + +In the second case, the scanner has to implement the @code{Lexer} interface, +which is defined within the parser class (e.g., @code{YYParser.Lexer}).
+The constructor of the parser object will then accept an object +implementing the interface; @code{%lex-param} is not used in this +case. + +In both cases, the scanner has to implement the following methods. + +@deftypemethod {Lexer} {void} yyerror (Location @var{loc}, String @var{msg}) +This method is defined by the user to emit an error message. The first +parameter is omitted if location tracking is not active. Its type can be +changed using @code{%define location_type "@var{class-name}"}. +@end deftypemethod + +@deftypemethod {Lexer} {int} yylex () +Return the next token. Its type is the return value; its semantic +value and location are saved and returned by their methods in the +interface. + +Use @code{%define lex_throws} to specify any uncaught exceptions. +Default is @code{java.io.IOException}. +@end deftypemethod + +@deftypemethod {Lexer} {Position} getStartPos () +@deftypemethodx {Lexer} {Position} getEndPos () +Return respectively the first position of the last token that +@code{yylex} returned, and the first position beyond it. These +methods are not needed unless location tracking is active. + +The return type can be changed using @code{%define position_type +"@var{class-name}"}. +@end deftypemethod + +@deftypemethod {Lexer} {Object} getLVal () +Return the semantic value of the last token that @code{yylex} returned. + +The return type can be changed using @code{%define stype +"@var{class-name}"}. +@end deftypemethod + + +@node Java Action Features +@subsection Special Features for Use in Java Actions + +The following special constructs can be used in Java actions. +Other analogous C action features are currently unavailable for Java. + +Use @code{%define throws} to specify any uncaught exceptions from parser +actions, and initial actions specified by @code{%initial-action}. + +@defvar $@var{n} +The semantic value for the @var{n}th component of the current rule. +This may not be assigned to. +@xref{Java Semantic Values}.
+@end defvar + +@defvar $<@var{typealt}>@var{n} +Like @code{$@var{n}} but specifies an alternative type @var{typealt}. +@xref{Java Semantic Values}. +@end defvar + +@defvar $$ +The semantic value for the grouping made by the current rule. As a +value, this is in the base type (@code{Object} or as specified by +@code{%define stype}), as it is not cast to the declared subtype because +casts are not allowed on the left-hand side of Java assignments. +Use an explicit Java cast if the correct subtype is needed. +@xref{Java Semantic Values}. +@end defvar + +@defvar $<@var{typealt}>$ +Same as @code{$$} since Java always allows assigning to the base type. +Perhaps we should use this and @code{$<>$} for the value and @code{$$} +for setting the value, but there is currently no easy way to distinguish +these constructs. +@xref{Java Semantic Values}. +@end defvar + +@defvar @@@var{n} +The location information of the @var{n}th component of the current rule. +This may not be assigned to. +@xref{Java Location Values}. +@end defvar + +@defvar @@$ +The location information of the grouping made by the current rule. +@xref{Java Location Values}. +@end defvar + +@deffn {Statement} {return YYABORT;} +Return immediately from the parser, indicating failure. +@xref{Java Parser Interface}. +@end deffn + +@deffn {Statement} {return YYACCEPT;} +Return immediately from the parser, indicating success. +@xref{Java Parser Interface}. +@end deffn + +@deffn {Statement} {return YYERROR;} +Start error recovery without printing an error message. +@xref{Error Recovery}. +@end deffn + +@deftypefn {Function} {boolean} recovering () +Return whether error recovery is being done. In this state, the parser +reads tokens until it reaches a known state, and then restarts normal +operation. +@xref{Error Recovery}.
+@end deftypefn + +@deftypefn {Function} {protected void} yyerror (String msg) +@deftypefnx {Function} {protected void} yyerror (Position pos, String msg) +@deftypefnx {Function} {protected void} yyerror (Location loc, String msg) +Print an error message using the @code{yyerror} method of the scanner +instance in use. +@end deftypefn + + +@node Java Differences +@subsection Differences between C/C++ and Java Grammars + +The different structure of the Java language forces several differences +between C/C++ grammars and grammars designed for Java parsers. This +section summarizes these differences. + +@itemize +@item +Java lacks a preprocessor, so the @code{YYERROR}, @code{YYACCEPT}, +@code{YYABORT} symbols (@pxref{Table of Symbols}) obviously cannot be +macros. Instead, they should be preceded by @code{return} when they +appear in an action. The actual definition of these symbols is +opaque to the Bison grammar, and it might change in the future. The +only meaningful thing you can do with them is to return them. +@xref{Java Action Features}. + +Note that of these three symbols, only @code{YYACCEPT} and +@code{YYABORT} will cause a return from the @code{yyparse} +method@footnote{Java parsers include the actions in a method +separate from @code{yyparse} in order to have an intuitive syntax that +corresponds to these C macros.}. + +@item +Java lacks unions, so @code{%union} has no effect. Instead, semantic +values have a common base type: @code{Object} or as specified by +@samp{%define stype}. Angle brackets on @code{%token}, @code{%type}, +@code{$@var{n}} and @code{$$} specify subtypes rather than fields of +a union. The type of @code{$$}, even with angle brackets, is the base +type since Java casts are not allowed on the left-hand side of assignments. +Also, @code{$@var{n}} and @code{@@@var{n}} are not allowed on the +left-hand side of assignments. @xref{Java Semantic Values}, and +@ref{Java Action Features}.
+ +@item +The prologue declarations have a different meaning than in C/C++ code. +@table @asis +@item @code{%code imports} +blocks are placed at the beginning of the Java source code. They may +include copyright notices. For @code{package} declarations, it is +suggested to use @code{%define package} instead. + +@item unqualified @code{%code} +blocks are placed inside the parser class. + +@item @code{%code lexer} +blocks, if specified, should include the implementation of the +scanner. If there is no such block, the scanner can be any class +that implements the appropriate interface (@pxref{Java Scanner +Interface}). +@end table + +Other @code{%code} blocks are not supported in Java parsers. +In particular, @code{%@{ @dots{} %@}} blocks should not be used +and may give an error in future versions of Bison. + +The epilogue has the same meaning as in C/C++ code and it can +be used to define other classes used by the parser @emph{outside} +the parser class. +@end itemize + + +@node Java Declarations Summary +@subsection Java Declarations Summary + +This summary includes only declarations that are specific to Java or that have a special +meaning when used in a Java parser. + +@deffn {Directive} {%language "Java"} +Generate a Java class for the parser. +@end deffn + +@deffn {Directive} %lex-param @{@var{type} @var{name}@} +A parameter for the lexer class defined by @code{%code lexer} +@emph{only}, added as a parameter to the lexer constructor and the parser +constructor that @emph{creates} a lexer. Default is none. +@xref{Java Scanner Interface}. +@end deffn + +@deffn {Directive} %name-prefix "@var{prefix}" +The prefix of the parser class name @code{@var{prefix}Parser} if +@code{%define parser_class_name} is not used. Default is @code{YY}. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} %parse-param @{@var{type} @var{name}@} +A parameter for the parser class, added as a parameter to the constructor(s) +and as a field initialized by the constructor(s). Default is none.
+@xref{Java Parser Interface}. +@end deffn + +@deffn {Directive} %token <@var{type}> @var{token} @dots{} +Declare tokens. Note that the angle brackets enclose a Java @emph{type}. +@xref{Java Semantic Values}. +@end deffn + +@deffn {Directive} %type <@var{type}> @var{nonterminal} @dots{} +Declare the type of nonterminals. Note that the angle brackets enclose +a Java @emph{type}. +@xref{Java Semantic Values}. +@end deffn + +@deffn {Directive} %code @{ @var{code} @dots{} @} +Code appended to the inside of the parser class. +@xref{Java Differences}. +@end deffn + +@deffn {Directive} {%code imports} @{ @var{code} @dots{} @} +Code inserted just after the @code{package} declaration. +@xref{Java Differences}. +@end deffn + +@deffn {Directive} {%code lexer} @{ @var{code} @dots{} @} +Code added to the body of an inner lexer class within the parser class. +@xref{Java Scanner Interface}. +@end deffn + +@deffn {Directive} %% @var{code} @dots{} +Code (after the second @code{%%}) appended to the end of the file, +@emph{outside} the parser class. +@xref{Java Differences}. +@end deffn + +@deffn {Directive} %@{ @var{code} @dots{} %@} +Not supported. Use @code{%code imports} instead. +@xref{Java Differences}. +@end deffn + +@deffn {Directive} {%define abstract} +Whether the parser class is declared @code{abstract}. Default is false. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define extends} "@var{superclass}" +The superclass of the parser class. Default is none. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define final} +Whether the parser class is declared @code{final}. Default is false. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define implements} "@var{interfaces}" +The implemented interfaces of the parser class, a comma-separated list. +Default is none. +@xref{Java Bison Interface}.
+@end deffn + +@deffn {Directive} {%define lex_throws} "@var{exceptions}" +The exceptions thrown by the @code{yylex} method of the lexer, a +comma-separated list. Default is @code{java.io.IOException}. +@xref{Java Scanner Interface}. +@end deffn + +@deffn {Directive} {%define location_type} "@var{class}" +The name of the class used for locations (a range between two +positions). This class is generated as an inner class of the parser +class by @command{bison}. Default is @code{Location}. +@xref{Java Location Values}. +@end deffn + +@deffn {Directive} {%define package} "@var{package}" +The package to put the parser class in. Default is none. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define parser_class_name} "@var{name}" +The name of the parser class. Default is @code{YYParser} or +@code{@var{name-prefix}Parser}. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define position_type} "@var{class}" +The name of the class used for positions. This class must be supplied by +the user. Default is @code{Position}. +@xref{Java Location Values}. +@end deffn + +@deffn {Directive} {%define public} +Whether the parser class is declared @code{public}. Default is false. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define stype} "@var{class}" +The base type of semantic values. Default is @code{Object}. +@xref{Java Semantic Values}. +@end deffn + +@deffn {Directive} {%define strictfp} +Whether the parser class is declared @code{strictfp}. Default is false. +@xref{Java Bison Interface}. +@end deffn + +@deffn {Directive} {%define throws} "@var{exceptions}" +The exceptions thrown by user-supplied parser actions and +@code{%initial-action}, a comma-separated list. Default is none. +@xref{Java Parser Interface}. +@end deffn + + @c ================================================= FAQ @node FAQ @@ -7870,7 +9313,7 @@ are addressed. 
* I can't build Bison:: Troubleshooting * Where can I find help?:: Troubleshouting * Bug Reports:: Troublereporting -* Other Languages:: Parsers in Java and others +* More Languages:: Parsers in C++, Java, and so on * Beta Testing:: Experimenting development versions * Mailing Lists:: Meeting other Bison users @end menu @@ -7904,7 +9347,7 @@ or @display My parser includes support for an @samp{#include}-like feature, in which case I run @code{yyparse} from @code{yyparse}. This fails -although I did specify I needed a @code{%pure-parser}. +although I did specify @code{%define api.pure}. @end display These problems typically come not from Bison itself, but from @@ -8193,15 +9636,15 @@ send a bug report just because you can not provide a fix. Send bug reports to @email{bug-bison@@gnu.org}. -@node Other Languages -@section Other Languages +@node More Languages +@section More Languages @display -Will Bison ever have C++ support? How about Java or @var{insert your +Will Bison ever have C++ and Java support? How about @var{insert your favorite language here}? @end display -C++ support is there now, and is documented. We'd love to add other +C++ and Java support is there now, and is documented. We'd love to add other languages; contributions are welcome. @node Beta Testing @@ -8292,90 +9735,38 @@ Separates alternate rules for the same result nonterminal. @xref{Rules, ,Syntax of Grammar Rules}. @end deffn -@deffn {Symbol} $accept -The predefined nonterminal whose only rule is @samp{$accept: @var{start} -$end}, where @var{start} is the start symbol. @xref{Start Decl, , The -Start-Symbol}. It cannot be used in the grammar. -@end deffn - -@deffn {Directive} %after-header @{@var{code}@} -Specifies code to be inserted into the code file after the contents of the -header file. -@xref{Table of Symbols, ,%start-header}. -@end deffn +@deffn {Directive} <*> +Used to define a default tagged @code{%destructor} or default tagged +@code{%printer}. 
-@deffn {Directive} %before-header @{@var{code}@} -Specifies code to be inserted into the code file before the contents of the -header file. -@xref{Table of Symbols, ,%start-header}. -@end deffn +This feature is experimental. +More user feedback will help to determine whether it should become a permanent +feature. -@deffn {Directive} %end-header @{@var{code}@} -Specifies code to be inserted both into the header file (if generated; -@pxref{Table of Symbols, ,%defines}) and into the code file after any -Bison-generated definitions. -@xref{Table of Symbols, ,%start-header}. +@xref{Destructor Decl, , Freeing Discarded Symbols}. @end deffn -@deffn {Directive} %start-header @{@var{code}@} -Specifies code to be inserted both into the header file (if generated; -@pxref{Table of Symbols, ,%defines}) and into the code file before any -Bison-generated definitions. - -@cindex Prologue -@findex %before-header -@findex %union -@findex %end-header -@findex %after-header -For example, the following declaration order in the grammar file reflects the -order in which Bison will output these code blocks. However, you are free to -declare these code blocks in your grammar file in whatever order is most -convenient for you: +@deffn {Directive} <> +Used to define a default tagless @code{%destructor} or default tagless +@code{%printer}. -@smallexample -%before-header @{ - /* Bison treats this block like a pre-prologue block: it inserts it - * into the code file before the contents of the header file. It - * does *not* insert it into the header file. This is a good place - * to put #include's that you want at the top of your code file. A - * common example is `#include "system.h"'. */ -@} -%start-header @{ - /* Bison inserts this block into both the header file and the code - * file. In both files, the point of insertion is before any - * Bison-generated token, semantic type, location type, and class - * definitions. This is a good place to define %union - * dependencies, for example. 
*/ -@} -%union @{ - /* Unlike the traditional Yacc prologue blocks, the output order - * for the %*-header blocks is not affected by their declaration - * position relative to any %union in the grammar file. */ -@} -%end-header @{ - /* Bison inserts this block into both the header file and the code - * file. In both files, the point of insertion is after the - * Bison-generated definitions. This is a good place to declare or - * define public functions or data structures that depend on the - * Bison-generated definitions. */ -@} -%after-header @{ - /* Bison treats this block like a post-prologue block: it inserts - * it into the code file after the contents of the header file. It - * does *not* insert it into the header file. This is a good place - * to declare or define internal functions or data structures that - * depend on the Bison-generated definitions. */ -@} -@end smallexample +This feature is experimental. +More user feedback will help to determine whether it should become a permanent +feature. -If you have multiple occurrences of any one of the above declarations, Bison -will concatenate the contents in declaration order. +@xref{Destructor Decl, , Freeing Discarded Symbols}. +@end deffn -@xref{Prologue, ,The Prologue}. +@deffn {Symbol} $accept +The predefined nonterminal whose only rule is @samp{$accept: @var{start} +$end}, where @var{start} is the start symbol. @xref{Start Decl, , The +Start-Symbol}. It cannot be used in the grammar. @end deffn -@deffn {Directive} %debug -Equip the parser for debugging. @xref{Decl Summary}. +@deffn {Directive} %code @{@var{code}@} +@deffnx {Directive} %code @var{qualifier} @{@var{code}@} +Insert @var{code} verbatim into output parser source. +@xref{Decl Summary,,%code}. @end deffn @deffn {Directive} %debug @@ -8390,11 +9781,22 @@ Precedence}. @end deffn @end ifset +@deffn {Directive} %define @var{define-variable} +@deffnx {Directive} %define @var{define-variable} @var{value} +Define a variable to adjust Bison's behavior. 
+@xref{Decl Summary,,%define}. +@end deffn + @deffn {Directive} %defines Bison declaration to create a header file meant for the scanner. @xref{Decl Summary}. @end deffn +@deffn {Directive} %defines @var{defines-file} +Same as above, but save in the file @var{defines-file}. +@xref{Decl Summary}. +@end deffn + @deffn {Directive} %destructor Specify how the parser should reclaim the memory associated to discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}. @@ -8427,7 +9829,7 @@ Bison declaration to request verbose, specific error message strings when @code{yyerror} is called. @end deffn -@deffn {Directive} %file-prefix="@var{prefix}" +@deffn {Directive} %file-prefix "@var{prefix}" Bison declaration to set the prefix of the output files. @xref{Decl Summary}. @end deffn @@ -8441,6 +9843,11 @@ Parsers, ,Writing @acronym{GLR} Parsers}. Run user code before parsing. @xref{Initial Action Decl, , Performing Actions before Parsing}. @end deffn +@deffn {Directive} %language +Specify the programming language for the generated parser. +@xref{Decl Summary}. +@end deffn + @deffn {Directive} %left Bison declaration to assign left associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. @@ -8459,7 +9866,7 @@ function is applied to the two semantic values to get a single result. @xref{GLR Parsers, ,Writing @acronym{GLR} Parsers}. @end deffn -@deffn {Directive} %name-prefix="@var{prefix}" +@deffn {Directive} %name-prefix "@var{prefix}" Bison declaration to rename the external symbols. @xref{Decl Summary}. @end deffn @@ -8481,7 +9888,7 @@ Bison declaration to assign nonassociativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. @end deffn -@deffn {Directive} %output="@var{file}" +@deffn {Directive} %output "@var{file}" Bison declaration to set the name of the parser file. @xref{Decl Summary}. @end deffn @@ -8498,8 +9905,8 @@ Bison declaration to assign a precedence to a specific rule. 
@end deffn @deffn {Directive} %pure-parser -Bison declaration to request a pure (reentrant) parser. -@xref{Pure Decl, ,A Pure (Reentrant) Parser}. +Deprecated version of @code{%define api.pure} (@pxref{Decl Summary, ,%define}), +for which Bison is more careful to warn about unreasonable usage. @end deffn @deffn {Directive} %require "@var{version}" @@ -8512,16 +9919,16 @@ Bison declaration to assign right associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. @end deffn +@deffn {Directive} %skeleton +Specify the skeleton to use; usually for development. +@xref{Decl Summary}. +@end deffn + @deffn {Directive} %start Bison declaration to specify the start symbol. @xref{Start Decl, ,The Start-Symbol}. @end deffn -@deffn {Directive} %symbol-default -Used to declare a default @code{%destructor} or default @code{%printer}. -@xref{Destructor Decl, , Freeing Discarded Symbols}. -@end deffn - @deffn {Directive} %token Bison declaration to declare token(s) without specifying precedence. @xref{Token Decl, ,Token Type Names}. @@ -8553,12 +9960,18 @@ Macro to pretend that an unrecoverable syntax error has occurred, by making @code{yyparse} return 1 immediately. The error reporting function @code{yyerror} is not called. @xref{Parser Function, ,The Parser Function @code{yyparse}}. + +For Java parsers, this functionality is invoked using @code{return YYABORT;} +instead. @end deffn @deffn {Macro} YYACCEPT Macro to pretend that a complete utterance of the language has been read, by making @code{yyparse} return 0 immediately. @xref{Parser Function, ,The Parser Function @code{yyparse}}. + +For Java parsers, this functionality is invoked using @code{return YYACCEPT;} +instead. @end deffn @deffn {Macro} YYBACKUP @@ -8599,6 +10012,9 @@ Macro to pretend that a syntax error has just been detected: call @code{yyerror} and then perform normal error recovery if possible (@pxref{Error Recovery}), or (if recovery is impossible) make @code{yyparse} return 1. 
@xref{Error Recovery}. + +For Java parsers, this functionality is invoked using @code{return YYERROR;} +instead. @end deffn @deffn {Function} yyerror @@ -8667,7 +10083,8 @@ Management}. @deffn {Variable} yynerrs Global variable which Bison increments each time it reports a syntax error. -(In a pure parser, it is a local variable within @code{yyparse}.) +(In a pure parser, it is a local variable within @code{yyparse}. In a +pure push parser, it is a member of yypstate.) @xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}. @end deffn @@ -8676,6 +10093,41 @@ The parser function produced by Bison; call this function to start parsing. @xref{Parser Function, ,The Parser Function @code{yyparse}}. @end deffn +@deffn {Function} yypstate_delete +The function to delete a parser instance, produced by Bison in push mode; +call this function to delete the memory associated with a parser. +@xref{Parser Delete Function, ,The Parser Delete Function +@code{yypstate_delete}}. +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) +@end deffn + +@deffn {Function} yypstate_new +The function to create a parser instance, produced by Bison in push mode; +call this function to create a new parser. +@xref{Parser Create Function, ,The Parser Create Function +@code{yypstate_new}}. +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) +@end deffn + +@deffn {Function} yypull_parse +The parser function produced by Bison in push mode; call this function to +parse the rest of the input stream. +@xref{Pull Parser Function, ,The Pull Parser Function +@code{yypull_parse}}. +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) +@end deffn + +@deffn {Function} yypush_parse +The parser function produced by Bison in push mode; call this function to +parse a single token. 
@xref{Push Parser Function, ,The Push Parser Function +@code{yypush_parse}}. +(The current push parsing interface is experimental and may evolve. +More user feedback will help to stabilize it.) +@end deffn + @deffn {Macro} YYPARSE_PARAM An obsolete macro for specifying the name of a parameter that @code{yyparse} should accept. The use of this macro is deprecated, and @@ -8884,11 +10336,6 @@ grammatically indivisible. The piece of text it represents is a token. @node Copying This Manual @appendix Copying This Manual - -@menu -* GNU Free Documentation License:: License for copying this manual. -@end menu - @include fdl.texi @node Index @@ -8898,32 +10345,36 @@ grammatically indivisible. The piece of text it represents is a token. @bye +@c Local Variables: +@c fill-column: 76 +@c End: + @c LocalWords: texinfo setfilename settitle setchapternewpage finalout @c LocalWords: ifinfo smallbook shorttitlepage titlepage GPL FIXME iftex @c LocalWords: akim fn cp syncodeindex vr tp synindex dircategory direntry @c LocalWords: ifset vskip pt filll insertcopying sp ISBN Etienne Suvasa @c LocalWords: ifnottex yyparse detailmenu GLR RPN Calc var Decls Rpcalc -@c LocalWords: rpcalc Lexer Gen Comp Expr ltcalc mfcalc Decl Symtab yylex +@c LocalWords: rpcalc Lexer Expr ltcalc mfcalc yylex @c LocalWords: yyerror pxref LR yylval cindex dfn LALR samp gpl BNF xref @c LocalWords: const int paren ifnotinfo AC noindent emph expr stmt findex @c LocalWords: glr YYSTYPE TYPENAME prog dprec printf decl init stmtMerge @c LocalWords: pre STDC GNUC endif yy YY alloca lf stddef stdlib YYDEBUG @c LocalWords: NUM exp subsubsection kbd Ctrl ctype EOF getchar isdigit @c LocalWords: ungetc stdin scanf sc calc ulator ls lm cc NEG prec yyerrok -@c LocalWords: longjmp fprintf stderr preg yylloc YYLTYPE cos ln +@c LocalWords: longjmp fprintf stderr yylloc YYLTYPE cos ln @c LocalWords: smallexample symrec val tptr FNCT fnctptr func struct sym @c LocalWords: fnct putsym getsym fname arith fncts atan ptr malloc 
sizeof @c LocalWords: strlen strcpy fctn strcmp isalpha symbuf realloc isalnum @c LocalWords: ptypes itype YYPRINT trigraphs yytname expseq vindex dtype -@c LocalWords: Rhs YYRHSLOC LE nonassoc op deffn typeless typefull yynerrs +@c LocalWords: Rhs YYRHSLOC LE nonassoc op deffn typeless yynerrs @c LocalWords: yychar yydebug msg YYNTOKENS YYNNTS YYNRULES YYNSTATES @c LocalWords: cparse clex deftypefun NE defmac YYACCEPT YYABORT param @c LocalWords: strncmp intval tindex lvalp locp llocp typealt YYBACKUP @c LocalWords: YYEMPTY YYEOF YYRECOVERING yyclearin GE def UMINUS maybeword @c LocalWords: Johnstone Shamsa Sadaf Hussain Tomita TR uref YYMAXDEPTH -@c LocalWords: YYINITDEPTH stmnts ref stmnt initdcl maybeasm VCG notype +@c LocalWords: YYINITDEPTH stmnts ref stmnt initdcl maybeasm notype @c LocalWords: hexflag STR exdent itemset asis DYYDEBUG YYFPRINTF args -@c LocalWords: infile ypp yxx outfile itemx vcg tex leaderfill +@c LocalWords: infile ypp yxx outfile itemx tex leaderfill @c LocalWords: hbox hss hfill tt ly yyin fopen fclose ofirst gcc ll -@c LocalWords: yyrestart nbar yytext fst snd osplit ntwo strdup AST +@c LocalWords: nbar yytext fst snd osplit ntwo strdup AST @c LocalWords: YYSTACK DVI fdl printindex