@value{UPDATED}), the @acronym{GNU} parser generator.
Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998,
-1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006 Free Software Foundation, Inc.
+1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
@quotation
Permission is granted to copy, distribute and/or modify this document
messy for Bison to handle straightforwardly.
* Debugging:: Understanding or debugging Bison parsers.
* Invocation:: How to run Bison (to produce the parser source file).
-* C++ Language Interface:: Creating C++ parser objects.
+* Other Languages:: Creating C++ and Java parsers.
* FAQ:: Frequently Asked Questions
* Table of Symbols:: All the keywords of the Bison language are explained.
* Glossary:: Basic concepts are explained.
Outline of a Bison Grammar
* Prologue:: Syntax and usage of the prologue.
+* Prologue Alternatives:: Syntax and usage of alternatives to the prologue.
* Bison Declarations:: Syntax and usage of the Bison declarations section.
* Grammar Rules:: Syntax and usage of the grammar rules section.
* Epilogue:: Syntax and usage of the epilogue.
* Option Cross Key:: Alphabetical list of long options.
* Yacc Library:: Yacc-compatible @code{yylex} and @code{main}.
-C++ Language Interface
+Parsers Written In Other Languages
* C++ Parsers:: The interface to generate C++ parser classes
-* A Complete C++ Example:: Demonstrating their use
+* Java Parsers:: The interface to generate Java parser classes
C++ Parsers
* C++ Location Values:: The position and location classes
* C++ Parser Interface:: Instantiating and running the parser
* C++ Scanner Interface:: Exchanges between yylex and parse
+* A Complete C++ Example:: Demonstrating their use
A Complete C++ Example
* Calc++ Scanner:: A pure C++ Flex scanner
* Calc++ Top Level:: Conducting the band
+Java Parsers
+
+* Java Bison Interface:: Asking for Java parser generation
+* Java Semantic Values:: %type and %token vs. Java
+* Java Location Values:: The position and location classes
+* Java Parser Interface:: Instantiating and running the parser
+* Java Scanner Interface:: Java scanners, and pure parsers
+* Java Differences:: Differences between C/C++ and Java Grammars
+
Frequently Asked Questions
* Memory Exhausted:: Breaking the Stack Limits
by default (@pxref{Location Type, ,Data Types of Locations}), which is a
four member structure with the following integer fields:
@code{first_line}, @code{first_column}, @code{last_line} and
-@code{last_column}.
+@code{last_column}. By conventions, and in accordance with the GNU
+Coding Standards and common practice, the line and column count both
+start at 1.
@node Ltcalc Rules
@subsection Grammar Rules for @code{ltcalc}
@}
@end group
@group
- | '-' exp %preg NEG @{ $$ = -$2; @}
+ | '-' exp %prec NEG @{ $$ = -$2; @}
| exp '^' exp @{ $$ = pow ($1, $3); @}
| '(' exp ')' @{ $$ = $2; @}
@end group
@menu
* Prologue:: Syntax and usage of the prologue.
+* Prologue Alternatives:: Syntax and usage of alternatives to the prologue.
* Bison Declarations:: Syntax and usage of the Bison declarations section.
* Grammar Rules:: Syntax and usage of the grammar rules section.
* Epilogue:: Syntax and usage of the epilogue.
don't need any C declarations, you may omit the @samp{%@{} and
@samp{%@}} delimiters that bracket this section.
-The @var{Prologue} section is terminated by the the first occurrence
+The @var{Prologue} section is terminated by the first occurrence
of @samp{%@}} that is outside a comment, a string literal, or a
character constant.
@smallexample
%@{
+ #define _GNU_SOURCE
#include <stdio.h>
#include "ptypes.h"
%@}
@dots{}
@end smallexample
+When in doubt, it is usually safer to put prologue code before all
+Bison declarations, rather than after. For example, any definitions
+of feature test macros like @code{_GNU_SOURCE} or
+@code{_POSIX_C_SOURCE} should appear before all Bison declarations, as
+feature test macros can affect the behavior of Bison-generated
+@code{#include} directives.
+
+@node Prologue Alternatives
+@subsection Prologue Alternatives
+@cindex Prologue Alternatives
+
+@findex %code
+@findex %code requires
+@findex %code provides
+@findex %code top
+(The prologue alternatives described here are experimental.
+More user feedback will help to determine whether they should become permanent
+features.)
+
+The functionality of @var{Prologue} sections can often be subtle and
+inflexible.
+As an alternative, Bison provides a %code directive with an explicit qualifier
+field, which identifies the purpose of the code and thus the location(s) where
+Bison should generate it.
+For C/C++, the qualifier can be omitted for the default location, or it can be
+one of @code{requires}, @code{provides}, @code{top}.
+@xref{Decl Summary,,%code}.
+
+Look again at the example of the previous section:
+
+@smallexample
+%@{
+ #define _GNU_SOURCE
+ #include <stdio.h>
+ #include "ptypes.h"
+%@}
+
+%union @{
+ long int n;
+ tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
+@}
+
+%@{
+ static void print_token_value (FILE *, int, YYSTYPE);
+ #define YYPRINT(F, N, L) print_token_value (F, N, L)
+%@}
+
+@dots{}
+@end smallexample
+
+@noindent
+Notice that there are two @var{Prologue} sections here, but there's a subtle
+distinction between their functionality.
+For example, if you decide to override Bison's default definition for
+@code{YYLTYPE}, in which @var{Prologue} section should you write your new
+definition?
+You should write it in the first since Bison will insert that code into the
+parser source code file @emph{before} the default @code{YYLTYPE} definition.
+In which @var{Prologue} section should you prototype an internal function,
+@code{trace_token}, that accepts @code{YYLTYPE} and @code{yytokentype} as
+arguments?
+You should prototype it in the second since Bison will insert that code
+@emph{after} the @code{YYLTYPE} and @code{yytokentype} definitions.
+
+This distinction in functionality between the two @var{Prologue} sections is
+established by the appearance of the @code{%union} between them.
+This behavior raises a few questions.
+First, why should the position of a @code{%union} affect definitions related to
+@code{YYLTYPE} and @code{yytokentype}?
+Second, what if there is no @code{%union}?
+In that case, the second kind of @var{Prologue} section is not available.
+This behavior is not intuitive.
+
+To avoid this subtle @code{%union} dependency, rewrite the example using a
+@code{%code top} and an unqualified @code{%code}.
+Let's go ahead and add the new @code{YYLTYPE} definition and the
+@code{trace_token} prototype at the same time:
+
+@smallexample
+%code top @{
+ #define _GNU_SOURCE
+ #include <stdio.h>
+
+ /* WARNING: The following code really belongs
+ * in a `%code requires'; see below. */
+
+ #include "ptypes.h"
+ #define YYLTYPE YYLTYPE
+ typedef struct YYLTYPE
+ @{
+ int first_line;
+ int first_column;
+ int last_line;
+ int last_column;
+ char *filename;
+ @} YYLTYPE;
+@}
+
+%union @{
+ long int n;
+ tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
+@}
+
+%code @{
+ static void print_token_value (FILE *, int, YYSTYPE);
+ #define YYPRINT(F, N, L) print_token_value (F, N, L)
+ static void trace_token (enum yytokentype token, YYLTYPE loc);
+@}
+
+@dots{}
+@end smallexample
+
+@noindent
+In this way, @code{%code top} and the unqualified @code{%code} achieve the same
+functionality as the two kinds of @var{Prologue} sections, but it's always
+explicit which kind you intend.
+Moreover, both kinds are always available even in the absence of @code{%union}.
+
+The @code{%code top} block above logically contains two parts.
+The first two lines before the warning need to appear near the top of the
+parser source code file.
+The first line after the warning is required by @code{YYSTYPE} and thus also
+needs to appear in the parser source code file.
+However, if you've instructed Bison to generate a parser header file
+(@pxref{Decl Summary, ,%defines}), you probably want that line to appear before
+the @code{YYSTYPE} definition in that header file as well.
+The @code{YYLTYPE} definition should also appear in the parser header file to
+override the default @code{YYLTYPE} definition there.
+
+In other words, in the @code{%code top} block above, all but the first two
+lines are dependency code required by the @code{YYSTYPE} and @code{YYLTYPE}
+definitions.
+Thus, they belong in one or more @code{%code requires}:
+
+@smallexample
+%code top @{
+ #define _GNU_SOURCE
+ #include <stdio.h>
+@}
+
+%code requires @{
+ #include "ptypes.h"
+@}
+%union @{
+ long int n;
+ tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
+@}
+
+%code requires @{
+ #define YYLTYPE YYLTYPE
+ typedef struct YYLTYPE
+ @{
+ int first_line;
+ int first_column;
+ int last_line;
+ int last_column;
+ char *filename;
+ @} YYLTYPE;
+@}
+
+%code @{
+ static void print_token_value (FILE *, int, YYSTYPE);
+ #define YYPRINT(F, N, L) print_token_value (F, N, L)
+ static void trace_token (enum yytokentype token, YYLTYPE loc);
+@}
+
+@dots{}
+@end smallexample
+
+@noindent
+Now Bison will insert @code{#include "ptypes.h"} and the new @code{YYLTYPE}
+definition before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE}
+definitions in both the parser source code file and the parser header file.
+(By the same reasoning, @code{%code requires} would also be the appropriate
+place to write your own definition for @code{YYSTYPE}.)
+
+When you are writing dependency code for @code{YYSTYPE} and @code{YYLTYPE}, you
+should prefer @code{%code requires} over @code{%code top} regardless of whether
+you instruct Bison to generate a parser header file.
+When you are writing code that you need Bison to insert only into the parser
+source code file and that has no special need to appear at the top of that
+file, you should prefer the unqualified @code{%code} over @code{%code top}.
+These practices will make the purpose of each block of your code explicit to
+Bison and to other developers reading your grammar file.
+Following these practices, we expect the unqualified @code{%code} and
+@code{%code requires} to be the most important of the four @var{Prologue}
+alternatives.
+
+At some point while developing your parser, you might decide to provide
+@code{trace_token} to modules that are external to your parser.
+Thus, you might wish for Bison to insert the prototype into both the parser
+header file and the parser source code file.
+Since this function is not a dependency required by @code{YYSTYPE} or
+@code{YYLTYPE}, it doesn't make sense to move its prototype to a
+@code{%code requires}.
+More importantly, since it depends upon @code{YYLTYPE} and @code{yytokentype},
+@code{%code requires} is not sufficient.
+Instead, move its prototype from the unqualified @code{%code} to a
+@code{%code provides}:
+
+@smallexample
+%code top @{
+ #define _GNU_SOURCE
+ #include <stdio.h>
+@}
+
+%code requires @{
+ #include "ptypes.h"
+@}
+%union @{
+ long int n;
+ tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
+@}
+
+%code requires @{
+ #define YYLTYPE YYLTYPE
+ typedef struct YYLTYPE
+ @{
+ int first_line;
+ int first_column;
+ int last_line;
+ int last_column;
+ char *filename;
+ @} YYLTYPE;
+@}
+
+%code provides @{
+ void trace_token (enum yytokentype token, YYLTYPE loc);
+@}
+
+%code @{
+ static void print_token_value (FILE *, int, YYSTYPE);
+ #define YYPRINT(F, N, L) print_token_value (F, N, L)
+@}
+
+@dots{}
+@end smallexample
+
+@noindent
+Bison will insert the @code{trace_token} prototype into both the parser header
+file and the parser source code file after the definitions for
+@code{yytokentype}, @code{YYLTYPE}, and @code{YYSTYPE}.
+
+The above examples are careful to write directives in an order that reflects
+the layout of the generated parser source code and header files:
+@code{%code top}, @code{%code requires}, @code{%code provides}, and then
+@code{%code}.
+While your grammar files may generally be easier to read if you also follow
+this order, Bison does not require it.
+Instead, Bison lets you choose an organization that makes sense to you.
+
+You may declare any of these directives multiple times in the grammar file.
+In that case, Bison concatenates the contained code in declaration order.
+This is the only way in which the position of one of these directives within
+the grammar file affects its functionality.
+
+The result of the previous two properties is greater flexibility in how you may
+organize your grammar file.
+For example, you may organize semantic-type-related directives by semantic
+type:
+
+@smallexample
+%code requires @{ #include "type1.h" @}
+%union @{ type1 field1; @}
+%destructor @{ type1_free ($$); @} <field1>
+%printer @{ type1_print ($$); @} <field1>
+
+%code requires @{ #include "type2.h" @}
+%union @{ type2 field2; @}
+%destructor @{ type2_free ($$); @} <field2>
+%printer @{ type2_print ($$); @} <field2>
+@end smallexample
+
+@noindent
+You could even place each of the above directive groups in the rules section of
+the grammar file next to the set of rules that uses the associated semantic
+type.
+(In the rules section, you must terminate each of those directives with a
+semicolon.)
+And you don't have to worry that some directive (like a @code{%union}) in the
+definitions section is going to adversely affect their functionality in some
+counter-intuitive manner just because it comes first.
+Such an organization is not possible using @var{Prologue} sections.
+
+This section has been concerned with explaining the advantages of the four
+@var{Prologue} alternatives over the original Yacc @var{Prologue}.
+However, in most cases when using these directives, you shouldn't need to
+think about all the low-level ordering issues discussed here.
+Instead, you should simply use these directives to label each block of your
+code according to its purpose and let Bison handle the ordering.
+@code{%code} is the most generic label.
+Move code to @code{%code requires}, @code{%code provides}, or @code{%code top}
+as needed.
+
@node Bison Declarations
@subsection The Bison Declarations Section
@cindex Bison declarations (introduction)
@acronym{RPN} and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish
Notation Calculator}).
-Bison's default is to use type @code{int} for all semantic values. To
+Bison normally uses the type @code{int} for semantic values if your
+program uses the same data type for all language constructs. To
specify some other type, define @code{YYSTYPE} as a macro, like this:
@example
@itemize @bullet
@item
-Specify the entire collection of possible data types, with the
+Specify the entire collection of possible data types, either by using the
@code{%union} Bison declaration (@pxref{Union Decl, ,The Collection of
-Value Types}).
+Value Types}), or by using a @code{typedef} or a @code{#define} to
+define @code{YYSTYPE} to be a union type whose member names are
+the type tags.
@item
Choose one of those types for each symbol (terminal or nonterminal) for
restoring it.
Thus, @code{$<context>5} needs a destructor (@pxref{Destructor Decl, , Freeing
Discarded Symbols}).
-However, Bison currently provides no means to declare a destructor for a
-mid-rule action's semantic value.
+However, Bison currently provides no means to declare a destructor specific to
+a particular mid-rule action's semantic value.
One solution is to bury the mid-rule action inside a nonterminal symbol and to
declare a destructor for that symbol:
You can specify the type of locations by defining a macro called
@code{YYLTYPE}, just as you can specify the semantic value type by
-defining @code{YYSTYPE} (@pxref{Value Type}).
+defining a @code{YYSTYPE} macro (@pxref{Value Type}).
When @code{YYLTYPE} is not defined, Bison uses a default structure type with
four members:
@} YYLTYPE;
@end example
+At the beginning of the parsing, Bison initializes all these fields to 1
+for @code{yylloc}.
+
@node Actions and Locations
@subsection Actions and Locations
@cindex location actions
Note that, unlike making a @code{union} declaration in C, you need not write
a semicolon after the closing brace.
+Instead of @code{%union}, you can define and use your own union type
+@code{YYSTYPE} if your grammar contains at least one
+@samp{<@var{type}>} tag. For example, you can put the following into
+a header file @file{parser.h}:
+
+@example
+@group
+union YYSTYPE @{
+ double val;
+ symrec *tptr;
+@};
+typedef union YYSTYPE YYSTYPE;
+@end group
+@end example
+
+@noindent
+and then your grammar can use the following
+instead of @code{%union}:
+
+@example
+@group
+%@{
+#include "parser.h"
+%@}
+%type <val> expr
+%token <tptr> ID
+@end group
+@end example
+
@node Type Decl
@subsection Nonterminal Symbols
@cindex declaring value types, nonterminals
@subsection Freeing Discarded Symbols
@cindex freeing discarded symbols
@findex %destructor
-
+@findex <*>
+@findex <>
During error recovery (@pxref{Error Recovery}), symbols already pushed
on the stack and tokens coming from the rest of the file are discarded
until the parser falls on its feet. If the parser runs out of memory,
Invoke the braced @var{code} whenever the parser discards one of the
@var{symbols}.
Within @var{code}, @code{$$} designates the semantic value associated
-with the discarded symbol. The additional parser parameters are also
-available (@pxref{Parser Function, , The Parser Function
-@code{yyparse}}).
+with the discarded symbol, and @code{@@$} designates its location.
+The additional parser parameters are also available (@pxref{Parser Function, ,
+The Parser Function @code{yyparse}}).
+
+When a symbol is listed among @var{symbols}, its @code{%destructor} is called a
+per-symbol @code{%destructor}.
+You may also define a per-type @code{%destructor} by listing a semantic type
+tag among @var{symbols}.
+In that case, the parser will invoke this @var{code} whenever it discards any
+grammar symbol that has that semantic type tag unless that symbol has its own
+per-symbol @code{%destructor}.
+
+Finally, you can define two different kinds of default @code{%destructor}s.
+(These default forms are experimental.
+More user feedback will help to determine whether they should become permanent
+features.)
+You can place each of @code{<*>} and @code{<>} in the @var{symbols} list of
+exactly one @code{%destructor} declaration in your grammar file.
+The parser will invoke the @var{code} associated with one of these whenever it
+discards any user-defined grammar symbol that has no per-symbol and no per-type
+@code{%destructor}.
+The parser uses the @var{code} for @code{<*>} in the case of such a grammar
+symbol for which you have formally declared a semantic type tag (@code{%type}
+counts as such a declaration, but @code{$<tag>$} does not).
+The parser uses the @var{code} for @code{<>} in the case of such a grammar
+symbol that has no declared semantic type tag.
@end deffn
-For instance:
+@noindent
+For example:
@smallexample
-%union
-@{
- char *string;
-@}
-%token <string> STRING
-%type <string> string
-%destructor @{ free ($$); @} STRING string
+%union @{ char *string; @}
+%token <string> STRING1
+%token <string> STRING2
+%type <string> string1
+%type <string> string2
+%union @{ char character; @}
+%token <character> CHR
+%type <character> chr
+%token TAGLESS
+
+%destructor @{ @} <character>
+%destructor @{ free ($$); @} <*>
+%destructor @{ free ($$); printf ("%d", @@$.first_line); @} STRING1 string1
+%destructor @{ printf ("Discarding tagless symbol.\n"); @} <>
@end smallexample
@noindent
-guarantees that when a @code{STRING} or a @code{string} is discarded,
-its associated memory will be freed.
+guarantees that, when the parser discards any user-defined symbol that has a
+semantic type tag other than @code{<character>}, it passes its semantic value
+to @code{free} by default.
+However, when the parser discards a @code{STRING1} or a @code{string1}, it also
+prints its line number to @code{stdout}.
+It performs only the second @code{%destructor} in this case, so it invokes
+@code{free} only once.
+Finally, the parser merely prints a message whenever it discards any symbol,
+such as @code{TAGLESS}, that has no semantic type tag.
+
+A Bison-generated parser invokes the default @code{%destructor}s only for
+user-defined as opposed to Bison-defined symbols.
+For example, the parser will not invoke either kind of default
+@code{%destructor} for the special Bison-defined symbols @code{$accept},
+@code{$undefined}, or @code{$end} (@pxref{Table of Symbols, ,Bison Symbols}),
+none of which you can reference in your grammar.
+It also will not invoke either for the @code{error} token (@pxref{Table of
+Symbols, ,error}), which is always defined by Bison regardless of whether you
+reference it in your grammar.
+However, it may invoke one of them for the end token (token 0) if you
+redefine it from @code{$end} to, for example, @code{END}:
+
+@smallexample
+%token END 0
+@end smallexample
+
+@cindex actions in mid-rule
+@cindex mid-rule actions
+Finally, Bison will never invoke a @code{%destructor} for an unreferenced
+mid-rule semantic value (@pxref{Mid-Rule Actions,,Actions in Mid-Rule}).
+That is, Bison does not consider a mid-rule to have a semantic value if you do
+not reference @code{$$} in the mid-rule's action or @code{$@var{n}} (where
+@var{n} is the RHS symbol position of the mid-rule) in any later action in that
+rule.
+However, if you do reference either, the Bison-generated parser will invoke the
+@code{<>} @code{%destructor} whenever it discards the mid-rule symbol.
+
+@ignore
+@noindent
+In the future, it may be possible to redefine the @code{error} token as a
+nonterminal that captures the discarded symbols.
+In that case, the parser will invoke the default destructor for it as well.
+@end ignore
@sp 1
@code{YYABORT} or @code{YYACCEPT}, or failed error recovery, or memory
exhaustion.
-Right-hand size symbols of a rule that explicitly triggers a syntax
+Right-hand side symbols of a rule that explicitly triggers a syntax
error via @code{YYERROR} are not discarded automatically. As a rule
of thumb, destructors are invoked only when user actions cannot manage
the memory.
In order to change the behavior of @command{bison}, use the following
directives:
+@deffn {Directive} %code @{@var{code}@}
+@findex %code
+This is the unqualified form of the @code{%code} directive.
+It inserts @var{code} verbatim at a language-dependent default location in the
+output@footnote{The default location is actually skeleton-dependent;
+ writers of non-standard skeletons however should choose the default location
+ consistently with the behavior of the standard Bison skeletons.}.
+
+@cindex Prologue
+For C/C++, the default location is the parser source code
+file after the usual contents of the parser header file.
+Thus, @code{%code} replaces the traditional Yacc prologue,
+@code{%@{@var{code}%@}}, for most purposes.
+For a detailed discussion, see @ref{Prologue Alternatives}.
+
+For Java, the default location is inside the parser class.
+
+(Like all the Yacc prologue alternatives, this directive is experimental.
+More user feedback will help to determine whether it should become a permanent
+feature.)
+@end deffn
+
+@deffn {Directive} %code @var{qualifier} @{@var{code}@}
+This is the qualified form of the @code{%code} directive.
+If you need to specify location-sensitive verbatim @var{code} that does not
+belong at the default location selected by the unqualified @code{%code} form,
+use this form instead.
+
+@var{qualifier} identifies the purpose of @var{code} and thus the location(s)
+where Bison should generate it.
+Not all values of @var{qualifier} are available for all target languages:
+
+@itemize @bullet
+@findex %code requires
+@item requires
+
+@itemize @bullet
+@item Language(s): C, C++
+
+@item Purpose: This is the best place to write dependency code required for
+@code{YYSTYPE} and @code{YYLTYPE}.
+In other words, it's the best place to define types referenced in @code{%union}
+directives, and it's the best place to override Bison's default @code{YYSTYPE}
+and @code{YYLTYPE} definitions.
+
+@item Location(s): The parser header file and the parser source code file
+before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE} definitions.
+@end itemize
+
+@item provides
+@findex %code provides
+
+@itemize @bullet
+@item Language(s): C, C++
+
+@item Purpose: This is the best place to write additional definitions and
+declarations that should be provided to other modules.
+
+@item Location(s): The parser header file and the parser source code file after
+the Bison-generated @code{YYSTYPE}, @code{YYLTYPE}, and token definitions.
+@end itemize
+
+@item top
+@findex %code top
+
+@itemize @bullet
+@item Language(s): C, C++
+
+@item Purpose: The unqualified @code{%code} or @code{%code requires} should
+usually be more appropriate than @code{%code top}.
+However, occasionally it is necessary to insert code much nearer the top of the
+parser source code file.
+For example:
+
+@smallexample
+%code top @{
+ #define _GNU_SOURCE
+ #include <stdio.h>
+@}
+@end smallexample
+
+@item Location(s): Near the top of the parser source code file.
+@end itemize
+
+@item imports
+@findex %code imports
+
+@itemize @bullet
+@item Language(s): Java
+
+@item Purpose: This is the best place to write Java import directives.
+
+@item Location(s): The parser Java file after any Java package directive and
+before any class definitions.
+@end itemize
+@end itemize
+
+(Like all the Yacc prologue alternatives, this directive is experimental.
+More user feedback will help to determine whether it should become a permanent
+feature.)
+
+@cindex Prologue
+For a detailed discussion of how to use @code{%code} in place of the
+traditional Yacc prologue for C/C++, see @ref{Prologue Alternatives}.
+@end deffn
+
@deffn {Directive} %debug
In the parser file, define the macro @code{YYDEBUG} to 1 if it is not
already defined, so that the debugging facilities are compiled.
@end deffn
@xref{Tracing, ,Tracing Your Parser}.
+@deffn {Directive} %define @var{variable}
+@deffnx {Directive} %define @var{variable} "@var{value}"
+Define a variable to adjust Bison's behavior.
+The possible choices for @var{variable}, as well as their meanings, depend on
+the selected target language and/or the parser skeleton (@pxref{Decl
+Summary,,%language}).
+
+Bison will warn if a @var{variable} is defined multiple times.
+
+Omitting @code{"@var{value}"} is always equivalent to specifying it as
+@code{""}.
+
+Some @var{variable}s may be used as booleans.
+In this case, Bison will complain if the variable definition does not meet one
+of the following four conditions:
+
+@enumerate
+@item @code{"@var{value}"} is @code{"true"}
+
+@item @code{"@var{value}"} is omitted (or is @code{""}).
+This is equivalent to @code{"true"}.
+
+@item @code{"@var{value}"} is @code{"false"}.
+
+@item @var{variable} is never defined.
+In this case, Bison selects a default value, which may depend on the selected
+target language and/or parser skeleton.
+@end enumerate
+@end deffn
+
@deffn {Directive} %defines
Write a header file containing macro definitions for the token type
names defined in the grammar as well as a few other declarations.
If the parser output file is named @file{@var{name}.c} then this file
is named @file{@var{name}.h}.
-Unless @code{YYSTYPE} is already defined as a macro, the output header
-declares @code{YYSTYPE}. Therefore, if you are using a @code{%union}
+For C parsers, the output header declares @code{YYSTYPE} unless
+@code{YYSTYPE} is already defined as a macro or you have used a
+@code{<@var{type}>} tag without using @code{%union}.
+Therefore, if you are using a @code{%union}
(@pxref{Multiple Types, ,More Than One Value Type}) with components that
require other definitions, or if you have defined a @code{YYSTYPE} macro
+or type definition
(@pxref{Value Type, ,Data Types of Semantic Values}), you need to
arrange for these definitions to be propagated to all modules, e.g., by
putting them in a prerequisite header that is included both by your
If you have also used locations, the output header declares
@code{YYLTYPE} and @code{yylloc} using a protocol similar to that of
-@code{YYSTYPE} and @code{yylval}. @xref{Locations, ,Tracking
+the @code{YYSTYPE} macro and @code{yylval}. @xref{Locations, ,Tracking
Locations}.
This output file is normally essential if you wish to put the definition
typically needs to be able to refer to the above-mentioned declarations
and to the token type codes. @xref{Token Values, ,Semantic Values of
Tokens}.
+
+@findex %code requires
+@findex %code provides
+If you have declared @code{%code requires} or @code{%code provides}, the output
+header also contains their code.
+@xref{Decl Summary, ,%code}.
+@end deffn
+
+@deffn {Directive} %defines @var{defines-file}
+Same as above, but save in the file @var{defines-file}.
@end deffn
@deffn {Directive} %destructor
discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
@end deffn
-@deffn {Directive} %file-prefix="@var{prefix}"
+@deffn {Directive} %file-prefix "@var{prefix}"
Specify a prefix to use for all Bison output file names. The names are
chosen as if the input file were named @file{@var{prefix}.y}.
@end deffn
+@deffn {Directive} %language "@var{language}"
+Specify the programming language for the generated parser. Currently
+supported languages include C and C++.
+@var{language} is case-insensitive.
+@end deffn
+
@deffn {Directive} %locations
Generate the code processing the locations (@pxref{Action Features,
,Special Features for Use in Actions}). This mode is enabled as soon as
accurate syntax error messages.
@end deffn
-@deffn {Directive} %name-prefix="@var{prefix}"
+@deffn {Directive} %name-prefix "@var{prefix}"
Rename the external symbols used in the parser so that they start with
@var{prefix} instead of @samp{yy}. The precise list of symbols renamed
in C parsers
is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs},
@code{yylval}, @code{yychar}, @code{yydebug}, and
(if locations are used) @code{yylloc}. For example, if you use
-@samp{%name-prefix="c_"}, the names become @code{c_parse}, @code{c_lex},
+@samp{%name-prefix "c_"}, the names become @code{c_parse}, @code{c_lex},
and so on. In C++ parsers, it is only the surrounding namespace which is
named @var{prefix} instead of @samp{yy}.
@xref{Multiple Parsers, ,Multiple Parsers in the Same Program}.
@end deffn
@end ifset
-@deffn {Directive} %no-parser
-Do not include any C code in the parser file; generate tables only. The
-parser file contains just @code{#define} directives and static variable
-declarations.
-
-This option also tells Bison to write the C code for the grammar actions
-into a file named @file{@var{file}.act}, in the form of a
-brace-surrounded body fit for a @code{switch} statement.
-@end deffn
-
@deffn {Directive} %no-lines
Don't generate any @code{#line} preprocessor commands in the parser
file. Ordinarily Bison writes these commands in the parser file so that
file in its own right.
@end deffn
-@deffn {Directive} %output="@var{file}"
+@deffn {Directive} %output "@var{file}"
Specify @var{file} for the parser file.
@end deffn
Require a Version of Bison}.
@end deffn
+@deffn {Directive} %skeleton "@var{file}"
+Specify the skeleton to use.
+
+You probably don't need this option unless you are developing Bison.
+You should use @code{%language} if you want to specify the skeleton for a
+different language, because it is clearer and because it will always choose the
+correct skeleton for non-deterministic or push parsers.
+
+If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton
+file in the Bison installation directory.
+If it does, @var{file} is an absolute file name or a file name relative to the
+directory of the grammar file.
+This is similar to how most shells resolve commands.
+@end deffn
+
@deffn {Directive} %token-table
Generate an array of token names in the parser file. The name of the
array is @code{yytname}; @code{yytname[@var{i}]} is the name of the
Bison parsers are @dfn{shift/reduce automata}. In some cases (much more
frequent than one would hope), looking at this automaton is required to
tune or simply fix a parser. Bison provides two different
-representation of it, either textually or graphically (as a @acronym{VCG}
-file).
+representation of it, either textually or graphically (as a DOT file).
The textual file is generated when the options @option{--report} or
@option{--verbose} are specified, see @xref{Invocation, , Invoking
The trace facility outputs messages with macro calls of the form
@code{YYFPRINTF (stderr, @var{format}, @var{args})} where
-@var{format} and @var{args} are the usual @code{printf} format and
+@var{format} and @var{args} are the usual @code{printf} format and variadic
arguments. If you define @code{YYDEBUG} to a nonzero value but do not
define @code{YYFPRINTF}, @code{<stdio.h>} is automatically included
-and @code{YYPRINTF} is defined to @code{fprintf}.
+and @code{YYFPRINTF} is defined to @code{fprintf}.
Once you have compiled the program with trace facilities, the way to
request a trace is to store a nonzero value in the variable @code{yydebug}.
Tuning the parser:
@table @option
-@item -S @var{file}
-@itemx --skeleton=@var{file}
-Specify the skeleton to use. You probably don't need this option unless
-you are developing Bison.
-
@item -t
@itemx --debug
In the parser file, define the macro @code{YYDEBUG} to 1 if it is not
already defined, so that the debugging facilities are compiled.
@xref{Tracing, ,Tracing Your Parser}.
+@item -L @var{language}
+@itemx --language=@var{language}
+Specify the programming language for the generated parser, as if
+@code{%language} was specified (@pxref{Decl Summary, , Bison Declaration
+Summary}). Currently supported languages include C and C++.
+@var{language} is case-insensitive.
+
@item --locations
Pretend that @code{%locations} was specified. @xref{Decl Summary}.
@item -p @var{prefix}
@itemx --name-prefix=@var{prefix}
-Pretend that @code{%name-prefix="@var{prefix}"} was specified.
+Pretend that @code{%name-prefix "@var{prefix}"} was specified.
@xref{Decl Summary}.
@item -l
grammar file. This option causes them to associate errors with the
parser file, treating it as an independent source file in its own right.
-@item -n
-@itemx --no-parser
-Pretend that @code{%no-parser} was specified. @xref{Decl Summary}.
+@item -S @var{file}
+@itemx --skeleton=@var{file}
+Specify the skeleton to use, similar to @code{%skeleton}
+(@pxref{Decl Summary, , Bison Declaration Summary}).
+
+You probably don't need this option unless you are developing Bison.
+You should use @option{--language} if you want to specify the skeleton for a
+different language, because it is clearer and because it will always
+choose the correct skeleton for non-deterministic or push parsers.
+
+If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton
+file in the Bison installation directory.
+If it does, @var{file} is an absolute file name or a file name relative to the
+current working directory.
+This is similar to how most shells resolve commands.
@item -k
@itemx --token-table
@item -b @var{file-prefix}
@itemx --file-prefix=@var{prefix}
-Pretend that @code{%file-prefix} was specified, i.e, specify prefix to use
+Pretend that @code{%file-prefix} was specified, i.e., specify prefix to use
for all Bison output file names. @xref{Decl Summary}.
@item -r @var{things}
@item -v
@itemx --verbose
-Pretend that @code{%verbose} was specified, i.e, write an extra output
+Pretend that @code{%verbose} was specified, i.e., write an extra output
file containing verbose descriptions of the grammar and
parser. @xref{Decl Summary}.
described under the @samp{-v} and @samp{-d} options.
@item -g
-Output a @acronym{VCG} definition of the @acronym{LALR}(1) grammar
-automaton computed by Bison. If the grammar file is @file{foo.y}, the
-@acronym{VCG} output file will
-be @file{foo.vcg}.
+Output a graphical representation of the @acronym{LALR}(1) grammar
+automaton computed by Bison, in @uref{http://www.graphviz.org/, Graphviz}
+@uref{http://www.graphviz.org/doc/info/lang.html, @acronym{DOT}} format.
+If the grammar file is @file{foo.y}, the output file will
+be @file{foo.dot}.
@item --graph=@var{graph-file}
The behavior of @var{--graph} is the same than @samp{-g}. The only
@item @option{--help} @tab @option{-h}
@item @option{--name-prefix=@var{prefix}} @tab @option{-p @var{name-prefix}}
@item @option{--no-lines} @tab @option{-l}
-@item @option{--no-parser} @tab @option{-n}
@item @option{--output=@var{outfile}} @tab @option{-o @var{outfile}}
@item @option{--print-localedir} @tab
@item @option{--token-table} @tab @option{-k}
@c ================================================= C++ Bison
-@node C++ Language Interface
-@chapter C++ Language Interface
+@node Other Languages
+@chapter Parsers Written In Other Languages
@menu
* C++ Parsers:: The interface to generate C++ parser classes
-* A Complete C++ Example:: Demonstrating their use
+* Java Parsers:: The interface to generate Java parser classes
@end menu
@node C++ Parsers
* C++ Location Values:: The position and location classes
* C++ Parser Interface:: Instantiating and running the parser
* C++ Scanner Interface:: Exchanges between yylex and parse
+* A Complete C++ Example:: Demonstrating their use
@end menu
@node C++ Bison Interface
@subsection C++ Bison Interface
-@c - %skeleton "lalr1.cc"
+@c - %language "C++"
@c - Always pure
@c - initial action
-The C++ parser @acronym{LALR}(1) skeleton is named @file{lalr1.cc}. To
-select it, you may either pass the option @option{--skeleton=lalr1.cc}
-to Bison, or include the directive @samp{%skeleton "lalr1.cc"} in the
-grammar preamble. When run, @command{bison} will create several
+The C++ @acronym{LALR}(1) parser is selected using the language directive,
+@samp{%language "C++"}, or the synonymous command-line option
+@option{--language=c++}.
+@xref{Decl Summary}.
+
+When run, @command{bison} will create several
entities in the @samp{yy} namespace. Use the @samp{%name-prefix}
directive to change the namespace name, see @ref{Decl Summary}. The
various classes are generated in the following files:
@node C++ Semantic Values
@subsection C++ Semantic Values
@c - No objects in unions
-@c - YSTYPE
+@c - YYSTYPE
@c - Printer and destructor
The @code{%union} directive works as for C, see @ref{Union Decl, ,The
@c - %locations
@c - class Position
@c - class Location
-@c - %define "filename_type" "const symbol::Symbol"
+@c - %define filename_type "const symbol::Symbol"
When the directive @code{%locations} is used, the C++ parser supports
location tracking, see @ref{Locations, , Locations Overview}. Two
The name of the file. It will always be handled as a pointer, the
parser will never duplicate nor deallocate it. As an experimental
feature you may change it to @samp{@var{type}*} using @samp{%define
-"filename_type" "@var{type}"}.
+filename_type "@var{type}"}.
@end deftypemethod
@deftypemethod {position} {unsigned int} line
The output files @file{@var{output}.hh} and @file{@var{output}.cc}
declare and define the parser class in the namespace @code{yy}. The
class name defaults to @code{parser}, but may be changed using
-@samp{%define "parser_class_name" "@var{name}"}. The interface of
+@samp{%define parser_class_name "@var{name}"}. The interface of
this class is detailed below. It can be extended using the
@code{%parse-param} feature: its semantics is slightly changed since
it describes an additional member of the parser class, and an
@node A Complete C++ Example
-@section A Complete C++ Example
+@subsection A Complete C++ Example
This section demonstrates the use of a C++ parser with a simple but
complete example. This example should be available on your system,
@end menu
@node Calc++ --- C++ Calculator
-@subsection Calc++ --- C++ Calculator
+@subsubsection Calc++ --- C++ Calculator
Of course the grammar is dedicated to arithmetics, a single
expression, possibly preceded by variable assignments. An
@end example
@node Calc++ Parsing Driver
-@subsection Calc++ Parsing Driver
+@subsubsection Calc++ Parsing Driver
@c - An env
@c - A place to store error messages
@c - A place for the result
@comment file: calc++-driver.hh
@example
-// Announce to Flex the prototype we want for lexing function, ...
-# define YY_DECL \
+// Tell Flex the lexer's prototype ...
+# define YY_DECL \
yy::calcxx_parser::token_type \
yylex (yy::calcxx_parser::semantic_type* yylval, \
yy::calcxx_parser::location_type* yylloc, \
@noindent
To encapsulate the coordination with the Flex scanner, it is useful to
have two members function to open and close the scanning phase.
-members.
@comment file: calc++-driver.hh
@example
@comment file: calc++-driver.hh
@example
- // Handling the parser.
- void parse (const std::string& f);
+ // Run the parser. Return 0 on success.
+ int parse (const std::string& f);
std::string file;
bool trace_parsing;
@end example
@{
@}
-void
+int
calcxx_driver::parse (const std::string &f)
@{
file = f;
scan_begin ();
yy::calcxx_parser parser (*this);
parser.set_debug_level (trace_parsing);
- parser.parse ();
+ int res = parser.parse ();
scan_end ();
+ return res;
@}
void
@end example
@node Calc++ Parser
-@subsection Calc++ Parser
+@subsubsection Calc++ Parser
The parser definition file @file{calc++-parser.yy} starts by asking for
the C++ LALR(1) skeleton, the creation of the parser header file, and
@comment file: calc++-parser.yy
@example
-%skeleton "lalr1.cc" /* -*- C++ -*- */
-%require "2.1a"
+%language "C++" /* -*- C++ -*- */
+%require "@value{VERSION}"
%defines
-%define "parser_class_name" "calcxx_parser"
+%define parser_class_name "calcxx_parser"
@end example
@noindent
+@findex %code requires
Then come the declarations/inclusions needed to define the
@code{%union}. Because the parser uses the parsing driver and
reciprocally, both cannot include the header of the other. Because the
driver's header needs detailed knowledge about the parser class (in
particular its inner types), it is the parser's header which will simply
use a forward declaration of the driver.
+@xref{Decl Summary, ,%code}.
@comment file: calc++-parser.yy
@example
-%@{
+%code requires @{
# include <string>
class calcxx_driver;
-%@}
+@}
@end example
@noindent
@end example
@noindent
-The code between @samp{%@{} and @samp{%@}} after the introduction of the
-@samp{%union} is output in the @file{*.cc} file; it needs detailed
-knowledge about the driver.
+@findex %code
+The code between @samp{%code @{} and @samp{@}} is output in the
+@file{*.cc} file; it needs detailed knowledge about the driver.
@comment file: calc++-parser.yy
@example
-%@{
+%code @{
# include "calc++-driver.hh"
-%@}
+@}
@end example
%token ASSIGN ":="
%token <sval> IDENTIFIER "identifier"
%token <ival> NUMBER "number"
-%type <ival> exp "expression"
+%type <ival> exp
@end example
@noindent
%printer @{ debug_stream () << *$$; @} "identifier"
%destructor @{ delete $$; @} "identifier"
-%printer @{ debug_stream () << $$; @} "number" "expression"
+%printer @{ debug_stream () << $$; @} <ival>
@end example
@noindent
assignments: assignments assignment @{@}
| /* Nothing. */ @{@};
-assignment: "identifier" ":=" exp @{ driver.variables[*$1] = $3; @};
+assignment:
+ "identifier" ":=" exp
+ @{ driver.variables[*$1] = $3; delete $1; @};
%left '+' '-';
%left '*' '/';
| exp '-' exp @{ $$ = $1 - $3; @}
| exp '*' exp @{ $$ = $1 * $3; @}
| exp '/' exp @{ $$ = $1 / $3; @}
- | "identifier" @{ $$ = driver.variables[*$1]; @}
+ | "identifier" @{ $$ = driver.variables[*$1]; delete $1; @}
| "number" @{ $$ = $1; @};
%%
@end example
@end example
@node Calc++ Scanner
-@subsection Calc++ Scanner
+@subsubsection Calc++ Scanner
The Flex scanner first includes the driver declaration, then the
parser's to get the set of defined tokens.
calcxx_driver::scan_begin ()
@{
yy_flex_debug = trace_scanning;
- if (!(yyin = fopen (file.c_str (), "r")))
- error (std::string ("cannot open ") + file);
+ if (file == "-")
+ yyin = stdin;
+ else if (!(yyin = fopen (file.c_str (), "r")))
+ @{
+ error (std::string ("cannot open ") + file);
+ exit (1);
+ @}
@}
void
@end example
@node Calc++ Top Level
-@subsection Calc++ Top Level
+@subsubsection Calc++ Top Level
The top level file, @file{calc++.cc}, poses no problem.
driver.trace_parsing = true;
else if (*argv == std::string ("-s"))
driver.trace_scanning = true;
- else
- @{
- driver.parse (*argv);
- std::cout << driver.result << std::endl;
- @}
+ else if (!driver.parse (*argv))
+ std::cout << driver.result << std::endl;
@}
@end example
+@node Java Parsers
+@section Java Parsers
+
+@menu
+* Java Bison Interface:: Asking for Java parser generation
+* Java Semantic Values:: %type and %token vs. Java
+* Java Location Values:: The position and location classes
+* Java Parser Interface:: Instantiating and running the parser
+* Java Scanner Interface:: Java scanners, and pure parsers
+* Java Differences:: Differences between C/C++ and Java Grammars
+@end menu
+
+@node Java Bison Interface
+@subsection Java Bison Interface
+@c - %language "Java"
+@c - initial action
+
+The Java parser skeletons are selected using a language directive,
+@samp{%language "Java"}, or the synonymous command-line option
+@option{--language=java}.
+
+When run, @command{bison} will create several entities whose name
+starts with @samp{YY}. Use the @samp{%name-prefix} directive to
+change the prefix, see @ref{Decl Summary}; classes can be placed
+in an arbitrary Java package using a @samp{%define package} section.
+
+The parser class defines an inner class, @code{Location}, that is used
+for location tracking. If the parser is pure, it also defines an
+inner interface, @code{Lexer}; see~@ref{Java Scanner Interface} for the
+meaning of pure parsers when the Java language is chosen. Other than
+these inner class/interface, and the members described in~@ref{Java
+Parser Interface}, all the other members and fields are preceded
+with a @code{yy} prefix to avoid clashes with user code.
+
+No header file can be generated for Java parsers; you must not pass
+@option{-d}/@option{--defines} to @command{bison}, nor use the
+@samp{%defines} directive.
+
+By default, the @samp{YYParser} class has package visibility. A
+declaration @samp{%define "public"} will change to public visibility.
+Remember that, according to the Java language specification, the name
+of the @file{.java} file should match the name of the class in this
+case.
+
+Similarly, a declaration @samp{%define "abstract"} will make your
+class abstract.
+
+You can create documentation for generated parsers using Javadoc.
+
+@node Java Semantic Values
+@subsection Java Semantic Values
+@c - No %union, specify type in %type/%token.
+@c - YYSTYPE
+@c - Printer and destructor
+
+There is no @code{%union} directive in Java parsers. Instead, the
+semantic values' types (class names) should be specified in the
+@code{%type} or @code{%token} directive:
+
+@example
+%type <Expression> expr assignment_expr term factor
+%type <Integer> number
+@end example
+
+By default, the semantic stack is declared to have @code{Object} members,
+which means that the class types you specify can be of any class.
+To improve the type safety of the parser, you can declare the common
+superclass of all the semantic values using the @samp{%define} directive.
+For example, after the following declaration:
+
+@example
+%define "stype" "ASTNode"
+@end example
+
+@noindent
+any @code{%type} or @code{%token} specifying a semantic type which
+is not a subclass of ASTNode, will cause a compile-time error.
+
+Types used in the directives may be qualified with a package name.
+Primitive data types are accepted for Java version 1.5 or later. Note
+that in this case the autoboxing feature of Java 1.5 will be used.
+
+Java parsers do not support @code{%destructor}, since the language
+adopts garbage collection. The parser will try to hold references
+to semantic values for as little time as needed.
+
+Java parsers do not support @code{%printer}, as @code{toString()}
+can be used to print the semantic values. This however may change
+(in a backwards-compatible way) in future versions of Bison.
+
+
+@node Java Location Values
+@subsection Java Location Values
+@c - %locations
+@c - class Position
+@c - class Location
+
+When the directive @code{%locations} is used, the Java parser
+supports location tracking, see @ref{Locations, , Locations Overview}.
+An auxiliary user-defined class defines a @dfn{position}, a single point
+in a file; Bison itself defines a class representing a @dfn{location},
+a range composed of a pair of positions (possibly spanning several
+files). The location class is an inner class of the parser; the name
+is @code{Location} by default, may also be renamed using @code{%define
+"location_type" "@var{class-name}}.
+
+The location class treats the position as a completely opaque value.
+By default, the class name is @code{Position}, but this can be changed
+with @code{%define "position_type" "@var{class-name}"}.
+
+
+@deftypemethod {Location} {Position} begin
+@deftypemethodx {Location} {Position} end
+The first, inclusive, position of the range, and the first beyond.
+@end deftypemethod
+
+@deftypemethod {Location} {void} toString ()
+Prints the range represented by the location. For this to work
+properly, the position class should override the @code{equals} and
+@code{toString} methods appropriately.
+@end deftypemethod
+
+
+@node Java Parser Interface
+@subsection Java Parser Interface
+@c - define parser_class_name
+@c - Ctor
+@c - parse, error, set_debug_level, debug_level, set_debug_stream,
+@c debug_stream.
+@c - Reporting errors
+
+The output file defines the parser class in the package optionally
+indicated in the @code{%define package} section. The class name defaults
+to @code{YYParser}. The @code{YY} prefix may be changed using
+@samp{%name-prefix}; alternatively, you can use @samp{%define
+"parser_class_name" "@var{name}"} to give a custom name to the class.
+The interface of this class is detailed below. It can be extended using
+the @code{%parse-param} directive; each occurrence of the directive will
+add a field to the parser class, and an argument to its constructor.
+
+@deftypemethod {YYParser} {} YYParser (@var{type1} @var{arg1}, ...)
+Build a new parser object. There are no arguments by default, unless
+@samp{%parse-param @{@var{type1} @var{arg1}@}} was used.
+@end deftypemethod
+
+@deftypemethod {YYParser} {boolean} parse ()
+Run the syntactic analysis, and return @code{true} on success,
+@code{false} otherwise.
+@end deftypemethod
+
+@deftypemethod {YYParser} {boolean} recovering ()
+During the syntactic analysis, return @code{true} if recovering
+from a syntax error. @xref{Error Recovery}.
+@end deftypemethod
+
+@deftypemethod {YYParser} {java.io.PrintStream} getDebugStream ()
+@deftypemethodx {YYParser} {void} setDebugStream (java.io.printStream @var{o})
+Get or set the stream used for tracing the parsing. It defaults to
+@code{System.err}.
+@end deftypemethod
+
+@deftypemethod {YYParser} {int} getDebugLevel ()
+@deftypemethodx {YYParser} {void} setDebugLevel (int @var{l})
+Get or set the tracing level. Currently its value is either 0, no trace,
+or nonzero, full tracing.
+@end deftypemethod
+
+@deftypemethod {YYParser} {void} error (Location @var{l}, String @var{m})
+The definition for this member function must be supplied by the user
+in the same way as the scanner interface (@pxref{Java Scanner
+Interface}); the parser uses it to report a parser error occurring at
+@var{l}, described by @var{m}.
+@end deftypemethod
+
+
+@node Java Scanner Interface
+@subsection Java Scanner Interface
+@c - %code lexer
+@c - %lex-param
+@c - Lexer interface
+
+Contrary to C parsers, Java parsers do not use global variables; the
+state of the parser is always local to an instance of the parser class.
+Therefore, all Java parsers are ``pure'', and the @code{%pure-parser}
+directive does not do anything when used in Java.
+
+The scanner always resides in a separate class than the parser.
+Still, Java also two possible ways to interface a Bison-generated Java
+parser with a scanner, that is, the scanner may reside in a separate file
+than the Bison grammar, or in the same file. The interface
+to the scanner is similar in the two cases.
+
+In the first case, where the scanner in the same file as the grammar, the
+scanner code has to be placed in @code{%code lexer} blocks. If you want
+to pass parameters from the parser constructor to the scanner constructor,
+specify them with @code{%lex-param}; they are passed before
+@code{%parse-param}s to the constructor.
+
+In the second case, the scanner has to implement interface @code{Lexer},
+which is defined within the parser class (e.g., @code{YYParser.Lexer}).
+The constructor of the parser object will then accept an object
+implementing the interface; @code{%lex-param} is not used in this
+case.
+
+In both cases, the scanner has to implement the following methods.
+
+@deftypemethod {Lexer} {void} yyerror (Location @var{l}, String @var{m})
+As explained in @pxref{Java Parser Interface}, this method is defined
+by the user to emit an error message. The first parameter is omitted
+if location tracking is not active. Its type can be changed using
+@samp{%define "location_type" "@var{class-name}".}
+@end deftypemethod
+
+@deftypemethod {Lexer} {int} yylex (@var{type1} @var{arg1}, ...)
+Return the next token. Its type is the return value, its semantic
+value and location are saved and returned by the ther methods in the
+interface. Invocations of @samp{%lex-param @{@var{type1}
+@var{arg1}@}} yield additional arguments.
+@end deftypemethod
+
+@deftypemethod {Lexer} {Position} getStartPos ()
+@deftypemethodx {Lexer} {Position} getEndPos ()
+Return respectively the first position of the last token that
+@code{yylex} returned, and the first position beyond it. These
+methods are not needed unless location tracking is active.
+
+The return type can be changed using @samp{%define "position_type"
+"@var{class-name}".}
+@end deftypemethod
+
+@deftypemethod {Lexer} {Object} getLVal ()
+Return respectively the first position of the last token that yylex
+returned, and the first position beyond it.
+
+The return type can be changed using @samp{%define "stype"
+"@var{class-name}".}
+@end deftypemethod
+
+
+If @code{%pure-parser} is not specified, the lexer interface
+resides in the same class (@code{YYParser}) as the Bison-generated
+parser. The fields and methods that are provided to
+this end are as follows.
+
+@deftypemethod {YYParser} {void} error (Location @var{l}, String @var{m})
+As explained in @pxref{Java Parser Interface}, this method is defined
+by the user to emit an error message. The first parameter is not used
+unless location tracking is active. Its type can be changed using
+@samp{%define "location_type" "@var{class-name}".}
+@end deftypemethod
+
+@deftypemethod {YYParser} {int} yylex (@var{type1} @var{arg1}, ...)
+Return the next token. Its type is the return value, its semantic
+value and location are saved into @code{yylval}, @code{yystartpos},
+@code{yyendpos}. Invocations of @samp{%lex-param @{@var{type1}
+@var{arg1}@}} yield additional arguments.
+@end deftypemethod
+
+@deftypecv {Field} {YYParser} Position yystartpos
+@deftypecvx {Field} {YYParser} Position yyendpos
+Contain respectively the first position of the last token that yylex
+returned, and the first position beyond it. These methods are not
+needed unless location tracking is active.
+
+The field's type can be changed using @samp{%define "position_type"
+"@var{class-name}".}
+@end deftypecv
+
+@deftypecv {Field} {YYParser} Object yylval
+Return respectively the first position of the last token that yylex
+returned, and the first position beyond it.
+
+The field's type can be changed using @samp{%define "stype"
+"@var{class-name}".}
+@end deftypecv
+
+@node Java Differences
+@subsection Differences between C/C++ and Java Grammars
+
+The different structure of the Java language forces several differences
+between C/C++ grammars, and grammars designed for Java parsers. This
+section summarizes these differences.
+
+@itemize
+@item
+Java lacks a preprocessor, so the @code{YYERROR}, @code{YYACCEPT},
+@code{YYABORT} symbols (@pxref{Table of Symbols}) cannot obviously be
+macros. Instead, they should be preceded by @code{return} when they
+appear in an action. The actual definition of these symbols is
+opaque to the Bison grammar, and it might change in the future. The
+only meaningful operation that you can do, is to return them.
+
+Note that of these three symbols, only @code{YYACCEPT} and
+@code{YYABORT} will cause a return from the @code{yyparse}
+method@footnote{Java parsers include the actions in a separate
+method than @code{yyparse} in order to have an intuitive syntax that
+corresponds to these C macros.}.
+
+@item
+The prolog declarations have a different meaning than in C/C++ code.
+@table @asis
+@item @code{%code imports}
+blocks are placed at the beginning of the Java source code. They may
+include copyright notices. For a @code{package} declarations, it is
+suggested to use @code{%define package} instead.
+
+@item unqualified @code{%code}
+blocks are placed inside the parser class.
+
+@item @code{%code lexer}
+blocks, if specified, should include the implementation of the
+scanner. If there is no such block, the scanner can be any class
+that implements the appropriate interface (see @pxref{Java Scanner
+Interface}).
+@end table
+
+Other @code{%code} blocks are not supported in Java parsers.
+The epilogue has the same meaning as in C/C++ code and it can
+be used to define other classes used by the parser.
+@end itemize
+
@c ================================================= FAQ
@node FAQ
* I can't build Bison:: Troubleshooting
* Where can I find help?:: Troubleshouting
* Bug Reports:: Troublereporting
-* Other Languages:: Parsers in Java and others
+* More Languages:: Parsers in C++, Java, and so on
* Beta Testing:: Experimenting development versions
* Mailing Lists:: Meeting other Bison users
@end menu
Send bug reports to @email{bug-bison@@gnu.org}.
-@node Other Languages
-@section Other Languages
+@node More Languages
+@section More Languages
@display
-Will Bison ever have C++ support? How about Java or @var{insert your
+Will Bison ever have C++ and Java support? How about @var{insert your
favorite language here}?
@end display
-C++ support is there now, and is documented. We'd love to add other
+C++ and Java support is there now, and is documented. We'd love to add other
languages; contributions are welcome.
@node Beta Testing
@xref{Rules, ,Syntax of Grammar Rules}.
@end deffn
+@deffn {Directive} <*>
+Used to define a default tagged @code{%destructor} or default tagged
+@code{%printer}.
+
+This feature is experimental.
+More user feedback will help to determine whether it should become a permanent
+feature.
+
+@xref{Destructor Decl, , Freeing Discarded Symbols}.
+@end deffn
+
+@deffn {Directive} <>
+Used to define a default tagless @code{%destructor} or default tagless
+@code{%printer}.
+
+This feature is experimental.
+More user feedback will help to determine whether it should become a permanent
+feature.
+
+@xref{Destructor Decl, , Freeing Discarded Symbols}.
+@end deffn
+
@deffn {Symbol} $accept
The predefined nonterminal whose only rule is @samp{$accept: @var{start}
$end}, where @var{start} is the start symbol. @xref{Start Decl, , The
Start-Symbol}. It cannot be used in the grammar.
@end deffn
+@deffn {Directive} %code @{@var{code}@}
+@deffnx {Directive} %code @var{qualifier} @{@var{code}@}
+Insert @var{code} verbatim into output parser source.
+@xref{Decl Summary,,%code}.
+@end deffn
+
+@deffn {Directive} %debug
+Equip the parser for debugging. @xref{Decl Summary}.
+@end deffn
+
@deffn {Directive} %debug
Equip the parser for debugging. @xref{Decl Summary}.
@end deffn
@end deffn
@end ifset
+@deffn {Directive} %define @var{define-variable}
+@deffnx {Directive} %define @var{define-variable} @var{value}
+Define a variable to adjust Bison's behavior.
+@xref{Decl Summary,,%define}.
+@end deffn
+
@deffn {Directive} %defines
Bison declaration to create a header file meant for the scanner.
@xref{Decl Summary}.
@end deffn
+@deffn {Directive} %defines @var{defines-file}
+Same as above, but save in the file @var{defines-file}.
+@xref{Decl Summary}.
+@end deffn
+
@deffn {Directive} %destructor
Specify how the parser should reclaim the memory associated to
discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
when @code{yyerror} is called.
@end deffn
-@deffn {Directive} %file-prefix="@var{prefix}"
+@deffn {Directive} %file-prefix "@var{prefix}"
Bison declaration to set the prefix of the output files. @xref{Decl
Summary}.
@end deffn
Run user code before parsing. @xref{Initial Action Decl, , Performing Actions before Parsing}.
@end deffn
+@deffn {Directive} %language
+Specify the programming language for the generated parser.
+@xref{Decl Summary}.
+@end deffn
+
@deffn {Directive} %left
Bison declaration to assign left associativity to token(s).
@xref{Precedence Decl, ,Operator Precedence}.
@xref{GLR Parsers, ,Writing @acronym{GLR} Parsers}.
@end deffn
-@deffn {Directive} %name-prefix="@var{prefix}"
+@deffn {Directive} %name-prefix "@var{prefix}"
Bison declaration to rename the external symbols. @xref{Decl Summary}.
@end deffn
@xref{Precedence Decl, ,Operator Precedence}.
@end deffn
-@deffn {Directive} %output="@var{file}"
+@deffn {Directive} %output "@var{file}"
Bison declaration to set the name of the parser file. @xref{Decl
Summary}.
@end deffn
@xref{Precedence Decl, ,Operator Precedence}.
@end deffn
+@deffn {Directive} %skeleton
+Specify the skeleton to use; usually for development.
+@xref{Decl Summary}.
+@end deffn
+
@deffn {Directive} %start
Bison declaration to specify the start symbol. @xref{Start Decl, ,The
Start-Symbol}.
making @code{yyparse} return 1 immediately. The error reporting
function @code{yyerror} is not called. @xref{Parser Function, ,The
Parser Function @code{yyparse}}.
+
+For Java parsers, this functionality is invoked using @code{return YYABORT;}
+instead.
@end deffn
@deffn {Macro} YYACCEPT
Macro to pretend that a complete utterance of the language has been
read, by making @code{yyparse} return 0 immediately.
@xref{Parser Function, ,The Parser Function @code{yyparse}}.
+
+For Java parsers, this functionality is invoked using @code{return YYACCEPT;}
+instead.
@end deffn
@deffn {Macro} YYBACKUP
@code{yyerror} and then perform normal error recovery if possible
(@pxref{Error Recovery}), or (if recovery is impossible) make
@code{yyparse} return 1. @xref{Error Recovery}.
+
+For Java parsers, this functionality is invoked using @code{return YYERROR;}
+instead.
@end deffn
@deffn {Function} yyerror
@c LocalWords: pre STDC GNUC endif yy YY alloca lf stddef stdlib YYDEBUG
@c LocalWords: NUM exp subsubsection kbd Ctrl ctype EOF getchar isdigit
@c LocalWords: ungetc stdin scanf sc calc ulator ls lm cc NEG prec yyerrok
-@c LocalWords: longjmp fprintf stderr preg yylloc YYLTYPE cos ln
+@c LocalWords: longjmp fprintf stderr yylloc YYLTYPE cos ln
@c LocalWords: smallexample symrec val tptr FNCT fnctptr func struct sym
@c LocalWords: fnct putsym getsym fname arith fncts atan ptr malloc sizeof
@c LocalWords: strlen strcpy fctn strcmp isalpha symbuf realloc isalnum
@c LocalWords: ptypes itype YYPRINT trigraphs yytname expseq vindex dtype
-@c LocalWords: Rhs YYRHSLOC LE nonassoc op deffn typeless typefull yynerrs
+@c LocalWords: Rhs YYRHSLOC LE nonassoc op deffn typeless yynerrs
@c LocalWords: yychar yydebug msg YYNTOKENS YYNNTS YYNRULES YYNSTATES
@c LocalWords: cparse clex deftypefun NE defmac YYACCEPT YYABORT param
@c LocalWords: strncmp intval tindex lvalp locp llocp typealt YYBACKUP
@c LocalWords: YYEMPTY YYEOF YYRECOVERING yyclearin GE def UMINUS maybeword
@c LocalWords: Johnstone Shamsa Sadaf Hussain Tomita TR uref YYMAXDEPTH
-@c LocalWords: YYINITDEPTH stmnts ref stmnt initdcl maybeasm VCG notype
+@c LocalWords: YYINITDEPTH stmnts ref stmnt initdcl maybeasm notype
@c LocalWords: hexflag STR exdent itemset asis DYYDEBUG YYFPRINTF args
-@c LocalWords: YYPRINTF infile ypp yxx outfile itemx vcg tex leaderfill
+@c LocalWords: infile ypp yxx outfile itemx tex leaderfill
@c LocalWords: hbox hss hfill tt ly yyin fopen fclose ofirst gcc ll
-@c LocalWords: yyrestart nbar yytext fst snd osplit ntwo strdup AST
+@c LocalWords: nbar yytext fst snd osplit ntwo strdup AST
@c LocalWords: YYSTACK DVI fdl printindex