X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/bb32f4f284d5e86ad24f0684e16e19243c1957f2..31984206a710101776d1db64009aff6c7962551c:/doc/bison.texinfo diff --git a/doc/bison.texinfo b/doc/bison.texinfo index 0a9b83d4..a706a96d 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -104,7 +104,7 @@ Reference sections: messy for Bison to handle straightforwardly. * Debugging:: Understanding or debugging Bison parsers. * Invocation:: How to run Bison (to produce the parser source file). -* C++ Language Interface:: Creating C++ parser objects. +* Other Languages:: Creating C++ and Java parsers. * FAQ:: Frequently Asked Questions * Table of Symbols:: All the keywords of the Bison language are explained. * Glossary:: Basic concepts are explained. @@ -224,6 +224,7 @@ Bison Declarations * Expect Decl:: Suppressing warnings about parsing conflicts. * Start Decl:: Specifying the start symbol. * Pure Decl:: Requesting a reentrant parser. +* Push Decl:: Requesting a push parser. * Decl Summary:: Table of all Bison declarations. Parser C-Language Interface @@ -285,10 +286,10 @@ Invoking Bison * Option Cross Key:: Alphabetical list of long options. * Yacc Library:: Yacc-compatible @code{yylex} and @code{main}. 
-C++ Language Interface +Parsers Written In Other Languages * C++ Parsers:: The interface to generate C++ parser classes -* A Complete C++ Example:: Demonstrating their use +* Java Parsers:: The interface to generate Java parser classes C++ Parsers @@ -297,6 +298,7 @@ C++ Parsers * C++ Location Values:: The position and location classes * C++ Parser Interface:: Instantiating and running the parser * C++ Scanner Interface:: Exchanges between yylex and parse +* A Complete C++ Example:: Demonstrating their use A Complete C++ Example @@ -306,6 +308,15 @@ A Complete C++ Example * Calc++ Scanner:: A pure C++ Flex scanner * Calc++ Top Level:: Conducting the band +Java Parsers + +* Java Bison Interface:: Asking for Java parser generation +* Java Semantic Values:: %type and %token vs. Java +* Java Location Values:: The position and location classes +* Java Parser Interface:: Instantiating and running the parser +* Java Scanner Interface:: Java scanners, and pure parsers +* Java Differences:: Differences between C/C++ and Java Grammars + Frequently Asked Questions * Memory Exhausted:: Breaking the Stack Limits @@ -323,7 +334,7 @@ Frequently Asked Questions Copying This Manual -* GNU Free Documentation License:: License for copying this manual. +* Copying This Manual:: License for copying this manual. @end detailmenu @end menu @@ -391,7 +402,9 @@ inspecting the file for text beginning with ``As a special exception@dots{}''. The text spells out the exact terms of the exception. -@include gpl.texi +@node Copying +@unnumbered GNU GENERAL PUBLIC LICENSE +@include gpl-3.0.texi @node Concepts @chapter The Concepts of Bison @@ -2694,8 +2707,8 @@ As an alternative, Bison provides a %code directive with an explicit qualifier field, which identifies the purpose of the code and thus the location(s) where Bison should generate it. For C/C++, the qualifier can be omitted for the default location, or it can be -@code{requires}, @code{provides}, or @code{top}. 
-@xref{Table of Symbols,,Bison Symbols}. +one of @code{requires}, @code{provides}, @code{top}. +@xref{Decl Summary,,%code}. Look again at the example of the previous section: @@ -2793,8 +2806,8 @@ parser source code file. The first line after the warning is required by @code{YYSTYPE} and thus also needs to appear in the parser source code file. However, if you've instructed Bison to generate a parser header file -(@pxref{Table of Symbols, ,%defines}), you probably want that line to appear -before the @code{YYSTYPE} definition in that header file as well. +(@pxref{Decl Summary, ,%defines}), you probably want that line to appear before +the @code{YYSTYPE} definition in that header file as well. The @code{YYLTYPE} definition should also appear in the parser header file to override the default @code{YYLTYPE} definition there. @@ -2946,6 +2959,8 @@ type: You could even place each of the above directive groups in the rules section of the grammar file next to the set of rules that uses the associated semantic type. +(In the rules section, you must terminate each of those directives with a +semicolon.) And you don't have to worry that some directive (like a @code{%union}) in the definitions section is going to adversely affect their functionality in some counter-intuitive manner just because it comes first. @@ -3966,6 +3981,7 @@ Grammars}). * Expect Decl:: Suppressing warnings about parsing conflicts. * Start Decl:: Specifying the start symbol. * Pure Decl:: Requesting a reentrant parser. +* Push Decl:: Requesting a push parser. * Decl Summary:: Table of all Bison declarations. @end menu @@ -4386,7 +4402,7 @@ The parser can @dfn{return immediately} because of an explicit call to @code{YYABORT} or @code{YYACCEPT}, or failed error recovery, or memory exhaustion. -Right-hand size symbols of a rule that explicitly triggers a syntax +Right-hand side symbols of a rule that explicitly triggers a syntax error via @code{YYERROR} are not discarded automatically. 
As a rule of thumb, destructors are invoked only when user actions cannot manage the memory. @@ -4498,8 +4514,9 @@ The result is that the communication variables @code{yylval} and @code{yylloc} become local variables in @code{yyparse}, and a different calling convention is used for the lexical analyzer function @code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure -Parsers}, for the details of this. The variable @code{yynerrs} also -becomes local in @code{yyparse} (@pxref{Error Reporting, ,The Error +Parsers}, for the details of this. The variable @code{yynerrs} +becomes local in @code{yyparse} in pull mode but it becomes a member +of yypstate in push mode. (@pxref{Error Reporting, ,The Error Reporting Function @code{yyerror}}). The convention for calling @code{yyparse} itself is unchanged. @@ -4507,6 +4524,113 @@ Whether the parser is pure has nothing to do with the grammar rules. You can generate either a pure parser or a nonreentrant parser from any valid grammar. +@node Push Decl +@subsection A Push Parser +@cindex push parser +@cindex push parser +@findex %define push_pull + +A pull parser is called once and it takes control until all its input +is completely parsed. A push parser, on the other hand, is called +each time a new token is made available. + +A push parser is typically useful when the parser is part of a +main event loop in the client's application. This is typically +a requirement of a GUI, when the main event loop needs to be triggered +within a certain time period. + +Normally, Bison generates a pull parser. +The following Bison declaration says that you want the parser to be a push +parser (@pxref{Decl Summary,,%define push_pull}): + +@example +%define push_pull "push" +@end example + +In almost all cases, you want to ensure that your push parser is also +a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). 
The only +time you should create an impure push parser is to have backwards +compatibility with the impure Yacc pull mode interface. Unless you know +what you are doing, your declarations should look like this: + +@example +%pure-parser +%define push_pull "push" +@end example + +There is one major functional difference between the pure push parser +and the impure push parser. It is acceptable for a pure push parser to have +many parser instances, of the same type of parser, in memory at the same time. +An impure push parser should only use one parser at a time. + +When a push parser is selected, Bison will generate some new symbols in +the generated parser. @code{yypstate} is a structure that the generated +parser uses to store the parser's state. @code{yypstate_new} is the +function that will create a new parser instance. @code{yypstate_delete} +will free the resources associated with the corresponding parser instance. +Finally, @code{yypush_parse} is the function that should be called whenever a +token is available to provide to the parser. A trivial example +of using a pure push parser would look like this: + +@example +int status; +yypstate *ps = yypstate_new (); +do @{ + status = yypush_parse (ps, yylex (), NULL); +@} while (status == YYPUSH_MORE); +yypstate_delete (ps); +@end example + +If the user decides to use an impure push parser, a few things about +the generated parser will change. The @code{yychar} variable becomes +a global variable instead of a variable in the @code{yypush_parse} function. +For this reason, the signature of the @code{yypush_parse} function is +changed to remove the token as a parameter. A nonreentrant push parser +example would thus look like this: + +@example +extern int yychar; +int status; +yypstate *ps = yypstate_new (); +do @{ + yychar = yylex (); + status = yypush_parse (ps); +@} while (status == YYPUSH_MORE); +yypstate_delete (ps); +@end example + +That's it. 
Notice the next token is put into the global variable @code{yychar} +for use by the next invocation of the @code{yypush_parse} function. + +Bison also supports both the push parser interface and the pull parser +interface in the same generated parser. In order to get this functionality, +you should replace the @code{%define push_pull "push"} declaration with the +@code{%define push_pull "both"} declaration. Doing this will create all of the +symbols mentioned earlier along with the two extra symbols, @code{yyparse} +and @code{yypull_parse}. @code{yyparse} can be used exactly as it normally +would be used. However, the user should note that it is implemented in the +generated parser by calling @code{yypull_parse}. +This makes the @code{yyparse} function that is generated with the +@code{%define push_pull "both"} declaration slower than the normal +@code{yyparse} function. If the user +calls the @code{yypull_parse} function it will parse the rest of the input +stream. It is possible to @code{yypush_parse} tokens to select a subgrammar +and then @code{yypull_parse} the rest of the input stream. If you would like +to switch back and forth between parsing styles, you would have to +write your own @code{yypull_parse} function that knows when to quit looking +for input. An example of using the @code{yypull_parse} function would look +like this: + +@example +yypstate *ps = yypstate_new (); +yypull_parse (ps); /* Will call the lexer */ +yypstate_delete (ps); +@end example + +Adding the @code{%pure-parser} declaration does exactly the same thing to the +generated parser with @code{%define push_pull "both"} as it did for +@code{%define push_pull "push"}. 
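The push-style control flow described in this section can be sketched without a generated parser. The following is a minimal, hand-written illustration and not Bison output: the names @code{toy_pstate}, @code{toy_pstate_new}, @code{toy_pstate_delete}, @code{toy_push_parse}, @code{toy_parse_string}, and the @code{TOY_*} status codes are all hypothetical stand-ins, and the numeric status values are arbitrary. It assumes a trivial "grammar" of balanced parentheses, so that the driver loop has the same shape as the @code{do}/@code{while} loop shown for the pure push parser.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the symbols a push parser would provide.
   This is a hand-written toy, NOT code generated by Bison, and the
   status values below are arbitrary.  */
enum { TOY_ACCEPT = 0, TOY_ERROR = 1, TOY_PUSH_MORE = 4 };

/* All parser state lives in this structure, so several instances
   can coexist, as with a pure push parser.  */
typedef struct toy_pstate { int depth; } toy_pstate;

toy_pstate *toy_pstate_new (void)
{
  toy_pstate *ps = malloc (sizeof *ps);
  if (ps)
    ps->depth = 0;
  return ps;                    /* NULL if no memory was available */
}

void toy_pstate_delete (toy_pstate *ps)
{
  free (ps);
}

/* Consume exactly one token: '(' or ')' or 0 for end of input.
   Returns TOY_PUSH_MORE while more input is required.  */
int toy_push_parse (toy_pstate *ps, int token)
{
  switch (token)
    {
    case '(':
      ps->depth++;
      return TOY_PUSH_MORE;
    case ')':
      if (ps->depth == 0)
        return TOY_ERROR;       /* closing with nothing open */
      ps->depth--;
      return TOY_PUSH_MORE;
    case 0:
      return ps->depth == 0 ? TOY_ACCEPT : TOY_ERROR;
    default:
      return TOY_ERROR;         /* unknown token */
    }
}

/* Driver: the caller owns the loop and hands over one token per call.
   Here each character of the string plays the role of a token.  */
int toy_parse_string (const char *input)
{
  toy_pstate *ps = toy_pstate_new ();
  int status;
  if (!ps)
    return TOY_ERROR;
  do
    status = toy_push_parse (ps, (unsigned char) *input++);
  while (status == TOY_PUSH_MORE);
  toy_pstate_delete (ps);
  return status;
}
```

In an event-driven application, the body of the loop in @code{toy_parse_string} would presumably be called from the event handler each time a token arrives, rather than from a tight loop over a string.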
+ @node Decl Summary @subsection Bison Declaration Summary @cindex Bison declaration summary @@ -4569,12 +4693,261 @@ Declare the expected number of shift-reduce conflicts In order to change the behavior of @command{bison}, use the following directives: +@deffn {Directive} %code @{@var{code}@} +@findex %code +This is the unqualified form of the @code{%code} directive. +It inserts @var{code} verbatim at a language-dependent default location in the +output@footnote{The default location is actually skeleton-dependent; + writers of non-standard skeletons however should choose the default location + consistently with the behavior of the standard Bison skeletons.}. + +@cindex Prologue +For C/C++, the default location is the parser source code +file after the usual contents of the parser header file. +Thus, @code{%code} replaces the traditional Yacc prologue, +@code{%@{@var{code}%@}}, for most purposes. +For a detailed discussion, see @ref{Prologue Alternatives}. + +For Java, the default location is inside the parser class. + +(Like all the Yacc prologue alternatives, this directive is experimental. +More user feedback will help to determine whether it should become a permanent +feature.) +@end deffn + +@deffn {Directive} %code @var{qualifier} @{@var{code}@} +This is the qualified form of the @code{%code} directive. +If you need to specify location-sensitive verbatim @var{code} that does not +belong at the default location selected by the unqualified @code{%code} form, +use this form instead. + +@var{qualifier} identifies the purpose of @var{code} and thus the location(s) +where Bison should generate it. +Not all values of @var{qualifier} are available for all target languages: + +@itemize @bullet +@item requires +@findex %code requires + +@itemize @bullet +@item Language(s): C, C++ + +@item Purpose: This is the best place to write dependency code required for +@code{YYSTYPE} and @code{YYLTYPE}. 
+In other words, it's the best place to define types referenced in @code{%union} +directives, and it's the best place to override Bison's default @code{YYSTYPE} +and @code{YYLTYPE} definitions. + +@item Location(s): The parser header file and the parser source code file +before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE} definitions. +@end itemize + +@item provides +@findex %code provides + +@itemize @bullet +@item Language(s): C, C++ + +@item Purpose: This is the best place to write additional definitions and +declarations that should be provided to other modules. + +@item Location(s): The parser header file and the parser source code file after +the Bison-generated @code{YYSTYPE}, @code{YYLTYPE}, and token definitions. +@end itemize + +@item top +@findex %code top + +@itemize @bullet +@item Language(s): C, C++ + +@item Purpose: The unqualified @code{%code} or @code{%code requires} should +usually be more appropriate than @code{%code top}. +However, occasionally it is necessary to insert code much nearer the top of the +parser source code file. +For example: + +@smallexample +%code top @{ + #define _GNU_SOURCE + #include +@} +@end smallexample + +@item Location(s): Near the top of the parser source code file. +@end itemize + +@item imports +@findex %code imports + +@itemize @bullet +@item Language(s): Java + +@item Purpose: This is the best place to write Java import directives. + +@item Location(s): The parser Java file after any Java package directive and +before any class definitions. +@end itemize +@end itemize + +(Like all the Yacc prologue alternatives, this directive is experimental. +More user feedback will help to determine whether it should become a permanent +feature.) + +@cindex Prologue +For a detailed discussion of how to use @code{%code} in place of the +traditional Yacc prologue for C/C++, see @ref{Prologue Alternatives}. 
+@end deffn + @deffn {Directive} %debug In the parser file, define the macro @code{YYDEBUG} to 1 if it is not already defined, so that the debugging facilities are compiled. @end deffn @xref{Tracing, ,Tracing Your Parser}. +@deffn {Directive} %define @var{variable} +@deffnx {Directive} %define @var{variable} "@var{value}" +Define a variable to adjust Bison's behavior. +The possible choices for @var{variable}, as well as their meanings, depend on +the selected target language and/or the parser skeleton (@pxref{Decl +Summary,,%language}). + +Bison will warn if a @var{variable} is defined multiple times. + +Omitting @code{"@var{value}"} is always equivalent to specifying it as +@code{""}. + +Some @var{variable}s may be used as Booleans. +In this case, Bison will complain if the variable definition does not meet one +of the following four conditions: + +@enumerate +@item @code{"@var{value}"} is @code{"true"} + +@item @code{"@var{value}"} is omitted (or is @code{""}). +This is equivalent to @code{"true"}. + +@item @code{"@var{value}"} is @code{"false"}. + +@item @var{variable} is never defined. +In this case, Bison selects a default value, which may depend on the selected +target language and/or parser skeleton. +@end enumerate + +Some of the accepted @var{variable}s are: + +@itemize @bullet +@item push_pull +@findex %define push_pull + +@itemize @bullet +@item Language(s): C (LALR(1) only) + +@item Purpose: Requests a pull parser, a push parser, or both. +@xref{Push Decl, ,A Push Parser}. + +@item Accepted Values: @code{"pull"}, @code{"push"}, @code{"both"} + +@item Default Value: @code{"pull"} +@end itemize + +@item lr.keep_unreachable_states +@findex %define lr.keep_unreachable_states + +@itemize @bullet +@item Language(s): all + +@item Purpose: Requests that Bison allow unreachable parser states to remain in +the parser tables. +Bison considers a state to be unreachable if there exists no sequence of +transitions from the start state to that state. 
+A state can become unreachable during conflict resolution if Bison disables a +shift action leading to it from a predecessor state. +Keeping unreachable states is sometimes useful for analysis purposes, but they +are useless in the generated parser. + +@item Accepted Values: Boolean + +@item Default Value: @code{"false"} + +@item Caveats: + +@itemize @bullet +@item Unreachable states may contain conflicts and may reduce rules not +reduced in any other state. +Thus, keeping unreachable states may induce warnings that are irrelevant to +your parser's behavior, and it may eliminate warnings that are relevant. +Of course, the change in warnings may actually be relevant to a parser table +analysis that wants to keep unreachable states, so this behavior will likely +remain in future Bison releases. + +@item While Bison is able to remove unreachable states, it is not guaranteed to +remove other kinds of useless states. +Specifically, when Bison disables reduce actions during conflict resolution, +some goto actions may become useless, and thus some additional states may +become useless. +If Bison were to compute which goto actions were useless and then disable those +actions, it could identify such states as unreachable and then remove those +states. +However, Bison does not compute which goto actions are useless. +@end itemize +@end itemize + +@item namespace +@findex %define namespace + +@itemize +@item Language(s): C++ + +@item Purpose: Specifies the namespace for the parser class. 
+For example, if you specify: + +@smallexample +%define namespace "foo::bar" +@end smallexample + +Bison uses @code{foo::bar} verbatim in references such as: + +@smallexample +foo::bar::parser::semantic_type +@end smallexample + +However, to open a namespace, Bison removes any leading @code{::} and then +splits on any remaining occurrences: + +@smallexample +namespace foo @{ namespace bar @{ + class position; + class location; +@} @} +@end smallexample + +@item Accepted Values: Any absolute or relative C++ namespace reference without +a trailing @code{"::"}. +For example, @code{"foo"} or @code{"::foo::bar"}. + +@item Default Value: The value specified by @code{%name-prefix}, which defaults +to @code{yy}. +This usage of @code{%name-prefix} is for backward compatibility and can be +confusing since @code{%name-prefix} also specifies the textual prefix for the +lexical analyzer function. +Thus, if you specify @code{%name-prefix}, it is best to also specify +@code{%define namespace} so that @code{%name-prefix} @emph{only} affects the +lexical analyzer function. +For example, if you specify: + +@smallexample +%define namespace "foo" +%name-prefix "bar::" +@end smallexample + +The parser namespace is @code{foo} and @code{yylex} is referenced as +@code{bar::lex}. +@end itemize +@end itemize + +@end deffn + @deffn {Directive} %defines Write a header file containing macro definitions for the token type names defined in the grammar as well as a few other declarations. @@ -4612,7 +4985,7 @@ Tokens}. @findex %code provides If you have declared @code{%code requires} or @code{%code provides}, the output header also contains their code. -@xref{Table of Symbols, ,%code}. +@xref{Decl Summary, ,%code}. 
@end deffn @deffn {Directive} %defines @var{defines-file} @@ -4649,10 +5022,13 @@ Rename the external symbols used in the parser so that they start with in C parsers is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yychar}, @code{yydebug}, and -(if locations are used) @code{yylloc}. For example, if you use -@samp{%name-prefix "c_"}, the names become @code{c_parse}, @code{c_lex}, -and so on. In C++ parsers, it is only the surrounding namespace which is -named @var{prefix} instead of @samp{yy}. +(if locations are used) @code{yylloc}. If you use a push parser, +@code{yypush_parse}, @code{yypull_parse}, @code{yypstate}, +@code{yypstate_new} and @code{yypstate_delete} will +also be renamed. For example, if you use @samp{%name-prefix "c_"}, the +names become @code{c_parse}, @code{c_lex}, and so on. +For C++ parsers, see the @code{%define namespace} documentation in this +section. @xref{Multiple Parsers, ,Multiple Parsers in the Same Program}. @end deffn @@ -4664,16 +5040,6 @@ Precedence}). @end deffn @end ifset -@deffn {Directive} %no-parser -Do not include any C code in the parser file; generate tables only. The -parser file contains just @code{#define} directives and static variable -declarations. - -This option also tells Bison to write the C code for the grammar actions -into a file named @file{@var{file}.act}, in the form of a -brace-surrounded body fit for a @code{switch} statement. -@end deffn - @deffn {Directive} %no-lines Don't generate any @code{#line} preprocessor commands in the parser file. Ordinarily Bison writes these commands in the parser file so that @@ -4698,11 +5064,18 @@ Require a Version of Bison}. @end deffn @deffn {Directive} %skeleton "@var{file}" -Specify the skeleton to use. 
You probably don't need this option unless -you are developing Bison; you should use @code{%language} if you want to -specify the skeleton for a different language, because it is clearer and -because it will always choose the correct skeleton for non-deterministic -or push parsers. +Specify the skeleton to use. + +You probably don't need this option unless you are developing Bison. +You should use @code{%language} if you want to specify the skeleton for a +different language, because it is clearer and because it will always choose the +correct skeleton for non-deterministic or push parsers. + +If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton +file in the Bison installation directory. +If it does, @var{file} is an absolute file name or a file name relative to the +directory of the grammar file. +This is similar to how most shells resolve commands. @end deffn @deffn {Directive} %token-table @@ -4768,8 +5141,11 @@ names that do not conflict. The precise list of symbols renamed is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yylloc}, -@code{yychar} and @code{yydebug}. For example, if you use @samp{-p c}, -the names become @code{cparse}, @code{clex}, and so on. +@code{yychar} and @code{yydebug}. If you use a push parser, +@code{yypush_parse}, @code{yypull_parse}, @code{yypstate}, +@code{yypstate_new} and @code{yypstate_delete} will also be renamed. +For example, if you use @samp{-p c}, the names become @code{cparse}, +@code{clex}, and so on. @strong{All the other variables and macros associated with Bison are not renamed.} These others are not global; there is no conflict if the same @@ -4798,6 +5174,12 @@ in the grammar file, you are likely to run into trouble. @menu * Parser Function:: How to call @code{yyparse} and what it returns. +* Push Parser Function:: How to call @code{yypush_parse} and what it returns. +* Pull Parser Function:: How to call @code{yypull_parse} and what it returns. 
+* Parser Create Function:: How to call @code{yypstate_new} and what it + returns. +* Parser Delete Function:: How to call @code{yypstate_delete} and what it + returns. * Lexical:: You must supply a function @code{yylex} which reads tokens. * Error Reporting:: You must supply a function @code{yyerror}. @@ -4880,6 +5262,61 @@ In the grammar actions, use expressions like this to refer to the data: exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @} @end example +@node Push Parser Function +@section The Push Parser Function @code{yypush_parse} +@findex yypush_parse + +You call the function @code{yypush_parse} to parse a single token. This +function is available if either the @code{%define push_pull "push"} or +@code{%define push_pull "both"} declaration is used. +@xref{Push Decl, ,A Push Parser}. + +@deftypefun int yypush_parse (yypstate *yyps) +The value returned by @code{yypush_parse} is the same as for @code{yyparse}, with +one exception: @code{yypush_parse} will return @code{YYPUSH_MORE} if more input +is required to finish parsing the grammar. +@end deftypefun + +@node Pull Parser Function +@section The Pull Parser Function @code{yypull_parse} +@findex yypull_parse + +You call the function @code{yypull_parse} to parse the rest of the input +stream. This function is available if the @code{%define push_pull "both"} +declaration is used. +@xref{Push Decl, ,A Push Parser}. + +@deftypefun int yypull_parse (yypstate *yyps) +The value returned by @code{yypull_parse} is the same as for @code{yyparse}. +@end deftypefun + +@node Parser Create Function +@section The Parser Create Function @code{yypstate_new} +@findex yypstate_new + +You call the function @code{yypstate_new} to create a new parser instance. +This function is available if either the @code{%define push_pull "push"} or +@code{%define push_pull "both"} declaration is used. +@xref{Push Decl, ,A Push Parser}. 
+ +@deftypefun yypstate *yypstate_new (void) +The function will return a valid parser instance if there was memory available +or @code{NULL} if no memory was available. +@end deftypefun + +@node Parser Delete Function +@section The Parser Delete Function @code{yypstate_delete} +@findex yypstate_delete + +You call the function @code{yypstate_delete} to delete a parser instance. +This function is available if either the @code{%define push_pull "push"} or +@code{%define push_pull "both"} declaration is used. +@xref{Push Decl, ,A Push Parser}. + +@deftypefun void yypstate_delete (yypstate *yyps) +This function will reclaim the memory associated with a parser instance. +After this call, you should no longer attempt to use the parser instance. +@end deftypefun @node Lexical @section The Lexical Analyzer Function @code{yylex} @@ -6992,7 +7429,7 @@ with some set of possible lookahead tokens. When run with @example state 8 - exp -> exp . '+' exp [$, '+', '-', '/'] (rule 1) + exp -> exp . '+' exp (rule 1) exp -> exp '+' exp . [$, '+', '-', '/'] (rule 1) exp -> exp . '-' exp (rule 2) exp -> exp . '*' exp (rule 3) @@ -7101,7 +7538,7 @@ always possible. The trace facility outputs messages with macro calls of the form @code{YYFPRINTF (stderr, @var{format}, @var{args})} where -@var{format} and @var{args} are the usual @code{printf} format and +@var{format} and @var{args} are the usual @code{printf} format and variadic arguments. If you define @code{YYDEBUG} to a nonzero value but do not define @code{YYFPRINTF}, @code{<stdio.h>} is automatically included and @code{YYFPRINTF} is defined to @code{fprintf}. @@ -7254,6 +7691,9 @@ Print the version number of Bison and exit. @item --print-localedir Print the name of the directory containing locale-dependent data. +@item --print-datadir +Print the name of the directory containing skeletons and XSLT. + @item -y @itemx --yacc Act more like the traditional Yacc command. 
This can cause @@ -7313,20 +7753,22 @@ and debuggers will associate errors with your source file, the grammar file. This option causes them to associate errors with the parser file, treating it as an independent source file in its own right. -@item -n -@itemx --no-parser -Pretend that @code{%no-parser} was specified. @xref{Decl Summary}. - @item -S @var{file} @itemx --skeleton=@var{file} -Specify the skeleton to use, as if @code{%skeleton} was specified +Specify the skeleton to use, similar to @code{%skeleton} (@pxref{Decl Summary, , Bison Declaration Summary}). -You probably don't need this option unless you are developing Bison; -you should use @option{--language} if you want to specify the skeleton for a +You probably don't need this option unless you are developing Bison. +You should use @option{--language} if you want to specify the skeleton for a different language, because it is clearer and because it will always choose the correct skeleton for non-deterministic or push parsers. +If @var{file} does not contain a @code{/}, @var{file} is the name of a skeleton +file in the Bison installation directory. +If it does, @var{file} is an absolute file name or a file name relative to the +current working directory. +This is similar to how most shells resolve commands. + @item -k @itemx --token-table Pretend that @code{%token-table} was specified. @xref{Decl Summary}. @@ -7411,9 +7853,9 @@ the corresponding short option. 
@item @option{--help} @tab @option{-h} @item @option{--name-prefix=@var{prefix}} @tab @option{-p @var{name-prefix}} @item @option{--no-lines} @tab @option{-l} -@item @option{--no-parser} @tab @option{-n} @item @option{--output=@var{outfile}} @tab @option{-o @var{outfile}} @item @option{--print-localedir} @tab +@item @option{--print-datadir} @tab @item @option{--token-table} @tab @option{-k} @item @option{--verbose} @tab @option{-v} @item @option{--version} @tab @option{-V} @@ -7448,12 +7890,12 @@ int yyparse (void); @c ================================================= C++ Bison -@node C++ Language Interface -@chapter C++ Language Interface +@node Other Languages +@chapter Parsers Written In Other Languages @menu * C++ Parsers:: The interface to generate C++ parser classes -* A Complete C++ Example:: Demonstrating their use +* Java Parsers:: The interface to generate Java parser classes @end menu @node C++ Parsers @@ -7465,6 +7907,7 @@ int yyparse (void); * C++ Location Values:: The position and location classes * C++ Parser Interface:: Instantiating and running the parser * C++ Scanner Interface:: Exchanges between yylex and parse +* A Complete C++ Example:: Demonstrating their use @end menu @node C++ Bison Interface @@ -7478,10 +7921,12 @@ The C++ @acronym{LALR}(1) parser is selected using the language directive, @option{--language=c++}. @xref{Decl Summary}. -When run, @command{bison} will create several -entities in the @samp{yy} namespace. Use the @samp{%name-prefix} -directive to change the namespace name, see @ref{Decl Summary}. The -various classes are generated in the following files: +When run, @command{bison} will create several entities in the @samp{yy} +namespace. +@findex %define namespace +Use the @samp{%define namespace} directive to change the namespace name, see +@ref{Decl Summary}. +The various classes are generated in the following files: @table @file @item position.hh @@ -7673,7 +8118,7 @@ value and location being @var{yylval} and @var{yylloc}. 
Invocations of @node A Complete C++ Example -@section A Complete C++ Example +@subsection A Complete C++ Example This section demonstrates the use of a C++ parser with a simple but complete example. This example should be available on your system, @@ -7693,7 +8138,7 @@ actually easier to interface with. @end menu @node Calc++ --- C++ Calculator -@subsection Calc++ --- C++ Calculator +@subsubsection Calc++ --- C++ Calculator Of course the grammar is dedicated to arithmetics, a single expression, possibly preceded by variable assignments. An @@ -7708,7 +8153,7 @@ seven * seven @end example @node Calc++ Parsing Driver -@subsection Calc++ Parsing Driver +@subsubsection Calc++ Parsing Driver @c - An env @c - A place to store error messages @c - A place for the result @@ -7857,7 +8302,7 @@ calcxx_driver::error (const std::string& m) @end example @node Calc++ Parser -@subsection Calc++ Parser +@subsubsection Calc++ Parser The parser definition file @file{calc++-parser.yy} starts by asking for the C++ LALR(1) skeleton, the creation of the parser header file, and @@ -7881,7 +8326,7 @@ reciprocally, both cannot include the header of the other. Because the driver's header needs detailed knowledge about the parser class (in particular its inner types), it is the parser's header which will simply use a forward declaration of the driver. -@xref{Table of Symbols, ,%code}. +@xref{Decl Summary, ,%code}. @comment file: calc++-parser.yy @example @@ -7969,7 +8414,7 @@ avoid name clashes. 
%token ASSIGN ":=" %token IDENTIFIER "identifier" %token NUMBER "number" -%type exp "expression" +%type exp @end example @noindent @@ -7982,7 +8427,7 @@ To enable memory deallocation during error recovery, use %printer @{ debug_stream () << *$$; @} "identifier" %destructor @{ delete $$; @} "identifier" -%printer @{ debug_stream () << $$; @} "number" "expression" +%printer @{ debug_stream () << $$; @} @end example @noindent @@ -8027,7 +8472,7 @@ yy::calcxx_parser::error (const yy::calcxx_parser::location_type& l, @end example @node Calc++ Scanner -@subsection Calc++ Scanner +@subsubsection Calc++ Scanner The Flex scanner first includes the driver declaration, then the parser's to get the set of defined tokens. @@ -8153,7 +8598,7 @@ calcxx_driver::scan_end () @end example @node Calc++ Top Level -@subsection Calc++ Top Level +@subsubsection Calc++ Top Level The top level file, @file{calc++.cc}, poses no problem. @@ -8176,6 +8621,327 @@ main (int argc, char *argv[]) @} @end example +@node Java Parsers +@section Java Parsers + +@menu +* Java Bison Interface:: Asking for Java parser generation +* Java Semantic Values:: %type and %token vs. Java +* Java Location Values:: The position and location classes +* Java Parser Interface:: Instantiating and running the parser +* Java Scanner Interface:: Java scanners, and pure parsers +* Java Differences:: Differences between C/C++ and Java Grammars +@end menu + +@node Java Bison Interface +@subsection Java Bison Interface +@c - %language "Java" +@c - initial action + +The Java parser skeletons are selected using a language directive, +@samp{%language "Java"}, or the synonymous command-line option +@option{--language=java}. + +When run, @command{bison} will create several entities whose name +starts with @samp{YY}. Use the @samp{%name-prefix} directive to +change the prefix, see @ref{Decl Summary}; classes can be placed +in an arbitrary Java package using a @samp{%define package} section. 
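+
+As a rough sketch of how these directives might be combined at the top of
+a grammar file (the @samp{Calc} prefix and the package name
+@samp{org.example.calc} are illustrative assumptions, as is the exact
+quoting of the @samp{%define} variable):
+
+@example
+/* Hypothetical prologue for a Java parser; all names are placeholders.  */
+%language "Java"
+%name-prefix "Calc"
+%define package "org.example.calc"
+@end example
+
+@noindent
+With declarations like these, one would expect the generated class to be
+named @code{CalcParser} rather than @code{YYParser}, and to be declared
+in the @code{org.example.calc} package.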
+ +The parser class defines an inner class, @code{Location}, that is used +for location tracking. If the parser is pure, it also defines an +inner interface, @code{Lexer}; see~@ref{Java Scanner Interface} for the +meaning of pure parsers when the Java language is chosen. Other than +this inner class and interface, and the members described in~@ref{Java +Parser Interface}, all the other members and fields are preceded +by a @code{yy} prefix to avoid clashes with user code. + +No header file can be generated for Java parsers; you must not pass +@option{-d}/@option{--defines} to @command{bison}, nor use the +@samp{%defines} directive. + +By default, the @samp{YYParser} class has package visibility. A +declaration @samp{%define "public"} changes it to public visibility. +Remember that, according to the Java language specification, the name +of the @file{.java} file should match the name of the class in this +case. + +Similarly, a declaration @samp{%define "abstract"} will make your +class abstract. + +You can create documentation for generated parsers using Javadoc. + +@node Java Semantic Values +@subsection Java Semantic Values +@c - No %union, specify type in %type/%token. +@c - YYSTYPE +@c - Printer and destructor + +There is no @code{%union} directive in Java parsers. Instead, the +semantic values' types (class names) should be specified in the +@code{%type} or @code{%token} directive: + +@example +%type expr assignment_expr term factor +%type number +@end example + +By default, the semantic stack is declared to have @code{Object} members, +which means that the class types you specify can be of any class. +To improve the type safety of the parser, you can declare the common +superclass of all the semantic values using the @samp{%define} directive.
+For example, after the following declaration: + +@example +%define "stype" "ASTNode" +@end example + +@noindent +any @code{%type} or @code{%token} specifying a semantic type which +is not a subclass of @code{ASTNode} will cause a compile-time error. + +Types used in the directives may be qualified with a package name. +Primitive data types are accepted for Java version 1.5 or later. Note +that in this case the autoboxing feature of Java 1.5 will be used. + +Java parsers do not support @code{%destructor}, since the language +adopts garbage collection. The parser will try to hold references +to semantic values for as little time as needed. + +Java parsers do not support @code{%printer}, as @code{toString()} +can be used to print the semantic values. This however may change +(in a backwards-compatible way) in future versions of Bison. + + +@node Java Location Values +@subsection Java Location Values +@c - %locations +@c - class Position +@c - class Location + +When the directive @code{%locations} is used, the Java parser +supports location tracking, see @ref{Locations, , Locations Overview}. +An auxiliary user-defined class defines a @dfn{position}, a single point +in a file; Bison itself defines a class representing a @dfn{location}, +a range composed of a pair of positions (possibly spanning several +files). The location class is an inner class of the parser; its name +is @code{Location} by default, and may be changed using @code{%define +"location_type" "@var{class-name}"}. + +The location class treats the position as a completely opaque value. +By default, the class name is @code{Position}, but this can be changed +with @code{%define "position_type" "@var{class-name}"}. + + +@deftypemethod {Location} {Position} begin +@deftypemethodx {Location} {Position} end +The first, inclusive, position of the range, and the first beyond. +@end deftypemethod + +@deftypemethod {Location} {void} toString () +Prints the range represented by the location.
For this to work +properly, the position class should override the @code{equals} and +@code{toString} methods appropriately. +@end deftypemethod + + +@node Java Parser Interface +@subsection Java Parser Interface +@c - define parser_class_name +@c - Ctor +@c - parse, error, set_debug_level, debug_level, set_debug_stream, +@c debug_stream. +@c - Reporting errors + +The output file defines the parser class in the package optionally +indicated in the @code{%define package} section. The class name defaults +to @code{YYParser}. The @code{YY} prefix may be changed using +@samp{%name-prefix}; alternatively, you can use @samp{%define +"parser_class_name" "@var{name}"} to give a custom name to the class. +The interface of this class is detailed below. It can be extended using +the @code{%parse-param} directive; each occurrence of the directive will +add a field to the parser class, and an argument to its constructor. + +@deftypemethod {YYParser} {} YYParser (@var{type1} @var{arg1}, ...) +Build a new parser object. There are no arguments unless +@samp{%parse-param @{@var{type1} @var{arg1}@}} was used. +@end deftypemethod + +@deftypemethod {YYParser} {boolean} parse () +Run the syntactic analysis, and return @code{true} on success, +@code{false} otherwise. +@end deftypemethod + +@deftypemethod {YYParser} {boolean} recovering () +During the syntactic analysis, return @code{true} if recovering +from a syntax error. @xref{Error Recovery}. +@end deftypemethod + +@deftypemethod {YYParser} {java.io.PrintStream} getDebugStream () +@deftypemethodx {YYParser} {void} setDebugStream (java.io.PrintStream @var{o}) +Get or set the stream used for tracing the parsing. It defaults to +@code{System.err}. +@end deftypemethod + +@deftypemethod {YYParser} {int} getDebugLevel () +@deftypemethodx {YYParser} {void} setDebugLevel (int @var{l}) +Get or set the tracing level. Currently its value is either 0, no trace, +or nonzero, full tracing.
+@end deftypemethod + +@deftypemethod {YYParser} {void} error (Location @var{l}, String @var{m}) +The definition for this member function must be supplied by the user +in the same way as the scanner interface (@pxref{Java Scanner +Interface}); the parser uses it to report a parser error occurring at +@var{l}, described by @var{m}. +@end deftypemethod + + +@node Java Scanner Interface +@subsection Java Scanner Interface +@c - %code lexer +@c - %lex-param +@c - Lexer interface + +Contrary to C parsers, Java parsers do not use global variables; the +state of the parser is always local to an instance of the parser class. +Therefore, all Java parsers are ``pure'', and the @code{%pure-parser} +directive does not do anything when used in Java. + +The scanner always resides in a separate class from the parser. +Still, there are two possible ways to interface a Bison-generated Java +parser with a scanner: the scanner may reside in the same file as +the Bison grammar, or in a separate file. The interface +to the scanner is similar in the two cases. + +In the first case, where the scanner is in the same file as the grammar, the +scanner code has to be placed in @code{%code lexer} blocks. If you want +to pass parameters from the parser constructor to the scanner constructor, +specify them with @code{%lex-param}; they are passed before +@code{%parse-param}s to the constructor. + +In the second case, the scanner has to implement interface @code{Lexer}, +which is defined within the parser class (e.g., @code{YYParser.Lexer}). +The constructor of the parser object will then accept an object +implementing the interface; @code{%lex-param} is not used in this +case. + +In both cases, the scanner has to implement the following methods. + +@deftypemethod {Lexer} {void} yyerror (Location @var{l}, String @var{m}) +As explained in @ref{Java Parser Interface}, this method is defined +by the user to emit an error message.
The first parameter is omitted +if location tracking is not active. Its type can be changed using +@samp{%define "location_type" "@var{class-name}"}. +@end deftypemethod + +@deftypemethod {Lexer} {int} yylex (@var{type1} @var{arg1}, ...) +Return the next token. Its type is the return value; its semantic +value and location are saved and returned by the other methods in the +interface. Invocations of @samp{%lex-param @{@var{type1} +@var{arg1}@}} yield additional arguments. +@end deftypemethod + +@deftypemethod {Lexer} {Position} getStartPos () +@deftypemethodx {Lexer} {Position} getEndPos () +Return respectively the first position of the last token that +@code{yylex} returned, and the first position beyond it. These +methods are not needed unless location tracking is active. + +The return type can be changed using @samp{%define "position_type" +"@var{class-name}"}. +@end deftypemethod + +@deftypemethod {Lexer} {Object} getLVal () +Return the semantic value of the last token that @code{yylex} +returned. + +The return type can be changed using @samp{%define "stype" +"@var{class-name}"}. +@end deftypemethod + + +If @code{%pure-parser} is not specified, the lexer interface +resides in the same class (@code{YYParser}) as the Bison-generated +parser. The fields and methods that are provided to +this end are as follows. + +@deftypemethod {YYParser} {void} error (Location @var{l}, String @var{m}) +As explained in @ref{Java Parser Interface}, this method is defined +by the user to emit an error message. The first parameter is not used +unless location tracking is active. Its type can be changed using +@samp{%define "location_type" "@var{class-name}"}. +@end deftypemethod + +@deftypemethod {YYParser} {int} yylex (@var{type1} @var{arg1}, ...) +Return the next token. Its type is the return value; its semantic +value and location are saved into @code{yylval}, @code{yystartpos}, +@code{yyendpos}.
Invocations of @samp{%lex-param @{@var{type1} +@var{arg1}@}} yield additional arguments. +@end deftypemethod + +@deftypecv {Field} {YYParser} Position yystartpos +@deftypecvx {Field} {YYParser} Position yyendpos +Contain respectively the first position of the last token that @code{yylex} +returned, and the first position beyond it. These fields are not +needed unless location tracking is active. + +The field's type can be changed using @samp{%define "position_type" +"@var{class-name}"}. +@end deftypecv + +@deftypecv {Field} {YYParser} Object yylval +Contains the semantic value of the last token that @code{yylex} +returned. + +The field's type can be changed using @samp{%define "stype" +"@var{class-name}"}. +@end deftypecv + +@node Java Differences +@subsection Differences between C/C++ and Java Grammars + +The different structure of the Java language forces several differences +between C/C++ grammars and grammars designed for Java parsers. This +section summarizes these differences. + +@itemize +@item +Java lacks a preprocessor, so the @code{YYERROR}, @code{YYACCEPT}, +@code{YYABORT} symbols (@pxref{Table of Symbols}) obviously cannot be +macros. Instead, they should be preceded by @code{return} when they +appear in an action. The actual definition of these symbols is +opaque to the Bison grammar, and it might change in the future. The +only meaningful thing you can do with them is to return them. + +Note that of these three symbols, only @code{YYACCEPT} and +@code{YYABORT} will cause a return from the @code{yyparse} +method@footnote{Java parsers include the actions in a separate +method from @code{yyparse} in order to have an intuitive syntax that +corresponds to these C macros.}. + +@item +The prologue declarations have a different meaning than in C/C++ code. +@table @asis +@item @code{%code imports} +blocks are placed at the beginning of the Java source code. They may +include copyright notices.
For @code{package} declarations, it is +better to use @code{%define package} instead. + +@item unqualified @code{%code} +blocks are placed inside the parser class. + +@item @code{%code lexer} +blocks, if specified, should include the implementation of the +scanner. If there is no such block, the scanner can be any class +that implements the appropriate interface (@pxref{Java Scanner +Interface}). +@end table + +Other @code{%code} blocks are not supported in Java parsers. +The epilogue has the same meaning as in C/C++ code and it can +be used to define other classes used by the parser. +@end itemize + @c ================================================= FAQ @node FAQ @@ -8196,7 +8962,7 @@ are addressed. * I can't build Bison:: Troubleshooting * Where can I find help?:: Troubleshouting * Bug Reports:: Troublereporting -* Other Languages:: Parsers in Java and others +* More Languages:: Parsers in C++, Java, and so on * Beta Testing:: Experimenting development versions * Mailing Lists:: Meeting other Bison users @end menu @@ -8519,15 +9285,15 @@ send a bug report just because you can not provide a fix. Send bug reports to @email{bug-bison@@gnu.org}. -@node Other Languages -@section Other Languages +@node More Languages +@section More Languages @display -Will Bison ever have C++ support? How about Java or @var{insert your +Will Bison ever have C++ and Java support? How about @var{insert your favorite language here}? @end display -C++ support is there now, and is documented. We'd love to add other +C++ and Java support is there now, and is documented. We'd love to add other languages; contributions are welcome. @node Beta Testing @@ -8647,109 +9413,9 @@ Start-Symbol}. It cannot be used in the grammar. @end deffn @deffn {Directive} %code @{@var{code}@} -@findex %code -This is the unqualified form of the @code{%code} directive. -It inserts @var{code} verbatim at the default location in the output.
-That default location is determined by the selected target language and/or -parser skeleton. - -@cindex Prologue -For the current C/C++ skeletons, the default location is the parser source code -file after the usual contents of the parser header file. -Thus, @code{%code} replaces the traditional Yacc prologue, -@code{%@{@var{code}%@}}, for most purposes. -For a detailed discussion, see @ref{Prologue Alternatives}. - -@comment For Java, the default location is inside the parser class. - -(Like all the Yacc prologue alternatives, this directive is experimental. -More user feedback will help to determine whether it should become a permanent -feature.) -@end deffn - -@deffn {Directive} %code @var{qualifier} @{@var{code}@} -This is the qualified form of the @code{%code} directive. -If you need to specify location-sensitive verbatim @var{code} that does not -belong at the default location selected by the unqualified @code{%code} form, -use this form instead. - -@var{qualifier} identifies the purpose of @var{code} and thus the location(s) -where Bison should generate it. -Not all values of @var{qualifier} are available for all target languages: - -@itemize @bullet -@findex %code requires -@item requires - -@itemize @bullet -@item Language(s): C, C++ - -@item Purpose: This is the best place to write dependency code required for -@code{YYSTYPE} and @code{YYLTYPE}. -In other words, it's the best place to define types referenced in @code{%union} -directives, and it's the best place to override Bison's default @code{YYSTYPE} -and @code{YYLTYPE} definitions. - -@item Location(s): The parser header file and the parser source code file -before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE} definitions. -@end itemize - -@item provides -@findex %code provides - -@itemize @bullet -@item Language(s): C, C++ - -@item Purpose: This is the best place to write additional definitions and -declarations that should be provided to other modules. 
- -@item Location(s): The parser header file and the parser source code file after -the Bison-generated @code{YYSTYPE}, @code{YYLTYPE}, and token definitions. -@end itemize - -@item top -@findex %code top - -@itemize @bullet -@item Language(s): C, C++ - -@item Purpose: The unqualified @code{%code} or @code{%code requires} should -usually be more appropriate than @code{%code top}. -However, occasionally it is necessary to insert code much nearer the top of the -parser source code file. -For example: - -@smallexample -%code top @{ - #define _GNU_SOURCE - #include -@} -@end smallexample - -@item Location(s): Near the top of the parser source code file. -@end itemize -@ignore -@item imports -@findex %code imports - -@itemize @bullet -@item Language(s): Java - -@item Purpose: This is the best place to write Java import directives. - -@item Location(s): The parser Java file after any Java package directive and -before any class definitions. -@end itemize -@end ignore -@end itemize - -(Like all the Yacc prologue alternatives, this directive is experimental. -More user feedback will help to determine whether it should become a permanent -feature.) - -@cindex Prologue -For a detailed discussion of how to use @code{%code} in place of the -traditional Yacc prologue for C/C++, see @ref{Prologue Alternatives}. +@deffnx {Directive} %code @var{qualifier} @{@var{code}@} +Insert @var{code} verbatim into output parser source. +@xref{Decl Summary,,%code}. @end deffn @deffn {Directive} %debug @@ -8768,6 +9434,12 @@ Precedence}. @end deffn @end ifset +@deffn {Directive} %define @var{define-variable} +@deffnx {Directive} %define @var{define-variable} @var{value} +Define a variable to adjust Bison's behavior. +@xref{Decl Summary,,%define}. +@end deffn + @deffn {Directive} %defines Bison declaration to create a header file meant for the scanner. @xref{Decl Summary}. 
@@ -8941,12 +9613,18 @@ Macro to pretend that an unrecoverable syntax error has occurred, by making @code{yyparse} return 1 immediately. The error reporting function @code{yyerror} is not called. @xref{Parser Function, ,The Parser Function @code{yyparse}}. + +For Java parsers, this functionality is invoked using @code{return YYABORT;} +instead. @end deffn @deffn {Macro} YYACCEPT Macro to pretend that a complete utterance of the language has been read, by making @code{yyparse} return 0 immediately. @xref{Parser Function, ,The Parser Function @code{yyparse}}. + +For Java parsers, this functionality is invoked using @code{return YYACCEPT;} +instead. @end deffn @deffn {Macro} YYBACKUP @@ -8987,6 +9665,9 @@ Macro to pretend that a syntax error has just been detected: call @code{yyerror} and then perform normal error recovery if possible (@pxref{Error Recovery}), or (if recovery is impossible) make @code{yyparse} return 1. @xref{Error Recovery}. + +For Java parsers, this functionality is invoked using @code{return YYERROR;} +instead. @end deffn @deffn {Function} yyerror @@ -9055,7 +9736,8 @@ Management}. @deffn {Variable} yynerrs Global variable which Bison increments each time it reports a syntax error. -(In a pure parser, it is a local variable within @code{yyparse}.) +(In a pure parser, it is a local variable within @code{yyparse}. In a +pure push parser, it is a member of yypstate.) @xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}. @end deffn @@ -9064,6 +9746,33 @@ The parser function produced by Bison; call this function to start parsing. @xref{Parser Function, ,The Parser Function @code{yyparse}}. @end deffn +@deffn {Function} yypstate_delete +The function to delete a parser instance, produced by Bison in push mode; +call this function to delete the memory associated with a parser. +@xref{Parser Delete Function, ,The Parser Delete Function +@code{yypstate_delete}}. 
+@end deffn + +@deffn {Function} yypstate_new +The function to create a parser instance, produced by Bison in push mode; +call this function to create a new parser. +@xref{Parser Create Function, ,The Parser Create Function +@code{yypstate_new}}. +@end deffn + +@deffn {Function} yypull_parse +The parser function produced by Bison in push mode; call this function to +parse the rest of the input stream. +@xref{Pull Parser Function, ,The Pull Parser Function +@code{yypull_parse}}. +@end deffn + +@deffn {Function} yypush_parse +The parser function produced by Bison in push mode; call this function to +parse a single token. @xref{Push Parser Function, ,The Push Parser Function +@code{yypush_parse}}. +@end deffn + @deffn {Macro} YYPARSE_PARAM An obsolete macro for specifying the name of a parameter that @code{yyparse} should accept. The use of this macro is deprecated, and @@ -9272,11 +9981,6 @@ grammatically indivisible. The piece of text it represents is a token. @node Copying This Manual @appendix Copying This Manual - -@menu -* GNU Free Documentation License:: License for copying this manual. -@end menu - @include fdl.texi @node Index