X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/8a2800e787cfb3e8f59480b8f50ba204e01f9071..98ae96438ebb4465c777a7849f1b4ca222e760e3:/doc/bison.texinfo?ds=sidebyside diff --git a/doc/bison.texinfo b/doc/bison.texinfo index 4cc32bea..01dccb41 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -145,9 +145,9 @@ The Concepts of Bison Writing @acronym{GLR} Parsers -* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars -* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities -* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler +* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars +* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities +* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler Examples @@ -225,6 +225,7 @@ Tracking Locations Bison Declarations +* Require Decl:: Requiring a Bison version. * Token Decl:: Declaring terminal symbols. * Precedence Decl:: Declaring terminals with precedence and associativity. * Union Decl:: Declaring the set of all semantic value types. @@ -732,9 +733,9 @@ user-defined function on the resulting values to produce an arbitrary merged result. @menu -* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars -* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities -* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler +* Simple GLR Parsers:: Using @acronym{GLR} parsers on unambiguous grammars +* Merging GLR Parses:: Using @acronym{GLR} parsers to resolve ambiguities +* Compiler Requirements:: @acronym{GLR} parsers require a modern C compiler @end menu @node Simple GLR Parsers @@ -1197,11 +1198,13 @@ function @code{yyerror} and the parser function @code{yyparse} itself. This also includes numerous identifiers used for internal purposes. 
Therefore, you should avoid using C identifiers starting with @samp{yy} or @samp{YY} in the Bison grammar file except for the ones defined in -this manual. +this manual. Also, you should avoid using the C identifiers +@samp{malloc} and @samp{free} for anything other than their usual +meanings. In some cases the Bison parser file includes system headers, and in those cases your code should respect the identifiers reserved by those -headers. On some non-@acronym{GNU} hosts, @code{}, +headers. On some non-@acronym{GNU} hosts, @code{}, @code{}, @code{}, and @code{} are included as needed to declare memory allocators and related types. @code{} is included if message translation is in use @@ -3546,6 +3549,7 @@ it explicitly (@pxref{Language and Grammar, ,Languages and Context-Free Grammars}). @menu +* Require Decl:: Requiring a Bison version. * Token Decl:: Declaring terminal symbols. * Precedence Decl:: Declaring terminals with precedence and associativity. * Union Decl:: Declaring the set of all semantic value types. @@ -3558,6 +3562,20 @@ Grammars}). * Decl Summary:: Table of all Bison declarations. @end menu +@node Require Decl +@subsection Require a Version of Bison +@cindex version requirement +@cindex requiring a version of Bison +@findex %require + +You may require the minimum version of Bison to process the grammar. If +the requirement is not met, @command{bison} exits with an error (exit +status 63). + +@example +%require "@var{version}" +@end example + @node Token Decl @subsection Token Type Names @cindex declaring token type names @@ -3792,28 +3810,28 @@ For instance, if your locations use a file name, you may use @cindex freeing discarded symbols @findex %destructor -Some symbols can be discarded by the parser. During error -recovery (@pxref{Error Recovery}), symbols already pushed -on the stack and tokens coming from the rest of the file -are discarded until the parser falls on its feet. 
If the parser -runs out of memory, all the symbols on the stack must be discarded. -Even if the parser succeeds, it must discard the start symbol. +Some symbols can be discarded by the parser. During error recovery +(@pxref{Error Recovery}), symbols already pushed on the stack and tokens +coming from the rest of the file are discarded until the parser falls on +its feet. If the parser runs out of memory, all the symbols on the +stack must be discarded. Even if the parser succeeds, it must discard +the start symbol. When discarded symbols convey heap based information, this memory is lost. While this behavior can be tolerable for batch parsers, such as -in traditional compilers, it is unacceptable for programs like shells -or protocol implementations that may parse and execute indefinitely. +in traditional compilers, it is unacceptable for programs like shells or +protocol implementations that may parse and execute indefinitely. The @code{%destructor} directive defines code that is called when a symbol is discarded. @deffn {Directive} %destructor @{ @var{code} @} @var{symbols} @findex %destructor -Invoke @var{code} whenever the parser discards one of the -@var{symbols}. Within @var{code}, @code{$$} designates the semantic -value associated with the discarded symbol. The additional -parser parameters are also available -(@pxref{Parser Function, , The Parser Function @code{yyparse}}). +Invoke @var{code} whenever the parser discards one of the @var{symbols}. +Within @var{code}, @code{$$} designates the semantic value associated +with the discarded symbol. The additional parser parameters are also +available (@pxref{Parser Function, , The Parser Function +@code{yyparse}}). @strong{Warning:} as of Bison 2.1, this feature is still experimental, as there has not been enough user feedback. In particular, @@ -4155,6 +4173,11 @@ Request a pure (reentrant) parser program (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). 
@end deffn +@deffn {Directive} %require "@var{version}" +Require version @var{version} or higher of Bison. @xref{Require Decl, , +Require a Version of Bison}. +@end deffn + @deffn {Directive} %token-table Generate an array of token names in the parser file. The name of the array is @code{yytname}; @code{yytname[@var{i}]} is the name of the @@ -6931,12 +6954,13 @@ for a complete and accurate documentation. The @code{%union} directive works as for C, see @ref{Union Decl, ,The Collection of Value Types}. In particular it produces a genuine @code{union}@footnote{In the future techniques to allow complex types -within pseudo-unions (variants) might be implemented to alleviate -these issues.}, which have a few specific features in C++. +within pseudo-unions (similar to Boost variants) might be implemented to +alleviate these issues.}, which have a few specific features in C++. @itemize @minus @item -The name @code{YYSTYPE} also denotes @samp{union YYSTYPE}. You may -forward declare it just with @samp{union YYSTYPE;}. +The type @code{YYSTYPE} is defined but its use is discouraged: rather +you should refer to the parser's encapsulated type +@code{yy::parser::semantic_type}. @item Non POD (Plain Old Data) types cannot be used. C++ forbids any instance of classes with constructors in unions: only @emph{pointers} @@ -7137,7 +7161,8 @@ transforming the simple parsing context structure into a fully blown The declaration of this driver class, @file{calc++-driver.hh}, is as follows. The first part includes the CPP guard and imports the -required standard library components. +required standard library components, and the declaration of the parser +class. @comment file: calc++-driver.hh @example @@ -7145,26 +7170,9 @@ required standard library components. # define CALCXX_DRIVER_HH # include <string> # include <map> +# include "calc++-parser.hh" @end example -@noindent -Then come forward declarations.
Because the parser uses the parsing -driver and reciprocally, simple inclusions of header files will not -do. Because the driver's declaration is the one that will be imported -by the rest of the project, it is saner to forward declare the -parser's information here. - -@comment file: calc++-driver.hh -@example -// Forward declarations. -union YYSTYPE; -namespace yy -@{ - class location; - class calcxx_parser; -@} -class calcxx_driver; -@end example @noindent Then comes the declaration of the scanning function. Flex expects @@ -7176,7 +7184,9 @@ factor both as follows. @example // Announce to Flex the prototype we want for lexing function, ... # define YY_DECL \ - int yylex (YYSTYPE* yylval, yy::location* yylloc, calcxx_driver& driver) + int yylex (yy::calcxx_parser::semantic_type* yylval, \ + yy::calcxx_parser::location_type* yylloc, \ + calcxx_driver& driver) // ... and declare it for the parser's sake. YY_DECL; @end example @@ -7286,19 +7296,33 @@ calcxx_driver::error (const std::string& m) @node Calc++ Parser @subsection Calc++ Parser -The parser definition file @file{calc++-parser.yy} starts by asking -for the C++ skeleton, the creation of the parser header file, and -specifies the name of the parser class. It then includes the required -headers. +The parser definition file @file{calc++-parser.yy} starts by asking for +the C++ LALR(1) skeleton, the creation of the parser header file, and +specifies the name of the parser class. Because the C++ skeleton +changed several times, it is safer to require the version you designed +the grammar for. @comment file: calc++-parser.yy @example %skeleton "lalr1.cc" /* -*- C++ -*- */ -%define "parser_class_name" "calcxx_parser" +%require "2.1a" %defines +%define "parser_class_name" "calcxx_parser" +@end example + +@noindent +Then come the declarations/inclusions needed to define the +@code{%union}. Because the parser uses the parsing driver and +reciprocally, both cannot include the header of the other. 
 Because the +driver's header needs detailed knowledge about the parser class (in +particular its inner types), it is the parser's header which will simply +use a forward declaration of the driver. + +@comment file: calc++-parser.yy +@example %@{ # include <string> -# include "calc++-driver.hh" +class calcxx_driver; %@} @end example @@ -7354,6 +7378,19 @@ them. @}; @end example +@noindent +The code between @samp{%@{} and @samp{%@}} after the introduction of the +@samp{%union} is output in the @file{*.cc} file; it needs detailed +knowledge about the driver. + +@comment file: calc++-parser.yy +@example +%@{ +# include "calc++-driver.hh" +%@} +@end example + + @noindent The token numbered as 0 corresponds to end of file; the following line allows for nicer error messages referring to ``end of file'' instead @@ -7363,11 +7400,11 @@ avoid name clashes. @comment file: calc++-parser.yy @example -%token TOKEN_EOF 0 "end of file" -%token TOKEN_ASSIGN ":=" -%token <sval> TOKEN_IDENTIFIER "identifier" -%token <ival> TOKEN_NUMBER "number" -%type <ival> exp "expression" +%token END 0 "end of file" +%token ASSIGN ":=" +%token <sval> IDENTIFIER "identifier" +%token <ival> NUMBER "number" +%type <ival> exp "expression" @end example @noindent @@ -7394,7 +7431,7 @@ unit: assignments exp @{ driver.result = $2; @}; assignments: assignments assignment @{@} | /* Nothing. */ @{@}; -assignment: TOKEN_IDENTIFIER ":=" exp @{ driver.variables[*$1] = $3; @}; +assignment: "identifier" ":=" exp @{ driver.variables[*$1] = $3; @}; %left '+' '-'; %left '*' '/'; @@ -7402,8 +7439,8 @@ exp: exp '+' exp @{ $$ = $1 + $3; @} | exp '-' exp @{ $$ = $1 - $3; @} | exp '*' exp @{ $$ = $1 * $3; @} | exp '/' exp @{ $$ = $1 / $3; @} - | TOKEN_IDENTIFIER @{ $$ = driver.variables[*$1]; @} - | TOKEN_NUMBER @{ $$ = $1; @}; + | "identifier" @{ $$ = driver.variables[*$1]; @} + | "number" @{ $$ = $1; @}; %% @end example @@ -7483,22 +7520,28 @@ preceding tokens. Comments would be treated equally.
@end example @noindent -The rules are simple, just note the use of the driver to report -errors. +The rules are simple, just note the use of the driver to report errors. +It is convenient to use a typedef to shorten +@code{yy::calcxx_parser::token::identifier} into +@code{token::identifier} for instance. @comment file: calc++-scanner.ll @example +%@{ + typedef yy::calcxx_parser::token token; +%@} + [-+*/] return yytext[0]; -":=" return TOKEN_ASSIGN; +":=" return token::ASSIGN; @{int@} @{ errno = 0; long n = strtol (yytext, NULL, 10); if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE)) driver.error (*yylloc, "integer is out of range"); yylval->ival = n; - return TOKEN_NUMBER; + return token::NUMBER; @} -@{id@} yylval->sval = new std::string (yytext); return TOKEN_IDENTIFIER; +@{id@} yylval->sval = new std::string (yytext); return token::IDENTIFIER; . driver.error (*yylloc, "invalid character"); %% @end example @@ -7949,6 +7992,11 @@ Bison declaration to request a pure (reentrant) parser. @xref{Pure Decl, ,A Pure (Reentrant) Parser}. @end deffn +@deffn {Directive} %require "@var{version}" +Require version @var{version} or higher of Bison. @xref{Require Decl, , +Require a Version of Bison}. +@end deffn + @deffn {Directive} %right Bison declaration to assign right associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. @@ -8127,10 +8175,7 @@ the parser will use @code{malloc} to extend its stacks. If defined to reserved for future Bison extensions. If not defined, @code{YYSTACK_USE_ALLOCA} defaults to 0. -If you define @code{YYSTACK_USE_ALLOCA} to 1, it is your -responsibility to make sure that @code{alloca} is visible, e.g., by -using @acronym{GCC} or by including @code{}.
Furthermore, -in the all-too-common case where your code may run on a host with a +In the all-too-common case where your code may run on a host with a limited stack and with unreliable stack-overflow checking, you should set @code{YYMAXDEPTH} to a value that cannot possibly result in unchecked stack overflow on any of your target hosts when