X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/110ef36a1a93c60cc83627492de40cb95aedf9b1..99c08fb6626f4aca4a7eb2e5d53dae43bc40771b:/doc/bison.texinfo diff --git a/doc/bison.texinfo b/doc/bison.texinfo index 67a43008..176cc7ff 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -2706,9 +2706,6 @@ feature test macros can affect the behavior of Bison-generated @findex %code requires @findex %code provides @findex %code top -(The prologue alternatives described here are experimental. -More user feedback will help to determine whether they should become permanent -features.) The functionality of @var{Prologue} sections can often be subtle and inflexible. @@ -3053,8 +3050,8 @@ A @dfn{nonterminal symbol} stands for a class of syntactically equivalent groupings. The symbol name is used in writing grammar rules. By convention, it should be all lower case. -Symbol names can contain letters, underscores, period, and (not at the -beginning) digits and dashes. Dashes in symbol names are a GNU +Symbol names can contain letters, underscores, periods, dashes, and (not +at the beginning) digits. Dashes in symbol names are a GNU extension, incompatible with @acronym{POSIX} Yacc. Terminal symbols that contain periods or dashes make little sense: since they are not valid symbols (in most programming languages) they are not exported as @@ -4569,7 +4566,7 @@ valid grammar. @subsection A Push Parser @cindex push parser @cindex push parser -@findex %define api.push_pull +@findex %define api.push-pull (The current push parsing interface is experimental and may evolve. More user feedback will help to stabilize it.) @@ -4585,10 +4582,10 @@ within a certain time period. Normally, Bison generates a pull parser. The following Bison declaration says that you want the parser to be a push -parser (@pxref{Decl Summary,,%define api.push_pull}): +parser (@pxref{Decl Summary,,%define api.push-pull}): @example -%define api.push_pull "push" +%define api.push-pull "push" @end example In almost all cases, you want to ensure that your push parser is also @@ -4599,7 +4596,7 @@ what you are doing, your declarations should look like this: @example %define api.pure -%define api.push_pull "push" +%define api.push-pull "push" @end example There is a major notable functional difference between the pure push parser @@ -4648,14 +4645,14 @@ for use by the next invocation of the @code{yypush_parse} function. Bison also supports both the push parser interface along with the pull parser interface in the same generated parser. In order to get this functionality, -you should replace the @code{%define api.push_pull "push"} declaration with the -@code{%define api.push_pull "both"} declaration. Doing this will create all of +you should replace the @code{%define api.push-pull "push"} declaration with the +@code{%define api.push-pull "both"} declaration. Doing this will create all of the symbols mentioned earlier along with the two extra symbols, @code{yyparse} and @code{yypull_parse}. @code{yyparse} can be used exactly as it normally would be used. However, the user should note that it is implemented in the generated parser by calling @code{yypull_parse}. This makes the @code{yyparse} function that is generated with the -@code{%define api.push_pull "both"} declaration slower than the normal +@code{%define api.push-pull "both"} declaration slower than the normal @code{yyparse} function. If the user calls the @code{yypull_parse} function it will parse the rest of the input stream. It is possible to @code{yypush_parse} tokens to select a subgrammar @@ -4672,8 +4669,8 @@ yypstate_delete (ps); @end example Adding the @code{%define api.pure} declaration does exactly the same thing to -the generated parser with @code{%define api.push_pull "both"} as it did for -@code{%define api.push_pull "push"}. +the generated parser with @code{%define api.push-pull "both"} as it did for +@code{%define api.push-pull "push"}. @node Decl Summary @subsection Bison Declaration Summary @@ -4753,10 +4750,6 @@ Thus, @code{%code} replaces the traditional Yacc prologue, For a detailed discussion, see @ref{Prologue Alternatives}. For Java, the default location is inside the parser class. - -(Like all the Yacc prologue alternatives, this directive is experimental. -More user feedback will help to determine whether it should become a permanent -feature.) @end deffn @deffn {Directive} %code @var{qualifier} @{@var{code}@} @@ -4834,10 +4827,6 @@ before any class definitions. @end itemize @end itemize -(Like all the Yacc prologue alternatives, this directive is experimental. -More user feedback will help to determine whether it should become a permanent -feature.) - @cindex Prologue For a detailed discussion of how to use @code{%code} in place of the traditional Yacc prologue for C/C++, see @ref{Prologue Alternatives}. @@ -4896,8 +4885,8 @@ Some of the accepted @var{variable}s are: @end itemize @c api.pure -@item api.push_pull -@findex %define api.push_pull +@item api.push-pull +@findex %define api.push-pull @itemize @bullet @item Language(s): C (deterministic parsers only) @@ -4911,7 +4900,7 @@ More user feedback will help to stabilize it.) @item Default Value: @code{"pull"} @end itemize -@c api.push_pull +@c api.push-pull @item error-verbose @findex %define error-verbose @@ -4930,9 +4919,9 @@ Boolean @c error-verbose -@item lr.default_reductions +@item lr.default-reductions @cindex default reductions -@findex %define lr.default_reductions +@findex %define lr.default-reductions @cindex delayed syntax errors @cindex syntax errors delayed @@ -4995,8 +4984,8 @@ without performing any extra reductions. @end itemize @end itemize -@item lr.keep_unreachable_states -@findex %define lr.keep_unreachable_states +@item lr.keep-unreachable-states +@findex %define lr.keep-unreachable-states @itemize @bullet @item Language(s): all @@ -5037,7 +5026,7 @@ states. However, Bison does not compute which goto actions are useless. @end itemize @end itemize -@c lr.keep_unreachable_states +@c lr.keep-unreachable-states @item lr.type @findex %define lr.type @@ -5108,7 +5097,7 @@ syntactically acceptable in that left context. Thus, the only difference in parsing behavior is that the canonical @acronym{LR} parser can report a syntax error as soon as possible without performing any unnecessary reductions. -@xref{Decl Summary,,lr.default_reductions}, for further details. +@xref{Decl Summary,,lr.default-reductions}, for further details. Even when canonical @acronym{LR} behavior is ultimately desired, @acronym{IELR}'s elimination of duplicate conflicts should still facilitate the development of a grammar. @@ -5201,10 +5190,47 @@ is not already defined, so that the debugging facilities are compiled. @item Default Value: @code{false} @end itemize -@end table @c parse.trace + +@item token.prefix +@findex %define token.prefix + +@itemize +@item Languages(s): all + +@item Purpose: +Add a prefix to the token names when generating their definition in the +target language. For instance + +@example +%token FILE for ERROR +%define token.prefix "TOK_" +%% +start: FILE for ERROR; +@end example + +@noindent +generates the definition of the symbols @code{TOK_FILE}, @code{TOK_for}, +and @code{TOK_ERROR} in the generated source files. In particular, the +scanner must use these prefixed token names, while the grammar itself +may still use the short names (as in the sample rule given above). The +generated informational files (@file{*.output}, @file{*.xml}, +@file{*.dot}) are not modified by this prefix. See @ref{Calc++ Parser} +and @ref{Calc++ Scanner}, for a complete example. + +@item Accepted Values: +Any string. Should be a valid identifier prefix in the target language, +in other words, it should typically be an identifier itself (sequence of +letters, underscores, and ---not at the beginning--- digits). + +@item Default Value: +empty +@end itemize +@c token.prefix + +@end table @end deffn -@c %define +@c ---------------------------------------------------------- %define @deffn {Directive} %defines Write a header file containing macro definitions for the token type @@ -5529,8 +5555,8 @@ exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @} More user feedback will help to stabilize it.) You call the function @code{yypush_parse} to parse a single token. This -function is available if either the @code{%define api.push_pull "push"} or -@code{%define api.push_pull "both"} declaration is used. +function is available if either the @code{%define api.push-pull "push"} or +@code{%define api.push-pull "both"} declaration is used. @xref{Push Decl, ,A Push Parser}. @deftypefun int yypush_parse (yypstate *yyps) @@ -5547,7 +5573,7 @@ is required to finish parsing the grammar. More user feedback will help to stabilize it.) You call the function @code{yypull_parse} to parse the rest of the input -stream. This function is available if the @code{%define api.push_pull "both"} +stream. This function is available if the @code{%define api.push-pull "both"} declaration is used. @xref{Push Decl, ,A Push Parser}. @@ -5563,8 +5589,8 @@ The value returned by @code{yypull_parse} is the same as for @code{yyparse}. More user feedback will help to stabilize it.) You call the function @code{yypstate_new} to create a new parser instance. -This function is available if either the @code{%define api.push_pull "push"} or -@code{%define api.push_pull "both"} declaration is used. +This function is available if either the @code{%define api.push-pull "push"} or +@code{%define api.push-pull "both"} declaration is used. @xref{Push Decl, ,A Push Parser}. @deftypefun yypstate *yypstate_new (void) @@ -5582,8 +5608,8 @@ allocated. More user feedback will help to stabilize it.) You call the function @code{yypstate_delete} to delete a parser instance. -function is available if either the @code{%define api.push_pull "push"} or -@code{%define api.push_pull "both"} declaration is used. +function is available if either the @code{%define api.push-pull "push"} or +@code{%define api.push-pull "both"} declaration is used. @xref{Push Decl, ,A Push Parser}. @deftypefun void yypstate_delete (yypstate *yyps) @@ -7473,8 +7499,8 @@ useless: STR; @command{bison} reports: @example -tmp.y: warning: 1 nonterminal useless in grammar -tmp.y: warning: 1 rule useless in grammar +calc.y: warning: 1 nonterminal useless in grammar +calc.y: warning: 1 rule useless in grammar calc.y:11.1-7: warning: nonterminal useless in grammar: useless calc.y:11.10-12: warning: rule useless in grammar: useless: STR calc.y: conflicts: 7 shift/reduce @@ -8788,13 +8814,14 @@ The code between @samp{%code @{} and @samp{@}} is output in the @noindent The token numbered as 0 corresponds to end of file; the following line -allows for nicer error messages referring to ``end of file'' instead -of ``$end''. Similarly user friendly named are provided for each -symbol. Note that the tokens names are prefixed by @code{TOKEN_} to -avoid name clashes. +allows for nicer error messages referring to ``end of file'' instead of +``$end''. Similarly user friendly names are provided for each symbol. +To avoid name clashes in the generated files (@pxref{Calc++ Scanner}), +prefix tokens with @code{TOK_} (@pxref{Decl Summary,, token.prefix}). @comment file: calc++-parser.yy @example +%define token.prefix "TOK_" %token END 0 "end of file" %token ASSIGN ":=" %token IDENTIFIER "identifier" @@ -8824,22 +8851,24 @@ The grammar itself is straightforward. %start unit; unit: assignments exp @{ driver.result = $2; @}; -assignments: assignments assignment @{@} - | /* Nothing. */ @{@}; +assignments: + assignments assignment @{@} +| /* Nothing. */ @{@}; assignment: - "identifier" ":=" exp + "identifier" ":=" exp @{ driver.variables[*$1] = $3; delete $1; @}; %left '+' '-'; %left '*' '/'; -exp: exp '+' exp @{ $$ = $1 + $3; @} - | exp '-' exp @{ $$ = $1 - $3; @} - | exp '*' exp @{ $$ = $1 * $3; @} - | exp '/' exp @{ $$ = $1 / $3; @} - | '(' exp ')' @{ $$ = $2; @} - | "identifier" @{ $$ = driver.variables[*$1]; delete $1; @} - | "number" @{ $$ = $1; @}; +exp: + exp '+' exp @{ $$ = $1 + $3; @} +| exp '-' exp @{ $$ = $1 - $3; @} +| exp '*' exp @{ $$ = $1 * $3; @} +| exp '/' exp @{ $$ = $1 / $3; @} +| '(' exp ')' @{ $$ = $2; @} +| "identifier" @{ $$ = driver.variables[*$1]; delete $1; @} +| "number" @{ $$ = $1; @}; %% @end example @@ -8880,10 +8909,10 @@ parser's to get the set of defined tokens. # undef yywrap # define yywrap() 1 -/* By default yylex returns int, we use token_type. - Unfortunately yyterminate by default returns 0, which is +/* By default yylex returns an int; we use token_type. + The default yyterminate implementation returns 0, which is not of token_type. */ -#define yyterminate() return token::END +#define yyterminate() return TOKEN(END) %@} @end example @@ -8931,28 +8960,32 @@ preceding tokens. Comments would be treated equally. @end example @noindent -The rules are simple, just note the use of the driver to report errors. -It is convenient to use a typedef to shorten -@code{yy::calcxx_parser::token::identifier} into -@code{token::identifier} for instance. +The rules are simple. The driver is used to report errors. It is +convenient to use a macro to shorten +@code{yy::calcxx_parser::token::TOK_@var{Name}} into +@code{TOKEN(@var{Name})}; note the token prefix, @code{TOK_}. @comment file: calc++-scanner.ll @example %@{ - typedef yy::calcxx_parser::token token; +# define TOKEN(Name) \ + yy::calcxx_parser::token::TOK_ ## Name %@} /* Convert ints to the actual type of tokens. */ [-+*/()] return yy::calcxx_parser::token_type (yytext[0]); -":=" return token::ASSIGN; +":=" return TOKEN(ASSIGN); @{int@} @{ errno = 0; long n = strtol (yytext, NULL, 10); if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE)) driver.error (*yylloc, "integer is out of range"); yylval->ival = n; - return token::NUMBER; + return TOKEN(NUMBER); +@} +@{id@} @{ + yylval->sval = new std::string (yytext); + return TOKEN(IDENTIFIER); @} -@{id@} yylval->sval = new std::string (yytext); return token::IDENTIFIER; . driver.error (*yylloc, "invalid character"); %% @end example @@ -9054,7 +9087,7 @@ and @code{%define api.pure} directives does not do anything when used in Java. Push parsers are currently unsupported in Java and @code{%define -api.push_pull} have no effect. +api.push-pull} have no effect. @acronym{GLR} parsers are currently unsupported in Java. Do not use the @code{glr-parser} directive. @@ -10526,7 +10559,7 @@ committee document contributing to what became the Algol 60 report. @item Consistent State A state containing only one possible action. -@xref{Decl Summary,,lr.default_reductions}. +@xref{Decl Summary,,lr.default-reductions}. @item Context-free grammars Grammars specified as rules that can be applied regardless of context. @@ -10541,7 +10574,7 @@ contains no other action for the lookahead token. In permitted parser states, Bison declares the reduction with the largest lookahead set to be the default reduction and removes that lookahead set. -@xref{Decl Summary,,lr.default_reductions}. +@xref{Decl Summary,,lr.default-reductions}. @item Dynamic allocation Allocation of memory that occurs during execution, rather than at