X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/c827f760f640ce7a47ed50101921c3a85adf8cde..feeb0edaf114181320636090d082cef9cd144da2:/doc/bison.texinfo diff --git a/doc/bison.texinfo b/doc/bison.texinfo index 907c3308..ef803899 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -208,6 +208,7 @@ Bison Declarations * Precedence Decl:: Declaring terminals with precedence and associativity. * Union Decl:: Declaring the set of all semantic value types. * Type Decl:: Declaring the choice of type for a nonterminal symbol. +* Destructor Decl:: Declaring how symbols are freed. * Expect Decl:: Suppressing warnings about shift/reduce conflicts. * Start Decl:: Specifying the start symbol. * Pure Decl:: Requesting a reentrant parser. @@ -268,7 +269,6 @@ Invoking Bison * Bison Options:: All the options described in detail, in alphabetical order by short options. * Option Cross Key:: Alphabetical list of long options. -* VMS Invocation:: Bison command syntax on @acronym{VMS}. Frequently Asked Questions @@ -412,42 +412,41 @@ more information on this. @cindex generalized @acronym{LR} (@acronym{GLR}) parsing @cindex ambiguous grammars @cindex non-deterministic parsing -Parsers for @acronym{LALR}(1) grammars are @dfn{deterministic}, -meaning roughly that -the next grammar rule to apply at any point in the input is uniquely -determined by the preceding input and a fixed, finite portion (called -a @dfn{look-ahead}) of the remaining input. -A context-free grammar can be @dfn{ambiguous}, meaning that -there are multiple ways to apply the grammar rules to get the some inputs. -Even unambiguous grammars can be @dfn{non-deterministic}, meaning that no -fixed look-ahead always suffices to determine the next grammar rule to apply. -With the proper declarations, Bison is also able to parse these more general -context-free grammars, using a technique known as @acronym{GLR} parsing (for -Generalized @acronym{LR}). 
Bison's @acronym{GLR} parsers are able to -handle any context-free -grammar for which the number of possible parses of any given string -is finite. + +Parsers for @acronym{LALR}(1) grammars are @dfn{deterministic}, meaning +roughly that the next grammar rule to apply at any point in the input is +uniquely determined by the preceding input and a fixed, finite portion +(called a @dfn{look-ahead}) of the remaining input. A context-free +grammar can be @dfn{ambiguous}, meaning that there are multiple ways to +apply the grammar rules to get the some inputs. Even unambiguous +grammars can be @dfn{non-deterministic}, meaning that no fixed +look-ahead always suffices to determine the next grammar rule to apply. +With the proper declarations, Bison is also able to parse these more +general context-free grammars, using a technique known as @acronym{GLR} +parsing (for Generalized @acronym{LR}). Bison's @acronym{GLR} parsers +are able to handle any context-free grammar for which the number of +possible parses of any given string is finite. @cindex symbols (abstract) @cindex token @cindex syntactic grouping @cindex grouping, syntactic -In the formal grammatical rules for a language, each kind of syntactic unit -or grouping is named by a @dfn{symbol}. Those which are built by grouping -smaller constructs according to grammatical rules are called +In the formal grammatical rules for a language, each kind of syntactic +unit or grouping is named by a @dfn{symbol}. Those which are built by +grouping smaller constructs according to grammatical rules are called @dfn{nonterminal symbols}; those which can't be subdivided are called @dfn{terminal symbols} or @dfn{token types}. We call a piece of input corresponding to a single terminal symbol a @dfn{token}, and a piece corresponding to a single nonterminal symbol a @dfn{grouping}. We can use the C language as an example of what symbols, terminal and -nonterminal, mean. 
The tokens of C are identifiers, constants (numeric and -string), and the various keywords, arithmetic operators and punctuation -marks. So the terminal symbols of a grammar for C include `identifier', -`number', `string', plus one symbol for each keyword, operator or -punctuation mark: `if', `return', `const', `static', `int', `char', -`plus-sign', `open-brace', `close-brace', `comma' and many more. (These -tokens can be subdivided into characters, but that is a matter of +nonterminal, mean. The tokens of C are identifiers, constants (numeric +and string), and the various keywords, arithmetic operators and +punctuation marks. So the terminal symbols of a grammar for C include +`identifier', `number', `string', plus one symbol for each keyword, +operator or punctuation mark: `if', `return', `const', `static', `int', +`char', `plus-sign', `open-brace', `close-brace', `comma' and many more. +(These tokens can be subdivided into characters, but that is a matter of lexicography, not grammar.) Here is a simple C function subdivided into tokens: @@ -642,28 +641,28 @@ from the values of the two subexpressions. @cindex conflicts @cindex shift/reduce conflicts -In some grammars, there will be cases where Bison's standard @acronym{LALR}(1) -parsing algorithm cannot decide whether to apply a certain grammar rule -at a given point. That is, it may not be able to decide (on the basis -of the input read so far) which of two possible reductions (applications -of a grammar rule) applies, or whether to apply a reduction or read more -of the input and apply a reduction later in the input. These are known -respectively as @dfn{reduce/reduce} conflicts (@pxref{Reduce/Reduce}), -and @dfn{shift/reduce} conflicts (@pxref{Shift/Reduce}). - -To use a grammar that is not easily modified to be @acronym{LALR}(1), a more -general parsing algorithm is sometimes necessary. 
If you include +In some grammars, there will be cases where Bison's standard +@acronym{LALR}(1) parsing algorithm cannot decide whether to apply a +certain grammar rule at a given point. That is, it may not be able to +decide (on the basis of the input read so far) which of two possible +reductions (applications of a grammar rule) applies, or whether to apply +a reduction or read more of the input and apply a reduction later in the +input. These are known respectively as @dfn{reduce/reduce} conflicts +(@pxref{Reduce/Reduce}), and @dfn{shift/reduce} conflicts +(@pxref{Shift/Reduce}). + +To use a grammar that is not easily modified to be @acronym{LALR}(1), a +more general parsing algorithm is sometimes necessary. If you include @code{%glr-parser} among the Bison declarations in your file -(@pxref{Grammar Outline}), the result will be a Generalized -@acronym{LR} (@acronym{GLR}) -parser. These parsers handle Bison grammars that contain no unresolved -conflicts (i.e., after applying precedence declarations) identically to -@acronym{LALR}(1) parsers. However, when faced with unresolved -shift/reduce and reduce/reduce conflicts, @acronym{GLR} parsers use -the simple expedient of doing -both, effectively cloning the parser to follow both possibilities. Each -of the resulting parsers can again split, so that at any given time, -there can be any number of possible parses being explored. The parsers +(@pxref{Grammar Outline}), the result will be a Generalized @acronym{LR} +(@acronym{GLR}) parser. These parsers handle Bison grammars that +contain no unresolved conflicts (i.e., after applying precedence +declarations) identically to @acronym{LALR}(1) parsers. However, when +faced with unresolved shift/reduce and reduce/reduce conflicts, +@acronym{GLR} parsers use the simple expedient of doing both, +effectively cloning the parser to follow both possibilities. 
Each of +the resulting parsers can again split, so that at any given time, there +can be any number of possible parses being explored. The parsers proceed in lockstep; that is, all of them consume (shift) a given input symbol before any of them proceed to the next. Each of the cloned parsers eventually meets one of two possible fates: either it runs into @@ -682,7 +681,7 @@ involved, or by performing both actions, and then calling a designated user-defined function on the resulting values to produce an arbitrary merged result. -Let's consider an example, vastly simplified from C++. +Let's consider an example, vastly simplified from a C++ grammar. @example %@{ @@ -706,20 +705,20 @@ stmt : expr ';' %dprec 1 | decl %dprec 2 ; -expr : ID @{ printf ("%s ", $$); @} +expr : ID @{ printf ("%s ", $$); @} | TYPENAME '(' expr ')' - @{ printf ("%s ", $1); @} - | expr '+' expr @{ printf ("+ "); @} - | expr '=' expr @{ printf ("= "); @} + @{ printf ("%s ", $1); @} + | expr '+' expr @{ printf ("+ "); @} + | expr '=' expr @{ printf ("= "); @} ; decl : TYPENAME declarator ';' - @{ printf ("%s ", $1); @} + @{ printf ("%s ", $1); @} | TYPENAME declarator '=' expr ';' - @{ printf ("%s ", $1); @} + @{ printf ("%s ", $1); @} ; -declarator : ID @{ printf ("\"%s\" ", $1); @} +declarator : ID @{ printf ("\"%s\" ", $1); @} | '(' declarator ')' ; @end example @@ -810,6 +809,32 @@ as both an @code{expr} and a @code{decl}, and print "x" y z + T x T y z + = @end example +@sp 1 + +@cindex @code{incline} +@cindex @acronym{GLR} parsers and @code{inline} +Note that the @acronym{GLR} parsers require an ISO C89 compiler. In +addition, they use the @code{inline} keyword, which is not C89, but a +common extension. It is up to the user of these parsers to handle +portability issues. For instance, if using Autoconf and the Autoconf +macro @code{AC_C_INLINE}, a mere + +@example +%@{ +#include +%@} +@end example + +@noindent +will suffice. Otherwise, we suggest + +@example +%@{ +#if ! defined __GNUC__ && ! 
defined inline +# define inline +#endif +%@} +@end example @node Locations Overview @section Locations @@ -1088,18 +1113,18 @@ input: /* empty */ ; line: '\n' - | exp '\n' @{ printf ("\t%.10g\n", $1); @} + | exp '\n' @{ printf ("\t%.10g\n", $1); @} ; -exp: NUM @{ $$ = $1; @} - | exp exp '+' @{ $$ = $1 + $2; @} - | exp exp '-' @{ $$ = $1 - $2; @} - | exp exp '*' @{ $$ = $1 * $2; @} - | exp exp '/' @{ $$ = $1 / $2; @} - /* Exponentiation */ - | exp exp '^' @{ $$ = pow ($1, $2); @} - /* Unary minus */ - | exp 'n' @{ $$ = -$1; @} +exp: NUM @{ $$ = $1; @} + | exp exp '+' @{ $$ = $1 + $2; @} + | exp exp '-' @{ $$ = $1 - $2; @} + | exp exp '*' @{ $$ = $1 * $2; @} + | exp exp '/' @{ $$ = $1 / $2; @} + /* Exponentiation */ + | exp exp '^' @{ $$ = pow ($1, $2); @} + /* Unary minus */ + | exp 'n' @{ $$ = -$1; @} ; %% @end example @@ -1348,7 +1373,7 @@ main (void) When @code{yyparse} detects a syntax error, it calls the error reporting function @code{yyerror} to print an error message (usually but not -always @code{"parse error"}). It is up to the programmer to supply +always @code{"syntax error"}). It is up to the programmer to supply @code{yyerror} (@pxref{Interface, ,Parser C-Language Interface}), so here is the definition we will use: @@ -1357,7 +1382,7 @@ here is the definition we will use: #include void -yyerror (const char *s) /* called by yyparse on error */ +yyerror (const char *s) /* Called by yyparse on error. */ @{ printf ("%s\n", s); @} @@ -1558,7 +1583,7 @@ line: '\n' @end example This addition to the grammar allows for simple error recovery in the -event of a parse error. If an expression that cannot be evaluated is +event of a syntax error. If an expression that cannot be evaluated is read, the error will be recognized by the third rule for @code{line}, and parsing will continue. (The @code{yyerror} function is still called upon to print its message as well.) 
The action executes the statement @@ -1706,11 +1731,15 @@ int yylex (void) @{ int c; +@end group +@group /* Skip white space. */ while ((c = getchar ()) == ' ' || c == '\t') ++yylloc.last_column; +@end group +@group /* Step. */ yylloc.first_line = yylloc.last_line; yylloc.first_column = yylloc.last_column; @@ -1832,27 +1861,30 @@ Note that multiple assignment and nested function calls are permitted. Here are the C and Bison declarations for the multi-function calculator. @smallexample +@group %@{ #include /* For math functions, cos(), sin(), etc. */ -#include "calc.h" /* Contains definition of `symrec' */ +#include "calc.h" /* Contains definition of `symrec' */ %@} +@end group +@group %union @{ -double val; /* For returning numbers. */ -symrec *tptr; /* For returning symbol-table pointers */ + double val; /* For returning numbers. */ + symrec *tptr; /* For returning symbol-table pointers. */ @} - -%token NUM /* Simple double precision number */ -%token VAR FNCT /* Variable and Function */ +@end group +%token NUM /* Simple double precision number. */ +%token VAR FNCT /* Variable and Function. */ %type exp +@group %right '=' %left '-' '+' %left '*' '/' %left NEG /* Negation--unary minus */ %right '^' /* Exponentiation */ - +@end group /* Grammar follows */ - %% @end smallexample @@ -1886,16 +1918,21 @@ Most of them are copied directly from @code{calc}; three rules, those which mention @code{VAR} or @code{FNCT}, are new. @smallexample +@group input: /* empty */ | input line ; +@end group +@group line: '\n' | exp '\n' @{ printf ("\t%.10g\n", $1); @} | error '\n' @{ yyerrok; @} ; +@end group +@group exp: NUM @{ $$ = $1; @} | VAR @{ $$ = $1->value.var; @} | VAR '=' exp @{ $$ = $3; $1->value.var = $3; @} @@ -1908,6 +1945,7 @@ exp: NUM @{ $$ = $1; @} | exp '^' exp @{ $$ = pow ($1, $3); @} | '(' exp ')' @{ $$ = $2; @} ; +@end group /* End of grammar */ %% @end smallexample @@ -1929,7 +1967,9 @@ provides for either functions or variables to be placed in the table. 
@group /* Function type. */ typedef double (*func_t) (double); +@end group +@group /* Data type for links in the chain of symbols. */ struct symrec @{ @@ -1960,9 +2000,9 @@ function that initializes the symbol table. Here it is, and @code{init_table} as well: @smallexample -@group #include +@group int main (void) @{ @@ -1973,11 +2013,13 @@ main (void) @group void -yyerror (const char *s) /* Called by yyparse on error */ +yyerror (const char *s) /* Called by yyparse on error. */ @{ printf ("%s\n", s); @} +@end group +@group struct init @{ char *fname; @@ -1996,7 +2038,9 @@ struct init arith_fncts[] = "sqrt", sqrt, 0, 0 @}; +@end group +@group /* The symbol table: a chain of `struct symrec'. */ symrec *sym_table = (symrec *) 0; @end group @@ -2072,7 +2116,9 @@ operators in @code{yylex}. @smallexample @group #include +@end group +@group int yylex (void) @{ @@ -2120,7 +2166,7 @@ yylex (void) if (i == length) @{ length *= 2; - symbuf = (char *)realloc (symbuf, length + 1); + symbuf = (char *) realloc (symbuf, length + 1); @} /* Add this character to the buffer. */ symbuf[i++] = c; @@ -2212,6 +2258,8 @@ appropriate delimiters: @end example Comments enclosed in @samp{/* @dots{} */} may appear in any of the sections. +As a @acronym{GNU} extension, @samp{//} introduces a comment that +continues until end of line. @menu * Prologue:: Syntax and usage of the prologue. @@ -2254,8 +2302,8 @@ can be done with two @var{Prologue} blocks, one before and one after the @} %@{ -static void yyprint(FILE *, int, YYSTYPE); -#define YYPRINT(F, N, L) yyprint(F, N, L) +static void print_token_value (FILE *, int, YYSTYPE); +#define YYPRINT(F, N, L) print_token_value (F, N, L) %@} @dots{} @@ -2360,7 +2408,9 @@ All the usual escape sequences used in character literals in C can be used in Bison as well, but you must not use the null character as a character literal because its numeric code, zero, signifies end-of-input (@pxref{Calling Convention, ,Calling Convention -for @code{yylex}}). 
+for @code{yylex}}). Also, unlike standard C, trigraphs have no +special meaning in Bison character literals, nor is backslash-newline +allowed. @item @cindex string token @@ -2387,9 +2437,10 @@ does not enforce this convention, but if you depart from it, people who read your program will be confused. All the escape sequences used in string literals in C can be used in -Bison as well. A literal string token must contain two or more -characters; for a token containing just one character, use a character -token (see above). +Bison as well. However, unlike Standard C, trigraphs have no special +meaning in Bison string literals, nor is backslash-newline allowed. A +literal string token must contain two or more characters; for a token +containing just one character, use a character token (see above). @end itemize How you choose to write a terminal symbol has no effect on its @@ -2691,7 +2742,13 @@ is to compute a semantic value for the grouping built by the rule from the semantic values associated with tokens or smaller groupings. An action consists of C statements surrounded by braces, much like a -compound statement in C@. It can be placed at any position in the rule; +compound statement in C@. An action can contain any sequence of C +statements. Bison does not look for trigraphs, though, so if your C +code uses trigraphs you should ensure that they do not affect the +nesting of braces or the boundaries of comments, strings, or character +literals. + +An action can be placed at any position in the rule; it is executed at that position. Most rules have just one action at the end of the rule, following all the components. 
Actions in the middle of a rule are tricky and used only for special purposes (@pxref{Mid-Rule @@ -2738,11 +2795,11 @@ a-or-b: 'a'|'b' @{ a_or_b_found = 1; @}; @cindex default action If you don't specify an action for a rule, Bison supplies a default: -@w{@code{$$ = $1}.} Thus, the value of the first symbol in the rule becomes -the value of the whole rule. Of course, the default rule is valid only -if the two data types match. There is no meaningful default action for -an empty rule; every empty rule must have an explicit action unless the -rule's value does not matter. +@w{@code{$$ = $1}.} Thus, the value of the first symbol in the rule +becomes the value of the whole rule. Of course, the default action is +valid only if the two data types match. There is no meaningful default +action for an empty rule; every empty rule must have an explicit action +unless the rule's value does not matter. @code{$@var{n}} with @var{n} zero or negative is allowed for reference to tokens and groupings on the stack @emph{before} those that match the @@ -3164,6 +3221,7 @@ Grammars}). * Precedence Decl:: Declaring terminals with precedence and associativity. * Union Decl:: Declaring the set of all semantic value types. * Type Decl:: Declaring the choice of type for a nonterminal symbol. +* Destructor Decl:: Declaring how symbols are freed. * Expect Decl:: Suppressing warnings about shift/reduce conflicts. * Start Decl:: Specifying the start symbol. * Pure Decl:: Requesting a reentrant parser. @@ -3354,6 +3412,71 @@ use the same @code{<@var{type}>} construction in a declaration for the terminal symbol. All kinds of token declarations allow @code{<@var{type}>}. +@node Destructor Decl +@subsection Freeing Discarded Symbols +@cindex freeing discarded symbols +@findex %destructor + +Some symbols can be discarded by the parser, typically during error +recovery (@pxref{Error Recovery}). 
Basically, during error recovery, +embarrassing symbols already pushed on the stack, and embarrassing +tokens coming from the rest of the file are thrown away until the parser +falls on its feet. If these symbols convey heap-based information, this +memory is lost. While this behavior is tolerable for batch parsers, +such as in compilers, it is unacceptable for parsers that can +possibly ``never end'' such as shells, or implementations of +communication protocols. + +The @code{%destructor} directive allows for the definition of code that +is called when a symbol is thrown away. + +@deffn {Directive} %destructor @{ @var{code} @} @var{symbols} +@findex %destructor +Declare that the @var{code} must be invoked for each of the +@var{symbols} that will be discarded by the parser. The @var{code} +should use @code{$$} to designate the semantic value associated to the +@var{symbols}. The additional parser parameters are also available +(@pxref{Parser Function, , The Parser Function @code{yyparse}}). + +@strong{Warning:} as of Bison 1.875, this feature is still considered as +experimental, as there has not been enough user feedback. In particular, +the syntax might still change. +@end deffn + +For instance: + +@smallexample +%union +@{ + char *string; +@} +%token STRING +%type string +%destructor @{ free ($$); @} STRING string +@end smallexample + +@noindent +guarantees that when a @code{STRING} or a @code{string} is discarded, +its associated memory will be freed. + +Note that in the future, Bison might also consider that right hand side +members that are not mentioned in the action can be destroyed. For +instance, in: + +@smallexample +comment: "/*" STRING "*/"; +@end smallexample + +@noindent +the parser is entitled to destroy the semantic value of the +@code{string}. Of course, this will not apply to the default action; +compare: + +@smallexample +typeless: string; // $$ = $1 does not apply; $1 is destroyed. +typefull: string; // $$ = $1 applies, $1 is not destroyed. 
+@end smallexample + @node Expect Decl @subsection Suppressing Conflict Warnings @cindex suppressing conflict warnings @@ -3466,53 +3589,60 @@ valid grammar. Here is a summary of the declarations used to define a grammar: -@table @code -@item %union +@deffn {Directive} %union Declare the collection of data types that semantic values may have (@pxref{Union Decl, ,The Collection of Value Types}). +@end deffn -@item %token +@deffn {Directive} %token Declare a terminal symbol (token type name) with no precedence or associativity specified (@pxref{Token Decl, ,Token Type Names}). +@end deffn -@item %right +@deffn {Directive} %right Declare a terminal symbol (token type name) that is right-associative (@pxref{Precedence Decl, ,Operator Precedence}). +@end deffn -@item %left +@deffn {Directive} %left Declare a terminal symbol (token type name) that is left-associative (@pxref{Precedence Decl, ,Operator Precedence}). +@end deffn -@item %nonassoc +@deffn {Directive} %nonassoc Declare a terminal symbol (token type name) that is nonassociative (using it in a way that would be associative is a syntax error) +@end deffn (@pxref{Precedence Decl, ,Operator Precedence}). -@item %type +@deffn {Directive} %type Declare the type of semantic values for a nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}). +@end deffn -@item %start +@deffn {Directive} %start Specify the grammar's start symbol (@pxref{Start Decl, ,The Start-Symbol}). +@end deffn -@item %expect +@deffn {Directive} %expect Declare the expected number of shift-reduce conflicts (@pxref{Expect Decl, ,Suppressing Conflict Warnings}). -@end table +@end deffn + @sp 1 @noindent In order to change the behavior of @command{bison}, use the following directives: -@table @code -@item %debug +@deffn {Directive} %debug In the parser file, define the macro @code{YYDEBUG} to 1 if it is not already defined, so that the debugging facilities are compiled. +@end deffn @xref{Tracing, ,Tracing Your Parser}. 
-@item %defines +@deffn {Directive} %defines Write an extra output file containing macro definitions for the token type names defined in the grammar and the semantic value type @code{YYSTYPE}, as well as a few @code{extern} variable declarations. @@ -3524,36 +3654,38 @@ This output file is essential if you wish to put the definition of @code{yylex} in a separate source file, because @code{yylex} needs to be able to refer to token type codes and the variable @code{yylval}. @xref{Token Values, ,Semantic Values of Tokens}. +@end deffn -@item %file-prefix="@var{prefix}" +@deffn {Directive} %destructor +Specifying how the parser should reclaim the memory associated to +discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}. +@end deffn + +@deffn {Directive} %file-prefix="@var{prefix}" Specify a prefix to use for all Bison output file names. The names are chosen as if the input file were named @file{@var{prefix}.y}. +@end deffn -@c @item %header-extension -@c Specify the extension of the parser header file generated when -@c @code{%define} or @samp{-d} are used. -@c -@c For example, a grammar file named @file{foo.ypp} and containing a -@c @code{%header-extension .hh} directive will produce a header file -@c named @file{foo.tab.hh} - -@item %locations +@deffn {Directive} %locations Generate the code processing the locations (@pxref{Action Features, ,Special Features for Use in Actions}). This mode is enabled as soon as the grammar uses the special @samp{@@@var{n}} tokens, but if your grammar does not use it, using @samp{%locations} allows for more -accurate parse error messages. +accurate syntax error messages. +@end deffn -@item %name-prefix="@var{prefix}" +@deffn {Directive} %name-prefix="@var{prefix}" Rename the external symbols used in the parser so that they start with @var{prefix} instead of @samp{yy}. 
The precise list of symbols renamed is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs}, -@code{yylval}, @code{yychar}, @code{yydebug}, and possible -@code{yylloc}. For example, if you use @samp{%name-prefix="c_"}, the -names become @code{c_parse}, @code{c_lex}, and so on. @xref{Multiple -Parsers, ,Multiple Parsers in the Same Program}. - -@item %no-parser +@code{yylval}, @code{yychar}, @code{yydebug}, and +possibly @code{yylloc}. For example, if you use +@samp{%name-prefix="c_"}, the names become @code{c_parse}, @code{c_lex}, +and so on. @xref{Multiple Parsers, ,Multiple Parsers in the Same +Program}. +@end deffn + +@deffn {Directive} %no-parser Do not include any C code in the parser file; generate tables only. The parser file contains just @code{#define} directives and static variable declarations. @@ -3561,30 +3693,27 @@ declarations. This option also tells Bison to write the C code for the grammar actions into a file named @file{@var{filename}.act}, in the form of a brace-surrounded body fit for a @code{switch} statement. +@end deffn -@item %no-lines +@deffn {Directive} %no-lines Don't generate any @code{#line} preprocessor commands in the parser file. Ordinarily Bison writes these commands in the parser file so that the C compiler and debuggers will associate errors and object code with your source file (the grammar file). This directive causes them to associate errors with the parser file, treating it an independent source file in its own right. +@end deffn -@item %output="@var{filename}" +@deffn {Directive} %output="@var{filename}" Specify the @var{filename} for the parser file. +@end deffn -@item %pure-parser +@deffn {Directive} %pure-parser Request a pure (reentrant) parser program (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). +@end deffn -@c @item %source-extension -@c Specify the extension of the parser output file. 
-@c -@c For example, a grammar file named @file{foo.yy} and containing a -@c @code{%source-extension .cpp} directive will produce a parser file -@c named @file{foo.tab.cpp} - -@item %token-table +@deffn {Directive} %token-table Generate an array of token names in the parser file. The name of the array is @code{yytname}; @code{yytname[@var{i}]} is the name of the token whose internal Bison token code number is @var{i}. The first @@ -3616,21 +3745,19 @@ The number of grammar rules, @item YYNSTATES The number of parser states (@pxref{Parser States}). @end table +@end deffn -@item %verbose +@deffn {Directive} %verbose Write an extra output file containing verbose descriptions of the parser states and what is done for each type of look-ahead token in that state. @xref{Understanding, , Understanding Your Parser}, for more information. +@end deffn - - -@item %yacc +@deffn {Directive} %yacc Pretend the option @option{--yacc} was given, i.e., imitate Yacc, including its naming conventions. @xref{Bison Options}, for more. -@end table - - +@end deffn @node Multiple Parsers @@ -3648,9 +3775,9 @@ instead of @samp{yy}. You can use this to give each parser distinct names that do not conflict. The precise list of symbols renamed is @code{yyparse}, @code{yylex}, -@code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yychar} and -@code{yydebug}. For example, if you use @samp{-p c}, the names become -@code{cparse}, @code{clex}, and so on. +@code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yylloc}, +@code{yychar} and @code{yydebug}. For example, if you use @samp{-p c}, +the names become @code{cparse}, @code{clex}, and so on. @strong{All the other variables and macros associated with Bison are not renamed.} These others are not global; there is no conflict if the same @@ -3695,23 +3822,66 @@ encounters end-of-input or an unrecoverable syntax error. You can also write an action which directs @code{yyparse} to return immediately without reading further. 
+ +@deftypefun int yyparse (void) The value returned by @code{yyparse} is 0 if parsing was successful (return is due to end-of-input). The value is 1 if parsing failed (return is due to a syntax error). +@end deftypefun In an action, you can cause immediate return from @code{yyparse} by using these macros: -@table @code -@item YYACCEPT +@defmac YYACCEPT @findex YYACCEPT Return immediately with value 0 (to report success). +@end defmac -@item YYABORT +@defmac YYABORT @findex YYABORT Return immediately with value 1 (to report failure). -@end table +@end defmac + +If you use a reentrant parser, you can optionally pass additional +parameter information to it in a reentrant way. To do so, use the +declaration @code{%parse-param}: + +@deffn {Directive} %parse-param @{@var{argument-declaration}@} +@findex %parse-param +Declare that an argument declared by @code{argument-declaration} is an +additional @code{yyparse} argument. This argument is also passed to +@code{yyerror}. The @var{argument-declaration} is used when declaring +functions or prototypes. The last identifier in +@var{argument-declaration} must be the argument name. +@end deffn + +Here's an example. Write this in the parser: + +@example +%parse-param @{int *nastiness@} +%parse-param @{int *randomness@} +@end example + +@noindent +Then call the parser like this: + +@example +@{ + int nastiness, randomness; + @dots{} /* @r{Store proper data in @code{nastiness} and @code{randomness}.} */ + value = yyparse (&nastiness, &randomness); + @dots{} +@} +@end example + +@noindent +In the grammar actions, use expressions like this to refer to the data: + +@example +exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @} +@end example + @node Lexical @section The Lexical Analyzer Function @code{yylex} @@ -3916,85 +4086,47 @@ textual positions, then the type @code{YYLTYPE} will not be defined. In this case, omit the second argument; @code{yylex} will be called with only one argument. 
-@vindex YYPARSE_PARAM -If you use a reentrant parser, you can optionally pass additional -parameter information to it in a reentrant way. To do so, define the -macro @code{YYPARSE_PARAM} as a variable name. This modifies the -@code{yyparse} function to accept one argument, of type @code{void *}, -with that name. -When you call @code{yyparse}, pass the address of an object, casting the -address to @code{void *}. The grammar actions can refer to the contents -of the object by casting the pointer value back to its proper type and -then dereferencing it. Here's an example. Write this in the parser: +If you wish to pass the additional parameter data to @code{yylex}, use +@code{%lex-param} just like @code{%parse-param} (@pxref{Parser +Function}). -@example -%@{ -struct parser_control -@{ - int nastiness; - int randomness; -@}; +@deffn {Directive} %lex-param @{@var{argument-declaration}@} +@findex %lex-param +Declare that @var{argument-declaration} is an additional @code{yylex} +argument declaration. 
+@end deffn -#define YYPARSE_PARAM parm -%@} -@end example - -@noindent -Then call the parser like this: +For instance: @example -struct parser_control -@{ - int nastiness; - int randomness; -@}; - -@dots{} - -@{ - struct parser_control foo; - @dots{} /* @r{Store proper data in @code{foo}.} */ - value = yyparse ((void *) &foo); - @dots{} -@} +%parse-param @{int *nastiness@} +%lex-param @{int *nastiness@} +%parse-param @{int *randomness@} @end example @noindent -In the grammar actions, use expressions like this to refer to the data: +results in the following signature: @example -((struct parser_control *) parm)->randomness +int yylex (int *nastiness); +int yyparse (int *nastiness, int *randomness); @end example -@vindex YYLEX_PARAM -If you wish to pass the additional parameter data to @code{yylex}, -define the macro @code{YYLEX_PARAM} just like @code{YYPARSE_PARAM}, as -shown here: +If @code{%pure-parser} is added: @example -%@{ -struct parser_control -@{ - int nastiness; - int randomness; -@}; - -#define YYPARSE_PARAM parm -#define YYLEX_PARAM parm -%@} +int yylex (YYSTYPE *lvalp, int *nastiness); +int yyparse (int *nastiness, int *randomness); @end example -You should then define @code{yylex} to accept one additional -argument---the value of @code{parm}. (This makes either two or three -arguments in total, depending on whether an argument of type -@code{YYLTYPE} is passed.) You can declare the argument as a pointer to -the proper object type, or you can declare it as @code{void *} and -access the contents as shown above. +@noindent +and finally, if both @code{%pure-parser} and @code{%locations} are used: -You can use @samp{%pure-parser} to request a reentrant parser without -also using @code{YYPARSE_PARAM}. Then you should call @code{yyparse} -with no arguments, as usual. 
+@example +int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness); +int yyparse (int *nastiness, int *randomness); +@end example @node Error Reporting @section The Error Reporting Function @code{yyerror} @@ -4003,7 +4135,7 @@ with no arguments, as usual. @cindex parse error @cindex syntax error -The Bison parser detects a @dfn{parse error} or @dfn{syntax error} +The Bison parser detects a @dfn{syntax error} or @dfn{parse error} whenever it reads a token which cannot satisfy any syntax rule. An action in the grammar can also explicitly proclaim an error, using the macro @code{YYERROR} (@pxref{Action Features, ,Special Features for Use @@ -4012,16 +4144,14 @@ in Actions}). The Bison parser expects to report the error by calling an error reporting function named @code{yyerror}, which you must supply. It is called by @code{yyparse} whenever a syntax error is found, and it -receives one argument. For a parse error, the string is normally -@w{@code{"parse error"}}. +receives one argument. For a syntax error, the string is normally +@w{@code{"syntax error"}}. -@findex YYERROR_VERBOSE -If you define the macro @code{YYERROR_VERBOSE} in the Bison declarations -section (@pxref{Bison Declarations, ,The Bison Declarations Section}), -then Bison provides a more verbose and specific error message string -instead of just plain @w{@code{"parse error"}}. It doesn't matter what -definition you use for @code{YYERROR_VERBOSE}, just whether you define -it. +@findex %error-verbose +If you invoke the directive @code{%error-verbose} in the Bison +declarations section (@pxref{Bison Declarations, ,The Bison Declarations +Section}), then Bison provides a more verbose and specific error message +string instead of just plain @w{@code{"syntax error"}}. The parser can detect one other kind of error: stack overflow. 
This happens when the input contains constructions that are very deeply @@ -4036,7 +4166,7 @@ The following definition suffices in simple programs: @example @group void -yyerror (char *s) +yyerror (const char *s) @{ @end group @group @@ -4050,6 +4180,57 @@ error recovery if you have written suitable error recovery grammar rules (@pxref{Error Recovery}). If recovery is impossible, @code{yyparse} will immediately return 1. +Obviously, in location tracking pure parsers, @code{yyerror} should have +access to the current location. This is indeed the case for the GLR +parsers, but not for the Yacc parser, for historical reasons. I.e., if +@samp{%locations %pure-parser} is passed, then the prototypes for +@code{yyerror} are: + +@example +void yyerror (const char *msg); /* Yacc parsers. */ +void yyerror (YYLTYPE *locp, const char *msg); /* GLR parsers. */ +@end example + +If @samp{%parse-param @{int *nastiness@}} is used, then: + +@example +void yyerror (int *nastiness, const char *msg); /* Yacc parsers. */ +void yyerror (int *nastiness, const char *msg); /* GLR parsers. */ +@end example + +Finally, GLR and Yacc parsers share the same @code{yyerror} calling +convention for absolutely pure parsers, i.e., when the calling +convention of @code{yylex} @emph{and} the calling convention of +@code{%pure-parser} are pure. I.e.: + +@example +/* Location tracking. */ +%locations +/* Pure yylex. */ +%pure-parser +%lex-param @{int *nastiness@} +/* Pure yyparse.
*/ +%parse-param @{int *nastiness@} +%parse-param @{int *randomness@} +@end example + +@noindent +results in the following signatures for all the parser kinds: + +@example +int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness); +int yyparse (int *nastiness, int *randomness); +void yyerror (YYLTYPE *locp, + int *nastiness, int *randomness, + const char *msg); +@end example + +@noindent +Please note that the prototypes are only indications of how the code +produced by Bison will use @code{yyerror}; you are still free to choose the +return value, and even to make @code{yyerror} a variadic function. It +is precisely to enable this that the message is always passed last. + @vindex yynerrs The variable @code{yynerrs} contains the number of syntax errors encountered so far. Normally this variable is global; but if you @@ -4064,34 +4245,39 @@ then it is a local variable which only the actions can access. Here is a table of Bison constructs, variables and macros that are useful in actions. -@table @samp -@item $$ +@deffn {Variable} $$ Acts like a variable that contains the semantic value for the grouping made by the current rule. @xref{Actions}. +@end deffn -@item $@var{n} +@deffn {Variable} $@var{n} Acts like a variable that contains the semantic value for the @var{n}th component of the current rule. @xref{Actions}. +@end deffn -@item $<@var{typealt}>$ +@deffn {Variable} $<@var{typealt}>$ Like @code{$$} but specifies alternative @var{typealt} in the union specified by the @code{%union} declaration. @xref{Action Types, ,Data Types of Values in Actions}. +@end deffn -@item $<@var{typealt}>@var{n} +@deffn {Variable} $<@var{typealt}>@var{n} Like @code{$@var{n}} but specifies alternative @var{typealt} in the union specified by the @code{%union} declaration. @xref{Action Types, ,Data Types of Values in Actions}. +@end deffn -@item YYABORT; +@deffn {Macro} YYABORT; Return immediately from @code{yyparse}, indicating failure.
@xref{Parser Function, ,The Parser Function @code{yyparse}}. +@end deffn -@item YYACCEPT; +@deffn {Macro} YYACCEPT; Return immediately from @code{yyparse}, indicating success. @xref{Parser Function, ,The Parser Function @code{yyparse}}. +@end deffn -@item YYBACKUP (@var{token}, @var{value}); +@deffn {Macro} YYBACKUP (@var{token}, @var{value}); @findex YYBACKUP Unshift a token. This macro is allowed only for rules that reduce a single value, and only when there is no look-ahead token. @@ -4106,40 +4292,47 @@ a message @samp{cannot back up} and performs ordinary error recovery. In either case, the rest of the action is not executed. +@end deffn -@item YYEMPTY +@deffn {Macro} YYEMPTY @vindex YYEMPTY Value stored in @code{yychar} when there is no look-ahead token. +@end deffn -@item YYERROR; +@deffn {Macro} YYERROR; @findex YYERROR Cause an immediate syntax error. This statement initiates error recovery just as if the parser itself had detected an error; however, it does not call @code{yyerror}, and does not print any message. If you want to print an error message, call @code{yyerror} explicitly before the @samp{YYERROR;} statement. @xref{Error Recovery}. +@end deffn -@item YYRECOVERING +@deffn {Macro} YYRECOVERING This macro stands for an expression that has the value 1 when the parser is recovering from a syntax error, and 0 the rest of the time. @xref{Error Recovery}. +@end deffn -@item yychar +@deffn {Variable} yychar Variable containing the current look-ahead token. (In a pure parser, this is actually a local variable within @code{yyparse}.) When there is no look-ahead token, the value @code{YYEMPTY} is stored in the variable. @xref{Look-Ahead, ,Look-Ahead Tokens}. +@end deffn -@item yyclearin; +@deffn {Macro} yyclearin; Discard the current look-ahead token. This is useful primarily in error rules. @xref{Error Recovery}. +@end deffn -@item yyerrok; +@deffn {Macro} yyerrok; Resume generating error messages immediately for subsequent syntax errors. 
This is useful primarily in error rules. @xref{Error Recovery}. +@end deffn -@item @@$ +@deffn {Value} @@$ @findex @@$ Acts like a structure variable containing information on the textual position of the grouping made by the current rule. @xref{Locations, , @@ -4163,14 +4356,15 @@ Tracking Locations}. @c those members. @c The use of this feature makes the parser noticeably slower. +@end deffn -@item @@@var{n} +@deffn {Value} @@@var{n} @findex @@@var{n} Acts like a structure variable containing information on the textual position of the @var{n}th component of the current rule. @xref{Locations, , Tracking Locations}. +@end deffn -@end table @node Algorithm @chapter The Bison Parser Algorithm @@ -4963,7 +5157,7 @@ provided which addresses this issue. @cindex error recovery @cindex recovery from errors -It is not usually acceptable to have a program terminate on a parse +It is not usually acceptable to have a program terminate on a syntax error. For example, a compiler should recover sufficiently to parse the rest of the input file and check it for errors; a calculator should accept another expression. @@ -5005,15 +5199,17 @@ will be tokens to read before the next newline. So the rule is not applicable in the ordinary way. But Bison can force the situation to fit the rule, by discarding part of -the semantic context and part of the input. First it discards states and -objects from the stack until it gets back to a state in which the +the semantic context and part of the input. First it discards states +and objects from the stack until it gets back to a state in which the @code{error} token is acceptable. (This means that the subexpressions -already parsed are discarded, back to the last complete @code{stmnts}.) At -this point the @code{error} token can be shifted. Then, if the old +already parsed are discarded, back to the last complete @code{stmnts}.) +At this point the @code{error} token can be shifted. 
Then, if the old look-ahead token is not acceptable to be shifted next, the parser reads tokens and discards them until it finds a token which is acceptable. In -this example, Bison reads and discards input until the next newline -so that the fourth rule can apply. +this example, Bison reads and discards input until the next newline so +that the fourth rule can apply. Note that discarded symbols are +possible sources of memory leaks, see @ref{Destructor Decl, , Freeing +Discarded Symbols}, for a means to reclaim this memory. The choice of error rules in the grammar is a choice of strategies for error recovery. A simple and useful strategy is simply to skip the rest of @@ -5064,7 +5260,7 @@ this is unacceptable, then the macro @code{yyclearin} may be used to clear this token. Write the statement @samp{yyclearin;} in the error rule's action. -For example, suppose that on a parse error, an error handling routine is +For example, suppose that on a syntax error, an error handling routine is called that advances the input stream to some point where parsing should once again commence. The next symbol returned by the lexical scanner is probably correct. The previous look-ahead token ought to be discarded @@ -5439,9 +5635,9 @@ state 0 $accept -> . exp $ (rule 0) - NUM shift, and go to state 1 + NUM shift, and go to state 1 - exp go to state 2 + exp go to state 2 @end example This reads as follows: ``state 0 corresponds to being at the very @@ -5451,7 +5647,7 @@ after having reduced a rule that produced an @code{exp}, the control flow jumps to state 2. If there is no such transition on a nonterminal symbol, and the lookahead is a @code{NUM}, then this token is shifted on the parse stack, and the control flow jumps to state 1. Any other -lookahead triggers a parse error.'' +lookahead triggers a syntax error.'' @cindex core, item set @cindex item set core @@ -5488,7 +5684,7 @@ state 1 exp -> NUM . 
(rule 5) - $default reduce using rule 5 (exp) + $default reduce using rule 5 (exp) @end example @noindent @@ -5506,11 +5702,11 @@ state 2 exp -> exp . '*' exp (rule 3) exp -> exp . '/' exp (rule 4) - $ shift, and go to state 3 - '+' shift, and go to state 4 - '-' shift, and go to state 5 - '*' shift, and go to state 6 - '/' shift, and go to state 7 + $ shift, and go to state 3 + '+' shift, and go to state 4 + '-' shift, and go to state 5 + '*' shift, and go to state 6 + '/' shift, and go to state 7 @end example @noindent @@ -5519,7 +5715,7 @@ because of the item @samp{exp -> exp . '+' exp}, if the lookahead if @samp{+}, it will be shifted on the parse stack, and the automaton control will jump to state 4, corresponding to the item @samp{exp -> exp '+' . exp}. Since there is no default action, any other token than -those listed above will trigger a parse error. +those listed above will trigger a syntax error. The state 3 is named the @dfn{final state}, or the @dfn{accepting state}: @@ -5529,7 +5725,7 @@ state 3 $accept -> exp $ . (rule 0) - $default accept + $default accept @end example @noindent @@ -5544,33 +5740,33 @@ state 4 exp -> exp '+' . exp (rule 1) - NUM shift, and go to state 1 + NUM shift, and go to state 1 - exp go to state 8 + exp go to state 8 state 5 exp -> exp '-' . exp (rule 2) - NUM shift, and go to state 1 + NUM shift, and go to state 1 - exp go to state 9 + exp go to state 9 state 6 exp -> exp '*' . exp (rule 3) - NUM shift, and go to state 1 + NUM shift, and go to state 1 - exp go to state 10 + exp go to state 10 state 7 exp -> exp '/' . exp (rule 4) - NUM shift, and go to state 1 + NUM shift, and go to state 1 - exp go to state 11 + exp go to state 11 @end example As was announced in beginning of the report, @samp{State 8 contains 1 @@ -5585,11 +5781,11 @@ state 8 exp -> exp . '*' exp (rule 3) exp -> exp . 
'/' exp (rule 4) - '*' shift, and go to state 6 - '/' shift, and go to state 7 + '*' shift, and go to state 6 + '/' shift, and go to state 7 - '/' [reduce using rule 1 (exp)] - $default reduce using rule 1 (exp) + '/' [reduce using rule 1 (exp)] + $default reduce using rule 1 (exp) @end example Indeed, there are two actions associated to the lookahead @samp{/}: @@ -5646,11 +5842,11 @@ state 9 exp -> exp . '*' exp (rule 3) exp -> exp . '/' exp (rule 4) - '*' shift, and go to state 6 - '/' shift, and go to state 7 + '*' shift, and go to state 6 + '/' shift, and go to state 7 - '/' [reduce using rule 2 (exp)] - $default reduce using rule 2 (exp) + '/' [reduce using rule 2 (exp)] + $default reduce using rule 2 (exp) state 10 @@ -5660,10 +5856,10 @@ state 10 exp -> exp '*' exp . (rule 3) exp -> exp . '/' exp (rule 4) - '/' shift, and go to state 7 + '/' shift, and go to state 7 - '/' [reduce using rule 3 (exp)] - $default reduce using rule 3 (exp) + '/' [reduce using rule 3 (exp)] + $default reduce using rule 3 (exp) state 11 @@ -5673,16 +5869,16 @@ state 11 exp -> exp . '/' exp (rule 4) exp -> exp '/' exp . 
(rule 4) - '+' shift, and go to state 4 - '-' shift, and go to state 5 - '*' shift, and go to state 6 - '/' shift, and go to state 7 + '+' shift, and go to state 4 + '-' shift, and go to state 5 + '*' shift, and go to state 6 + '/' shift, and go to state 7 - '+' [reduce using rule 4 (exp)] - '-' [reduce using rule 4 (exp)] - '*' [reduce using rule 4 (exp)] - '/' [reduce using rule 4 (exp)] - $default reduce using rule 4 (exp) + '+' [reduce using rule 4 (exp)] + '-' [reduce using rule 4 (exp)] + '*' [reduce using rule 4 (exp)] + '/' [reduce using rule 4 (exp)] + $default reduce using rule 4 (exp) @end example @noindent @@ -5785,15 +5981,15 @@ Here is an example of @code{YYPRINT} suitable for the multi-function calculator (@pxref{Mfcalc Decl, ,Declarations for @code{mfcalc}}): @smallexample -#define YYPRINT(file, type, value) yyprint (file, type, value) +#define YYPRINT(file, type, value) print_token_value (file, type, value) static void -yyprint (FILE *file, int type, YYSTYPE value) +print_token_value (FILE *file, int type, YYSTYPE value) @{ if (type == VAR) - fprintf (file, " %s", value.tptr->name); + fprintf (file, "%s", value.tptr->name); else if (type == NUM) - fprintf (file, " %d", value.val); + fprintf (file, "%d", value.val); @} @end smallexample @@ -5841,7 +6037,6 @@ will produce @file{output.c++} and @file{outfile.h++}. * Bison Options:: All the options described in detail, in alphabetical order by short options. * Option Cross Key:: Alphabetical list of long options. -* VMS Invocation:: Bison command syntax on @acronym{VMS}. @end menu @node Bison Options @@ -6033,34 +6228,6 @@ the corresponding short option. @end example @end ifinfo -@node VMS Invocation -@section Invoking Bison under @acronym{VMS} -@cindex invoking Bison under @acronym{VMS} -@cindex @acronym{VMS} - -The command line syntax for Bison on @acronym{VMS} is a variant of the usual -Bison command syntax---adapted to fit @acronym{VMS} conventions. 
- -To find the @acronym{VMS} equivalent for any Bison option, start with the long -option, and substitute a @samp{/} for the leading @samp{--}, and -substitute a @samp{_} for each @samp{-} in the name of the long option. -For example, the following invocation under @acronym{VMS}: - -@example -bison /debug/name_prefix=bar foo.y -@end example - -@noindent -is equivalent to the following command under @acronym{POSIX}. - -@example -bison --debug --name-prefix=bar foo.y -@end example - -The @acronym{VMS} file system does not permit filenames such as -@file{foo.tab.c}. In the above example, the output file -would instead be named @file{foo_tab.c}. - @c ================================================= Invoking Bison @node FAQ @@ -6093,284 +6260,358 @@ This question is already addressed elsewhere, @xref{Recursion, @cindex Bison symbols, table of @cindex symbols in Bison, table of -@table @code -@item @@$ +@deffn {Variable} @@$ In an action, the location of the left-hand side of the rule. @xref{Locations, , Locations Overview}. +@end deffn -@item @@@var{n} +@deffn {Variable} @@@var{n} In an action, the location of the @var{n}-th symbol of the right-hand side of the rule. @xref{Locations, , Locations Overview}. +@end deffn -@item $$ +@deffn {Variable} $$ In an action, the semantic value of the left-hand side of the rule. @xref{Actions}. +@end deffn -@item $@var{n} +@deffn {Variable} $@var{n} In an action, the semantic value of the @var{n}-th symbol of the right-hand side of the rule. @xref{Actions}. +@end deffn -@item $accept +@deffn {Symbol} $accept The predefined nonterminal whose only rule is @samp{$accept: @var{start} $end}, where @var{start} is the start symbol. @xref{Start Decl, , The Start-Symbol}. It cannot be used in the grammar. +@end deffn -@item $end +@deffn {Symbol} $end The predefined token marking the end of the token stream. It cannot be used in the grammar. 
+@end deffn -@item $undefined +@deffn {Symbol} $undefined The predefined token onto which all undefined values returned by @code{yylex} are mapped. It cannot be used in the grammar, rather, use @code{error}. +@end deffn -@item error +@deffn {Symbol} error A token name reserved for error recovery. This token may be used in grammar rules so as to allow the Bison parser to recognize an error in the grammar without halting the process. In effect, a sentence -containing an error may be recognized as valid. On a parse error, the +containing an error may be recognized as valid. On a syntax error, the token @code{error} becomes the current look-ahead token. Actions corresponding to @code{error} are then executed, and the look-ahead token is reset to the token that originally caused the violation. @xref{Error Recovery}. +@end deffn -@item YYABORT +@deffn {Macro} YYABORT Macro to pretend that an unrecoverable syntax error has occurred, by making @code{yyparse} return 1 immediately. The error reporting function @code{yyerror} is not called. @xref{Parser Function, ,The Parser Function @code{yyparse}}. +@end deffn -@item YYACCEPT +@deffn {Macro} YYACCEPT Macro to pretend that a complete utterance of the language has been read, by making @code{yyparse} return 0 immediately. @xref{Parser Function, ,The Parser Function @code{yyparse}}. +@end deffn -@item YYBACKUP +@deffn {Macro} YYBACKUP Macro to discard a value from the parser stack and fake a look-ahead token. @xref{Action Features, ,Special Features for Use in Actions}. +@end deffn -@item YYDEBUG +@deffn {Macro} YYDEBUG Macro to define to equip the parser with tracing code. @xref{Tracing, ,Tracing Your Parser}. +@end deffn -@item YYERROR +@deffn {Macro} YYERROR Macro to pretend that a syntax error has just been detected: call @code{yyerror} and then perform normal error recovery if possible (@pxref{Error Recovery}), or (if recovery is impossible) make @code{yyparse} return 1. @xref{Error Recovery}. 
+@end deffn -@item YYERROR_VERBOSE -Macro that you define with @code{#define} in the Bison declarations -section to request verbose, specific error message strings when -@code{yyerror} is called. +@deffn {Macro} YYERROR_VERBOSE +An obsolete macro that you define with @code{#define} in the Bison +declarations section to request verbose, specific error message strings +when @code{yyerror} is called. It doesn't matter what definition you +use for @code{YYERROR_VERBOSE}, just whether you define it. Using +@code{%error-verbose} is preferred. +@end deffn -@item YYINITDEPTH +@deffn {Macro} YYINITDEPTH Macro for specifying the initial size of the parser stack. @xref{Stack Overflow}. +@end deffn -@item YYLEX_PARAM -Macro for specifying an extra argument (or list of extra arguments) for -@code{yyparse} to pass to @code{yylex}. @xref{Pure Calling,, Calling -Conventions for Pure Parsers}. +@deffn {Macro} YYLEX_PARAM +An obsolete macro for specifying an extra argument (or list of extra +arguments) for @code{yyparse} to pass to @code{yylex}. The use of this +macro is deprecated, and is supported only for Yacc-like parsers. +@xref{Pure Calling,, Calling Conventions for Pure Parsers}. +@end deffn -@item YYLTYPE +@deffn {Macro} YYLTYPE Macro for the data type of @code{yylloc}; a structure with four members. @xref{Location Type, , Data Types of Locations}. +@end deffn -@item yyltype +@deffn {Type} yyltype Default value for YYLTYPE. +@end deffn -@item YYMAXDEPTH -Macro for specifying the maximum size of the parser stack. -@xref{Stack Overflow}. +@deffn {Macro} YYMAXDEPTH +Macro for specifying the maximum size of the parser stack. @xref{Stack +Overflow}. +@end deffn -@item YYPARSE_PARAM -Macro for specifying the name of a parameter that @code{yyparse} should -accept. @xref{Pure Calling,, Calling Conventions for Pure Parsers}. +@deffn {Macro} YYPARSE_PARAM +An obsolete macro for specifying the name of a parameter that +@code{yyparse} should accept.
The use of this macro is deprecated, and +is supported only for Yacc-like parsers. @xref{Pure Calling,, Calling +Conventions for Pure Parsers}. +@end deffn -@item YYRECOVERING +@deffn {Macro} YYRECOVERING Macro whose value indicates whether the parser is recovering from a syntax error. @xref{Action Features, ,Special Features for Use in Actions}. +@end deffn -@item YYSTACK_USE_ALLOCA +@deffn {Macro} YYSTACK_USE_ALLOCA Macro used to control the use of @code{alloca}. If defined to @samp{0}, the parser will not use @code{alloca} but @code{malloc} when trying to grow its internal stacks. Do @emph{not} define @code{YYSTACK_USE_ALLOCA} to anything else. +@end deffn -@item YYSTYPE +@deffn {Macro} YYSTYPE Macro for the data type of semantic values; @code{int} by default. @xref{Value Type, ,Data Types of Semantic Values}. +@end deffn -@item yychar +@deffn {Variable} yychar External integer variable that contains the integer value of the current look-ahead token. (In a pure parser, it is a local variable within @code{yyparse}.) Error-recovery rule actions may examine this variable. @xref{Action Features, ,Special Features for Use in Actions}. +@end deffn -@item yyclearin +@deffn {Macro} yyclearin Macro used in error-recovery rule actions. It clears the previous look-ahead token. @xref{Error Recovery}. +@end deffn -@item yydebug +@deffn {Variable} yydebug External integer variable set to zero by default. If @code{yydebug} is given a nonzero value, the parser will output information on input symbols and parser action. @xref{Tracing, ,Tracing Your Parser}. +@end deffn -@item yyerrok +@deffn {Macro} yyerrok Macro to cause parser to recover immediately to its normal mode -after a parse error. @xref{Error Recovery}. +after a syntax error. @xref{Error Recovery}. +@end deffn -@item yyerror +@deffn {Function} yyerror User-supplied function to be called by @code{yyparse} on error. The function receives one argument, a pointer to a character string containing an error message.
@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}. +@end deffn -@item yylex +@deffn {Function} yylex User-supplied lexical analyzer function, called with no arguments to get the next token. @xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}. +@end deffn -@item yylval +@deffn {Variable} yylval External variable in which @code{yylex} should place the semantic value associated with a token. (In a pure parser, it is a local variable within @code{yyparse}, and its address is passed to @code{yylex}.) @xref{Token Values, ,Semantic Values of Tokens}. +@end deffn -@item yylloc +@deffn {Variable} yylloc External variable in which @code{yylex} should place the line and column numbers associated with a token. (In a pure parser, it is a local variable within @code{yyparse}, and its address is passed to @code{yylex}.) You can ignore this variable if you don't use the @samp{@@} feature in the grammar actions. @xref{Token Positions, ,Textual Positions of Tokens}. +@end deffn -@item yynerrs -Global variable which Bison increments each time there is a parse error. +@deffn {Variable} yynerrs +Global variable which Bison increments each time there is a syntax error. (In a pure parser, it is a local variable within @code{yyparse}.) @xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}. +@end deffn -@item yyparse +@deffn {Function} yyparse The parser function produced by Bison; call this function to start parsing. @xref{Parser Function, ,The Parser Function @code{yyparse}}. +@end deffn -@item %debug +@deffn {Directive} %debug Equip the parser for debugging. @xref{Decl Summary}. +@end deffn -@item %defines +@deffn {Directive} %defines Bison declaration to create a header file meant for the scanner. @xref{Decl Summary}. +@end deffn + +@deffn {Directive} %destructor +Specify how the parser should reclaim the memory associated with +discarded symbols. @xref{Destructor Decl, , Freeing Discarded Symbols}.
+@end deffn -@item %dprec +@deffn {Directive} %dprec Bison declaration to assign a precedence to a rule that is used at parse time to resolve reduce/reduce conflicts. @xref{GLR Parsers, ,Writing @acronym{GLR} Parsers}. +@end deffn -@item %file-prefix="@var{prefix}" +@deffn {Directive} %error-verbose +Bison declaration to request verbose, specific error message strings +when @code{yyerror} is called. +@end deffn + +@deffn {Directive} %file-prefix="@var{prefix}" Bison declaration to set the prefix of the output files. @xref{Decl Summary}. +@end deffn -@item %glr-parser +@deffn {Directive} %glr-parser Bison declaration to produce a @acronym{GLR} parser. @xref{GLR Parsers, ,Writing @acronym{GLR} Parsers}. +@end deffn -@c @item %source-extension -@c Bison declaration to specify the generated parser output file extension. -@c @xref{Decl Summary}. -@c -@c @item %header-extension -@c Bison declaration to specify the generated parser header file extension -@c if required. @xref{Decl Summary}. - -@item %left +@deffn {Directive} %left Bison declaration to assign left associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. +@end deffn + +@deffn {Directive} %lex-param @{@var{argument-declaration}@} +Bison declaration to specify an additional parameter that +@code{yylex} should accept. @xref{Pure Calling,, Calling Conventions +for Pure Parsers}. +@end deffn -@item %merge +@deffn {Directive} %merge Bison declaration to assign a merging function to a rule. If there is a reduce/reduce conflict with a rule having the same merging function, the function is applied to the two semantic values to get a single result. @xref{GLR Parsers, ,Writing @acronym{GLR} Parsers}. +@end deffn -@item %name-prefix="@var{prefix}" +@deffn {Directive} %name-prefix="@var{prefix}" Bison declaration to rename the external symbols. @xref{Decl Summary}.
+@end deffn -@item %no-lines +@deffn {Directive} %no-lines Bison declaration to avoid generating @code{#line} directives in the parser file. @xref{Decl Summary}. +@end deffn -@item %nonassoc +@deffn {Directive} %nonassoc Bison declaration to assign non-associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. +@end deffn -@item %output="@var{filename}" +@deffn {Directive} %output="@var{filename}" Bison declaration to set the name of the parser file. @xref{Decl Summary}. +@end deffn -@item %prec +@deffn {Directive} %parse-param @{@var{argument-declaration}@} +Bison declaration to specify an additional parameter that +@code{yyparse} should accept. @xref{Parser Function,, The Parser +Function @code{yyparse}}. +@end deffn + +@deffn {Directive} %prec Bison declaration to assign a precedence to a specific rule. @xref{Contextual Precedence, ,Context-Dependent Precedence}. +@end deffn -@item %pure-parser +@deffn {Directive} %pure-parser Bison declaration to request a pure (reentrant) parser. @xref{Pure Decl, ,A Pure (Reentrant) Parser}. +@end deffn -@item %right +@deffn {Directive} %right Bison declaration to assign right associativity to token(s). @xref{Precedence Decl, ,Operator Precedence}. +@end deffn -@item %start +@deffn {Directive} %start Bison declaration to specify the start symbol. @xref{Start Decl, ,The Start-Symbol}. +@end deffn -@item %token +@deffn {Directive} %token Bison declaration to declare token(s) without specifying precedence. @xref{Token Decl, ,Token Type Names}. +@end deffn -@item %token-table +@deffn {Directive} %token-table Bison declaration to include a token name table in the parser file. @xref{Decl Summary}. +@end deffn -@item %type +@deffn {Directive} %type Bison declaration to declare nonterminals. @xref{Type Decl, ,Nonterminal Symbols}. +@end deffn -@item %union +@deffn {Directive} %union Bison declaration to specify several possible data types for semantic values. @xref{Union Decl, ,The Collection of Value Types}.
-@end table +@end deffn @sp 1 These are the punctuation and delimiters used in Bison input: -@table @samp -@item %% +@deffn {Delimiter} %% Delimiter used to separate the grammar rule section from the Bison declarations section or the epilogue. @xref{Grammar Layout, ,The Overall Layout of a Bison Grammar}. +@end deffn -@item %@{ %@} +@c Don't insert spaces, or check the DVI output. +@deffn {Delimiter} %@{@var{code}%@} All code listed between @samp{%@{} and @samp{%@}} is copied directly to the output file uninterpreted. Such code forms the prologue of the input file. @xref{Grammar Outline, ,Outline of a Bison Grammar}. +@end deffn -@item /*@dots{}*/ +@deffn {Construct} /*@dots{}*/ Comment delimiters, as in C. +@end deffn -@item : +@deffn {Delimiter} : Separates a rule's result from its components. @xref{Rules, ,Syntax of Grammar Rules}. +@end deffn -@item ; +@deffn {Delimiter} ; Terminates a rule. @xref{Rules, ,Syntax of Grammar Rules}. +@end deffn -@item | +@deffn {Delimiter} | Separates alternate rules for the same result nonterminal. @xref{Rules, ,Syntax of Grammar Rules}. -@end table +@end deffn @node Glossary @appendix Glossary @@ -6475,10 +6716,6 @@ A grammar symbol standing for a grammatical construct that can be expressed through rules in terms of smaller constructs; in other words, a construct that is not a token. @xref{Symbols}. -@item Parse error -An error encountered during parsing of an input stream due to invalid -syntax. @xref{Error Recovery}. - @item Parser A function that recognizes valid sentences of a language by analyzing the syntax structure of a set of tokens passed to it from a lexical @@ -6531,6 +6768,10 @@ A data structure where symbol names and associated data are stored during parsing to allow for recognition and use of existing information in repeated uses of a symbol. @xref{Multi-function Calc}. +@item Syntax error +An error encountered during parsing of an input stream due to invalid +syntax. @xref{Error Recovery}. 
+ @item Token A basic, grammatically indivisible unit of a language. The symbol that describes a token in the grammar is a terminal symbol.