X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/d6864e191b8f96cac314e60403cddd2218cb7744..6528a9915199761f9c9887d4152e5196a6c13449:/doc/bison.texinfo diff --git a/doc/bison.texinfo b/doc/bison.texinfo index c121dd4e..a5cdf52a 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -536,7 +536,6 @@ lexicography, not grammar.) Here is a simple C function subdivided into tokens: -@ifinfo @example int /* @r{keyword `int'} */ square (int x) /* @r{identifier, open-paren, keyword `int',} @@ -546,16 +545,6 @@ square (int x) /* @r{identifier, open-paren, keyword `int',} @r{identifier, semicolon} */ @} /* @r{close-brace} */ @end example -@end ifinfo -@ifnotinfo -@example -int /* @r{keyword `int'} */ -square (int x) /* @r{identifier, open-paren, keyword `int', identifier, close-paren} */ -@{ /* @r{open-brace} */ - return x * x; /* @r{keyword `return', identifier, asterisk, identifier, semicolon} */ -@} /* @r{close-brace} */ -@end example -@end ifnotinfo The syntactic groupings of C include the expression, the statement, the declaration, and the function definition. These are represented in the @@ -637,8 +626,7 @@ the statement; the naked semicolon, and the colon, are Bison punctuation used in every rule. @example -stmt: RETURN expr ';' - ; +stmt: RETURN expr ';' ; @end example @noindent @@ -710,8 +698,7 @@ For example, here is a rule that says an expression can be the sum of two subexpressions: @example -expr: expr '+' expr @{ $$ = $1 + $3; @} - ; +expr: expr '+' expr @{ $$ = $1 + $3; @} ; @end example @noindent @@ -890,30 +877,32 @@ parses a vastly simplified form of Pascal type declarations. 
%% @group -type_decl : TYPE ID '=' type ';' - ; +type_decl: TYPE ID '=' type ';' ; @end group @group -type : '(' id_list ')' - | expr DOTDOT expr - ; +type: + '(' id_list ')' +| expr DOTDOT expr +; @end group @group -id_list : ID - | id_list ',' ID - ; +id_list: + ID +| id_list ',' ID +; @end group @group -expr : '(' expr ')' - | expr '+' expr - | expr '-' expr - | expr '*' expr - | expr '/' expr - | ID - ; +expr: + '(' expr ')' +| expr '+' expr +| expr '-' expr +| expr '*' expr +| expr '/' expr +| ID +; @end group @end example @@ -993,30 +982,35 @@ Let's consider an example, vastly simplified from a C++ grammar. %% -prog : - | prog stmt @{ printf ("\n"); @} - ; +prog: + /* Nothing. */ +| prog stmt @{ printf ("\n"); @} +; -stmt : expr ';' %dprec 1 - | decl %dprec 2 - ; +stmt: + expr ';' %dprec 1 +| decl %dprec 2 +; -expr : ID @{ printf ("%s ", $$); @} - | TYPENAME '(' expr ')' - @{ printf ("%s ", $1); @} - | expr '+' expr @{ printf ("+ "); @} - | expr '=' expr @{ printf ("= "); @} - ; +expr: + ID @{ printf ("%s ", $$); @} +| TYPENAME '(' expr ')' + @{ printf ("%s ", $1); @} +| expr '+' expr @{ printf ("+ "); @} +| expr '=' expr @{ printf ("= "); @} +; -decl : TYPENAME declarator ';' - @{ printf ("%s ", $1); @} - | TYPENAME declarator '=' expr ';' - @{ printf ("%s ", $1); @} - ; +decl: + TYPENAME declarator ';' + @{ printf ("%s ", $1); @} +| TYPENAME declarator '=' expr ';' + @{ printf ("%s ", $1); @} +; -declarator : ID @{ printf ("\"%s\" ", $1); @} - | '(' declarator ')' - ; +declarator: + ID @{ printf ("\"%s\" ", $1); @} +| '(' declarator ')' +; @end example @noindent @@ -1085,9 +1079,10 @@ other. To do so, you could change the declaration of @code{stmt} as follows: @example -stmt : expr ';' %merge - | decl %merge - ; +stmt: + expr ';' %merge +| decl %merge +; @end example @noindent @@ -1199,8 +1194,9 @@ will suffice. Otherwise, we suggest @example %@{ - #if __STDC_VERSION__ < 199901 && ! defined __GNUC__ && ! 
defined inline - #define inline + #if (__STDC_VERSION__ < 199901 && ! defined __GNUC__ \ + && ! defined inline) + # define inline #endif %@} @end example @@ -1389,11 +1385,11 @@ simple program, all the rest of the program can go here. @cindex simple examples @cindex examples, simple -Now we show and explain three sample programs written using Bison: a +Now we show and explain several sample programs written using Bison: a reverse polish notation calculator, an algebraic (infix) notation -calculator, and a multi-function calculator. All three have been tested -under BSD Unix 4.3; each produces a usable, though limited, interactive -desk-top calculator. +calculator --- later extended to track ``locations'' --- +and a multi-function calculator. All +produce usable, though limited, interactive desk-top calculators. These examples are simple, but Bison grammars for real programming languages are written the same way. You can copy these examples into a @@ -1492,24 +1488,31 @@ type for numeric constants. Here are the grammar rules for the reverse polish notation calculator. @example -input: /* empty */ - | input line +@group +input: + /* empty */ +| input line ; +@end group -line: '\n' - | exp '\n' @{ printf ("\t%.10g\n", $1); @} +@group +line: + '\n' +| exp '\n' @{ printf ("%.10g\n", $1); @} ; +@end group -exp: NUM @{ $$ = $1; @} - | exp exp '+' @{ $$ = $1 + $2; @} - | exp exp '-' @{ $$ = $1 - $2; @} - | exp exp '*' @{ $$ = $1 * $2; @} - | exp exp '/' @{ $$ = $1 / $2; @} - /* Exponentiation */ - | exp exp '^' @{ $$ = pow ($1, $2); @} - /* Unary minus */ - | exp 'n' @{ $$ = -$1; @} +@group +exp: + NUM @{ $$ = $1; @} +| exp exp '+' @{ $$ = $1 + $2; @} +| exp exp '-' @{ $$ = $1 - $2; @} +| exp exp '*' @{ $$ = $1 * $2; @} +| exp exp '/' @{ $$ = $1 / $2; @} +| exp exp '^' @{ $$ = pow ($1, $2); @} /* Exponentiation */ +| exp 'n' @{ $$ = -$1; @} /* Unary minus */ ; +@end group %% @end example @@ -1543,8 +1546,9 @@ rule are referred to as @code{$1}, @code{$2}, and so on. 
Consider the definition of @code{input}: @example -input: /* empty */ - | input line +input: + /* empty */ +| input line ; @end example @@ -1577,8 +1581,9 @@ input tokens; we will arrange for the latter to happen at end-of-input. Now consider the definition of @code{line}: @example -line: '\n' - | exp '\n' @{ printf ("\t%.10g\n", $1); @} +line: + '\n' +| exp '\n' @{ printf ("%.10g\n", $1); @} ; @end example @@ -1605,21 +1610,22 @@ The second handles an addition-expression, which looks like two expressions followed by a plus-sign. The third handles subtraction, and so on. @example -exp: NUM - | exp exp '+' @{ $$ = $1 + $2; @} - | exp exp '-' @{ $$ = $1 - $2; @} - @dots{} - ; +exp: + NUM +| exp exp '+' @{ $$ = $1 + $2; @} +| exp exp '-' @{ $$ = $1 - $2; @} +@dots{} +; @end example We have used @samp{|} to join all the rules for @code{exp}, but we could equally well have written them separately: @example -exp: NUM ; -exp: exp exp '+' @{ $$ = $1 + $2; @} ; -exp: exp exp '-' @{ $$ = $1 - $2; @} ; - @dots{} +exp: NUM ; +exp: exp exp '+' @{ $$ = $1 + $2; @}; +exp: exp exp '-' @{ $$ = $1 - $2; @}; +@dots{} @end example Most of the rules have actions that compute the value of the expression in @@ -1640,16 +1646,17 @@ not require it. You can add or change white space as much as you wish. For example, this: @example -exp : NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{} ; +exp: NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{} ; @end example @noindent means the same thing as this: @example -exp: NUM - | exp exp '+' @{ $$ = $1 + $2; @} - | @dots{} +exp: + NUM +| exp exp '+' @{ $$ = $1 + $2; @} +| @dots{} ; @end example @@ -1712,7 +1719,7 @@ yylex (void) /* Skip white space. */ while ((c = getchar ()) == ' ' || c == '\t') - ; + continue; @end group @group /* Process numbers. */ @@ -1765,7 +1772,9 @@ here is the definition we will use: @example @group #include +@end group +@group /* Called by yyparse on error. 
*/ void yyerror (char const *s) @@ -1871,6 +1880,7 @@ parentheses nested to arbitrary depth. Here is the Bison code for @example /* Infix notation calculator. */ +@group %@{ #define YYSTYPE double #include @@ -1878,32 +1888,44 @@ parentheses nested to arbitrary depth. Here is the Bison code for int yylex (void); void yyerror (char const *); %@} +@end group +@group /* Bison declarations. */ %token NUM %left '-' '+' %left '*' '/' %left NEG /* negation--unary minus */ %right '^' /* exponentiation */ +@end group %% /* The grammar follows. */ -input: /* empty */ - | input line +@group +input: + /* empty */ +| input line ; +@end group -line: '\n' - | exp '\n' @{ printf ("\t%.10g\n", $1); @} +@group +line: + '\n' +| exp '\n' @{ printf ("\t%.10g\n", $1); @} ; +@end group -exp: NUM @{ $$ = $1; @} - | exp '+' exp @{ $$ = $1 + $3; @} - | exp '-' exp @{ $$ = $1 - $3; @} - | exp '*' exp @{ $$ = $1 * $3; @} - | exp '/' exp @{ $$ = $1 / $3; @} - | '-' exp %prec NEG @{ $$ = -$2; @} - | exp '^' exp @{ $$ = pow ($1, $3); @} - | '(' exp ')' @{ $$ = $2; @} +@group +exp: + NUM @{ $$ = $1; @} +| exp '+' exp @{ $$ = $1 + $3; @} +| exp '-' exp @{ $$ = $1 - $3; @} +| exp '*' exp @{ $$ = $1 * $3; @} +| exp '/' exp @{ $$ = $1 / $3; @} +| '-' exp %prec NEG @{ $$ = -$2; @} +| exp '^' exp @{ $$ = pow ($1, $3); @} +| '(' exp ')' @{ $$ = $2; @} ; +@end group %% @end example @@ -1964,9 +1986,10 @@ been added to one of the alternatives for @code{line}: @example @group -line: '\n' - | exp '\n' @{ printf ("\t%.10g\n", $1); @} - | error '\n' @{ yyerrok; @} +line: + '\n' +| exp '\n' @{ printf ("\t%.10g\n", $1); @} +| error '\n' @{ yyerrok; @} ; @end group @end example @@ -2057,41 +2080,44 @@ wrong expressions or subexpressions. 
@example @group -input : /* empty */ - | input line +input: + /* empty */ +| input line ; @end group @group -line : '\n' - | exp '\n' @{ printf ("%d\n", $1); @} +line: + '\n' +| exp '\n' @{ printf ("%d\n", $1); @} ; @end group @group -exp : NUM @{ $$ = $1; @} - | exp '+' exp @{ $$ = $1 + $3; @} - | exp '-' exp @{ $$ = $1 - $3; @} - | exp '*' exp @{ $$ = $1 * $3; @} +exp: + NUM @{ $$ = $1; @} +| exp '+' exp @{ $$ = $1 + $3; @} +| exp '-' exp @{ $$ = $1 - $3; @} +| exp '*' exp @{ $$ = $1 * $3; @} @end group @group - | exp '/' exp - @{ - if ($3) - $$ = $1 / $3; - else - @{ - $$ = 1; - fprintf (stderr, "%d.%d-%d.%d: division by zero", - @@3.first_line, @@3.first_column, - @@3.last_line, @@3.last_column); - @} - @} +| exp '/' exp + @{ + if ($3) + $$ = $1 / $3; + else + @{ + $$ = 1; + fprintf (stderr, "%d.%d-%d.%d: division by zero", + @@3.first_line, @@3.first_column, + @@3.last_line, @@3.last_column); + @} + @} @end group @group - | '-' exp %prec NEG @{ $$ = -$2; @} - | exp '^' exp @{ $$ = pow ($1, $3); @} - | '(' exp ')' @{ $$ = $2; @} +| '-' exp %prec NEG @{ $$ = -$2; @} +| exp '^' exp @{ $$ = pow ($1, $3); @} +| '(' exp ')' @{ $$ = $2; @} @end group @end example @@ -2158,6 +2184,7 @@ yylex (void) if (c == EOF) return 0; +@group /* Return a single char, and update location. */ if (c == '\n') @{ @@ -2168,6 +2195,7 @@ yylex (void) ++yylloc.last_column; return c; @} +@end group @end example Basically, the lexical analyzer performs the same processing as before: @@ -2253,7 +2281,8 @@ Note that multiple assignment and nested function calls are permitted. Here are the C and Bison declarations for the multi-function calculator. -@smallexample +@comment file: mfcalc.y +@example @group %@{ #include <math.h> /* For math functions, cos(), sin(), etc. */ @@ -2280,7 +2309,7 @@ Here are the C and Bison declarations for the multi-function calculator. %right '^' /* exponentiation */ @end group %% /* The grammar follows.
*/ -@end smallexample +@end example The above grammar introduces only two new features of the Bison language. These features allow semantic values to have various data types @@ -2311,38 +2340,41 @@ Here are the grammar rules for the multi-function calculator. Most of them are copied directly from @code{calc}; three rules, those which mention @code{VAR} or @code{FNCT}, are new. -@smallexample +@comment file: mfcalc.y +@example @group -input: /* empty */ - | input line +input: + /* empty */ +| input line ; @end group @group line: - '\n' - | exp '\n' @{ printf ("\t%.10g\n", $1); @} - | error '\n' @{ yyerrok; @} + '\n' +| exp '\n' @{ printf ("%.10g\n", $1); @} +| error '\n' @{ yyerrok; @} ; @end group @group -exp: NUM @{ $$ = $1; @} - | VAR @{ $$ = $1->value.var; @} - | VAR '=' exp @{ $$ = $3; $1->value.var = $3; @} - | FNCT '(' exp ')' @{ $$ = (*($1->value.fnctptr))($3); @} - | exp '+' exp @{ $$ = $1 + $3; @} - | exp '-' exp @{ $$ = $1 - $3; @} - | exp '*' exp @{ $$ = $1 * $3; @} - | exp '/' exp @{ $$ = $1 / $3; @} - | '-' exp %prec NEG @{ $$ = -$2; @} - | exp '^' exp @{ $$ = pow ($1, $3); @} - | '(' exp ')' @{ $$ = $2; @} +exp: + NUM @{ $$ = $1; @} +| VAR @{ $$ = $1->value.var; @} +| VAR '=' exp @{ $$ = $3; $1->value.var = $3; @} +| FNCT '(' exp ')' @{ $$ = (*($1->value.fnctptr))($3); @} +| exp '+' exp @{ $$ = $1 + $3; @} +| exp '-' exp @{ $$ = $1 - $3; @} +| exp '*' exp @{ $$ = $1 * $3; @} +| exp '/' exp @{ $$ = $1 / $3; @} +| '-' exp %prec NEG @{ $$ = -$2; @} +| exp '^' exp @{ $$ = pow ($1, $3); @} +| '(' exp ')' @{ $$ = $2; @} ; @end group /* End of grammar. */ %% -@end smallexample +@end example @node Mfcalc Symbol Table @subsection The @code{mfcalc} Symbol Table @@ -2357,7 +2389,8 @@ The symbol table itself consists of a linked list of records. Its definition, which is kept in the header @file{calc.h}, is as follows. It provides for either functions or variables to be placed in the table. -@smallexample +@comment file: calc.h +@example @group /* Function type. 
*/ typedef double (*func_t) (double); @@ -2387,13 +2420,13 @@ extern symrec *sym_table; symrec *putsym (char const *, int); symrec *getsym (char const *); @end group -@end smallexample +@end example The new version of @code{main} includes a call to @code{init_table}, a function that initializes the symbol table. Here it is, and @code{init_table} as well: -@smallexample +@example #include @group @@ -2437,10 +2470,9 @@ void init_table (void) @{ int i; - symrec *ptr; for (i = 0; arith_fncts[i].fname != 0; i++) @{ - ptr = putsym (arith_fncts[i].fname, FNCT); + symrec *ptr = putsym (arith_fncts[i].fname, FNCT); ptr->value.fnctptr = arith_fncts[i].fnct; @} @} @@ -2454,7 +2486,7 @@ main (void) return yyparse (); @} @end group -@end smallexample +@end example By simply editing the initialization list and adding the necessary include files, you can add additional functions to the calculator. @@ -2466,12 +2498,16 @@ linked to the front of the list, and a pointer to the object is returned. The function @code{getsym} is passed the name of the symbol to look up. If found, a pointer to that symbol is returned; otherwise zero is returned. -@smallexample +@comment file: mfcalc.y +@example +#include <stdlib.h> /* malloc. */ +#include <string.h> /* strlen. */ + +@group symrec * putsym (char const *sym_name, int sym_type) @{ - symrec *ptr; - ptr = (symrec *) malloc (sizeof (symrec)); + symrec *ptr = (symrec *) malloc (sizeof (symrec)); ptr->name = (char *) malloc (strlen (sym_name) + 1); strcpy (ptr->name,sym_name); ptr->type = sym_type; @@ -2480,7 +2516,9 @@ putsym (char const *sym_name, int sym_type) sym_table = ptr; return ptr; @} +@end group +@group symrec * getsym (char const *sym_name) @{ @@ -2491,7 +2529,8 @@ getsym (char const *sym_name) return ptr; return 0; @} -@end smallexample +@end group +@end example The function @code{yylex} must now recognize variables, numeric values, and the single-character arithmetic operators. Strings of alphanumeric @@ -2508,7 +2547,8 @@ returned to @code{yyparse}.
No change is needed in the handling of numeric values and arithmetic operators in @code{yylex}. -@smallexample +@comment file: mfcalc.y +@example @group #include @end group @@ -2520,7 +2560,8 @@ yylex (void) int c; /* Ignore white space, get first nonwhite character. */ - while ((c = getchar ()) == ' ' || c == '\t'); + while ((c = getchar ()) == ' ' || c == '\t') + continue; if (c == EOF) return 0; @@ -2540,21 +2581,19 @@ yylex (void) /* Char starts an identifier => read the name. */ if (isalpha (c)) @{ - symrec *s; + /* Initially make the buffer long enough + for a 40-character symbol name. */ + static size_t length = 40; static char *symbuf = 0; - static int length = 0; + symrec *s; int i; @end group -@group - /* Initially make the buffer long enough - for a 40-character symbol name. */ - if (length == 0) - length = 40, symbuf = (char *)malloc (length + 1); + if (!symbuf) + symbuf = (char *) malloc (length + 1); i = 0; do -@end group @group @{ /* If buffer is full, make it bigger. */ @@ -2588,7 +2627,7 @@ yylex (void) return c; @} @end group -@end smallexample +@end example This program is both powerful and flexible. You may easily add new functions, and it is a simple job to modify this code to install @@ -2691,7 +2730,7 @@ prototype functions that take arguments of type @code{YYSTYPE}. This can be done with two @var{Prologue} blocks, one before and one after the @code{%union} declaration. -@smallexample +@example %@{ #define _GNU_SOURCE #include @@ -2709,7 +2748,7 @@ can be done with two @var{Prologue} blocks, one before and one after the %@} @dots{} -@end smallexample +@end example When in doubt, it is usually safer to put prologue code before all Bison declarations, rather than after. 
For example, any definitions @@ -2737,7 +2776,7 @@ location, or it can be one of @code{requires}, @code{provides}, Look again at the example of the previous section: -@smallexample +@example %@{ #define _GNU_SOURCE #include @@ -2755,7 +2794,7 @@ Look again at the example of the previous section: %@} @dots{} -@end smallexample +@end example @noindent Notice that there are two @var{Prologue} sections here, but there's a @@ -2784,7 +2823,7 @@ To avoid this subtle @code{%union} dependency, rewrite the example using a Let's go ahead and add the new @code{YYLTYPE} definition and the @code{trace_token} prototype at the same time: -@smallexample +@example %code top @{ #define _GNU_SOURCE #include @@ -2816,7 +2855,7 @@ Let's go ahead and add the new @code{YYLTYPE} definition and the @} @dots{} -@end smallexample +@end example @noindent In this way, @code{%code top} and the unqualified @code{%code} achieve the same @@ -2840,20 +2879,27 @@ lines are dependency code required by the @code{YYSTYPE} and @code{YYLTYPE} definitions. Thus, they belong in one or more @code{%code requires}: -@smallexample +@example +@group %code top @{ #define _GNU_SOURCE #include @} +@end group +@group %code requires @{ #include "ptypes.h" @} +@end group +@group %union @{ long int n; tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */ @} +@end group +@group %code requires @{ #define YYLTYPE YYLTYPE typedef struct YYLTYPE @@ -2865,15 +2911,18 @@ Thus, they belong in one or more @code{%code requires}: char *filename; @} YYLTYPE; @} +@end group +@group %code @{ static void print_token_value (FILE *, int, YYSTYPE); #define YYPRINT(F, N, L) print_token_value (F, N, L) static void trace_token (enum yytokentype token, YYLTYPE loc); @} +@end group @dots{} -@end smallexample +@end example @noindent Now Bison will insert @code{#include "ptypes.h"} and the new @@ -2907,20 +2956,27 @@ this function is not a dependency required by @code{YYSTYPE} or sufficient. 
Instead, move its prototype from the unqualified @code{%code} to a @code{%code provides}: -@smallexample +@example +@group %code top @{ #define _GNU_SOURCE #include @} +@end group +@group %code requires @{ #include "ptypes.h" @} +@end group +@group %union @{ long int n; tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */ @} +@end group +@group %code requires @{ #define YYLTYPE YYLTYPE typedef struct YYLTYPE @@ -2932,18 +2988,23 @@ sufficient. Instead, move its prototype from the unqualified char *filename; @} YYLTYPE; @} +@end group +@group %code provides @{ void trace_token (enum yytokentype token, YYLTYPE loc); @} +@end group +@group %code @{ static void print_token_value (FILE *, int, YYSTYPE); #define YYPRINT(F, N, L) print_token_value (F, N, L) @} +@end group @dots{} -@end smallexample +@end example @noindent Bison will insert the @code{trace_token} prototype into both the @@ -2969,17 +3030,21 @@ organize your grammar file. For example, you may organize semantic-type-related directives by semantic type: -@smallexample +@example +@group %code requires @{ #include "type1.h" @} %union @{ type1 field1; @} %destructor @{ type1_free ($$); @} %printer @{ type1_print ($$); @} +@end group +@group %code requires @{ #include "type2.h" @} %union @{ type2 field2; @} %destructor @{ type2_free ($$); @} %printer @{ type2_print ($$); @} -@end smallexample +@end group +@end example @noindent You could even place each of the above directive groups in the rules section of @@ -3207,8 +3272,7 @@ A Bison grammar rule has the following general form: @example @group -@var{result}: @var{components}@dots{} - ; +@var{result}: @var{components}@dots{}; @end group @end example @@ -3221,8 +3285,7 @@ For example, @example @group -exp: exp '+' exp - ; +exp: exp '+' exp; @end group @end example @@ -3267,10 +3330,11 @@ be joined with the vertical-bar character @samp{|} as follows: @example @group -@var{result}: @var{rule1-components}@dots{} - | @var{rule2-components}@dots{} - @dots{} - 
; +@var{result}: + @var{rule1-components}@dots{} +| @var{rule2-components}@dots{} +@dots{} +; @end group @end example @@ -3283,15 +3347,17 @@ comma-separated sequence of zero or more @code{exp} groupings: @example @group -expseq: /* empty */ - | expseq1 - ; +expseq: + /* empty */ +| expseq1 +; @end group @group -expseq1: exp - | expseq1 ',' exp - ; +expseq1: + exp +| expseq1 ',' exp +; @end group @end example @@ -3311,9 +3377,10 @@ comma-separated sequence of one or more expressions: @example @group -expseq1: exp - | expseq1 ',' exp - ; +expseq1: + exp +| expseq1 ',' exp +; @end group @end example @@ -3326,9 +3393,10 @@ the same construct is defined using @dfn{right recursion}: @example @group -expseq1: exp - | exp ',' expseq1 - ; +expseq1: + exp +| exp ',' expseq1 +; @end group @end example @@ -3352,15 +3420,17 @@ For example: @example @group -expr: primary - | primary '+' primary - ; +expr: + primary +| primary '+' primary +; @end group @group -primary: constant - | '(' expr ')' - ; +primary: + constant +| '(' expr ')' +; @end group @end example @@ -3482,9 +3552,9 @@ Here is a typical example: @example @group -exp: @dots{} - | exp '+' exp - @{ $$ = $1 + $3; @} +exp: +@dots{} +| exp '+' exp @{ $$ = $1 + $3; @} @end group @end example @@ -3492,9 +3562,9 @@ Or, in terms of named references: @example @group -exp[result]: @dots{} - | exp[left] '+' exp[right] - @{ $result = $left + $right; @} +exp[result]: +@dots{} +| exp[left] '+' exp[right] @{ $result = $left + $right; @} @end group @end example @@ -3541,15 +3611,16 @@ is a case in which you can use this reliably: @example @group -foo: expr bar '+' expr @{ @dots{} @} - | expr bar '-' expr @{ @dots{} @} - ; +foo: + expr bar '+' expr @{ @dots{} @} +| expr bar '-' expr @{ @dots{} @} +; @end group @group -bar: /* empty */ - @{ previous_expr = $0; @} - ; +bar: + /* empty */ @{ previous_expr = $0; @} +; @end group @end example @@ -3579,9 +3650,9 @@ in the rule. 
In this example, @example @group -exp: @dots{} - | exp '+' exp - @{ $$ = $1 + $3; @} +exp: + @dots{} +| exp '+' exp @{ $$ = $1 + $3; @} @end group @end example @@ -3648,11 +3719,11 @@ remove it afterward. Here is how it is done: @example @group -stmt: LET '(' var ')' - @{ $$ = push_context (); - declare_variable ($3); @} - stmt @{ $$ = $6; - pop_context ($5); @} +stmt: + LET '(' var ')' + @{ $$ = push_context (); declare_variable ($3); @} + stmt + @{ $$ = $6; pop_context ($5); @} @end group @end example @@ -3694,15 +3765,19 @@ declare a destructor for that symbol: %% -stmt: let stmt - @{ $$ = $2; - pop_context ($1); @} - ; +stmt: + let stmt + @{ + $$ = $2; + pop_context ($1); + @}; -let: LET '(' var ')' - @{ $$ = push_context (); - declare_variable ($3); @} - ; +let: + LET '(' var ')' + @{ + $$ = push_context (); + declare_variable ($3); + @}; @end group @end example @@ -3721,9 +3796,10 @@ declaration or not: @example @group -compound: '@{' declarations statements '@}' - | '@{' statements '@}' - ; +compound: + '@{' declarations statements '@}' +| '@{' statements '@}' +; @end group @end example @@ -3732,12 +3808,13 @@ But when we add a mid-rule action as follows, the rules become nonfunctional: @example @group -compound: @{ prepare_for_local_variables (); @} - '@{' declarations statements '@}' +compound: + @{ prepare_for_local_variables (); @} + '@{' declarations statements '@}' @end group @group - | '@{' statements '@}' - ; +| '@{' statements '@}' +; @end group @end example @@ -3754,11 +3831,12 @@ actions into the two rules, like this: @example @group -compound: @{ prepare_for_local_variables (); @} - '@{' declarations statements '@}' - | @{ prepare_for_local_variables (); @} - '@{' statements '@}' - ; +compound: + @{ prepare_for_local_variables (); @} + '@{' declarations statements '@}' +| @{ prepare_for_local_variables (); @} + '@{' statements '@}' +; @end group @end example @@ -3772,10 +3850,11 @@ does work is to put the action after the open-brace, like this: 
@example @group -compound: '@{' @{ prepare_for_local_variables (); @} - declarations statements '@}' - | '@{' statements '@}' - ; +compound: + '@{' @{ prepare_for_local_variables (); @} + declarations statements '@}' +| '@{' statements '@}' +; @end group @end example @@ -3788,18 +3867,16 @@ serves as a subroutine: @example @group -subroutine: /* empty */ - @{ prepare_for_local_variables (); @} - ; - +subroutine: + /* empty */ @{ prepare_for_local_variables (); @} +; @end group @group -compound: subroutine - '@{' declarations statements '@}' - | subroutine - '@{' statements '@}' - ; +compound: + subroutine '@{' declarations statements '@}' +| subroutine '@{' statements '@}' +; @end group @end example @@ -3884,24 +3961,25 @@ Here is a basic example using the default data type for locations: @example @group -exp: @dots{} - | exp '/' exp - @{ - @@$.first_column = @@1.first_column; - @@$.first_line = @@1.first_line; - @@$.last_column = @@3.last_column; - @@$.last_line = @@3.last_line; - if ($3) - $$ = $1 / $3; - else - @{ - $$ = 1; - fprintf (stderr, - "Division by zero, l%d,c%d-l%d,c%d", - @@3.first_line, @@3.first_column, - @@3.last_line, @@3.last_column); - @} - @} +exp: + @dots{} +| exp '/' exp + @{ + @@$.first_column = @@1.first_column; + @@$.first_line = @@1.first_line; + @@$.last_column = @@3.last_column; + @@$.last_line = @@3.last_line; + if ($3) + $$ = $1 / $3; + else + @{ + $$ = 1; + fprintf (stderr, + "Division by zero, l%d,c%d-l%d,c%d", + @@3.first_line, @@3.first_column, + @@3.last_line, @@3.last_column); + @} + @} @end group @end example @@ -3915,20 +3993,21 @@ example above simply rewrites this way: @example @group -exp: @dots{} - | exp '/' exp - @{ - if ($3) - $$ = $1 / $3; - else - @{ - $$ = 1; - fprintf (stderr, - "Division by zero, l%d,c%d-l%d,c%d", - @@3.first_line, @@3.first_column, - @@3.last_line, @@3.last_column); - @} - @} +exp: + @dots{} +| exp '/' exp + @{ + if ($3) + $$ = $1 / $3; + else + @{ + $$ = 1; + fprintf (stderr, + "Division by zero, 
l%d,c%d-l%d,c%d", + @@3.first_line, @@3.first_column, + @@3.last_line, @@3.last_column); + @} + @} @end group @end example @@ -3969,28 +4048,29 @@ parameter is the number of discarded symbols. By default, @code{YYLLOC_DEFAULT} is defined this way: -@smallexample +@example @group -# define YYLLOC_DEFAULT(Current, Rhs, N) \ - do \ - if (N) \ - @{ \ - (Current).first_line = YYRHSLOC(Rhs, 1).first_line; \ - (Current).first_column = YYRHSLOC(Rhs, 1).first_column; \ - (Current).last_line = YYRHSLOC(Rhs, N).last_line; \ - (Current).last_column = YYRHSLOC(Rhs, N).last_column; \ - @} \ - else \ - @{ \ - (Current).first_line = (Current).last_line = \ - YYRHSLOC(Rhs, 0).last_line; \ - (Current).first_column = (Current).last_column = \ - YYRHSLOC(Rhs, 0).last_column; \ - @} \ - while (0) +# define YYLLOC_DEFAULT(Cur, Rhs, N) \ +do \ + if (N) \ + @{ \ + (Cur).first_line = YYRHSLOC(Rhs, 1).first_line; \ + (Cur).first_column = YYRHSLOC(Rhs, 1).first_column; \ + (Cur).last_line = YYRHSLOC(Rhs, N).last_line; \ + (Cur).last_column = YYRHSLOC(Rhs, N).last_column; \ + @} \ + else \ + @{ \ + (Cur).first_line = (Cur).last_line = \ + YYRHSLOC(Rhs, 0).last_line; \ + (Cur).first_column = (Cur).last_column = \ + YYRHSLOC(Rhs, 0).last_column; \ + @} \ +while (0) @end group -@end smallexample +@end example +@noindent where @code{YYRHSLOC (rhs, k)} is the location of the @var{k}th symbol in @var{rhs} when @var{k} is positive, and the location of the symbol just before the reduction when @var{k} and @var{n} are both zero. @@ -4092,7 +4172,7 @@ In references, in order to specify names containing dots and dashes, an explicit bracketed syntax @code{$[name]} and @code{@@[name]} must be used: @example @group -if-stmt: IF '(' expr ')' THEN then.stmt ';' +if-stmt: "if" '(' expr ')' "then" then.stmt ';' @{ $[if-stmt] = new_if_stmt ($expr, $[then.stmt]); @} @end group @end example @@ -4500,7 +4580,7 @@ symbol that has no declared semantic type tag. 
@noindent For example: -@smallexample +@example %union @{ char *string; @} %token STRING1 %token STRING2 @@ -4515,7 +4595,7 @@ For example: %destructor @{ free ($$); @} <*> %destructor @{ free ($$); printf ("%d", @@$.first_line); @} STRING1 string1 %destructor @{ printf ("Discarding tagless symbol.\n"); @} <> -@end smallexample +@end example @noindent guarantees that, when the parser discards any user-defined symbol that has a @@ -4540,9 +4620,9 @@ reference it in your grammar. However, it may invoke one of them for the end token (token 0) if you redefine it from @code{$end} to, for example, @code{END}: -@smallexample +@example %token END 0 -@end smallexample +@end example @cindex actions in mid-rule @cindex mid-rule actions @@ -5129,6 +5209,7 @@ Unaccepted @var{variable}s produce an error. Some of the accepted @var{variable}s are: @itemize @bullet +@c ================================================== api.pure @item api.pure @findex %define api.pure @@ -5363,12 +5444,12 @@ should usually be more appropriate than @code{%code top}. However, occasionally it is necessary to insert code much nearer the top of the parser implementation file. For example: -@smallexample +@example %code top @{ #define _GNU_SOURCE #include @} -@end smallexample +@end example @item Location(s): Near the top of the parser implementation file. @end itemize @@ -5691,7 +5772,7 @@ assuming that the characters of the token are stored in @code{token_buffer}, and assuming that the token does not contain any characters like @samp{"} that require escaping. -@smallexample +@example for (i = 0; i < YYNTOKENS; i++) @{ if (yytname[i] != 0 @@ -5702,7 +5783,7 @@ for (i = 0; i < YYNTOKENS; i++) && yytname[i][strlen (token_buffer) + 2] == 0) break; @} -@end smallexample +@end example The @code{yytname} table is generated only if you use the @code{%token-table} declaration. @xref{Decl Summary}. @@ -6299,16 +6380,18 @@ factorial operators (@samp{!}), and allow parentheses for grouping. 
@example @group -expr: term '+' expr - | term - ; +expr: + term '+' expr +| term +; @end group @group -term: '(' expr ')' - | term '!' - | NUMBER - ; +term: + '(' expr ')' +| term '!' +| NUMBER +; @end group @end example @@ -6346,9 +6429,9 @@ statements, with a pair of rules like this: @example @group if_stmt: - IF expr THEN stmt - | IF expr THEN stmt ELSE stmt - ; + IF expr THEN stmt +| IF expr THEN stmt ELSE stmt +; @end group @end example @@ -6415,20 +6498,22 @@ the conflict: %% @end group @group -stmt: expr - | if_stmt - ; +stmt: + expr +| if_stmt +; @end group @group if_stmt: - IF expr THEN stmt - | IF expr THEN stmt ELSE stmt - ; + IF expr THEN stmt +| IF expr THEN stmt ELSE stmt +; @end group -expr: variable - ; +expr: + variable +; @end example @node Precedence @@ -6456,12 +6541,13 @@ input @w{@samp{1 - 2 * 3}} can be parsed in two different ways): @example @group -expr: expr '-' expr - | expr '*' expr - | expr '<' expr - | '(' expr ')' - @dots{} - ; +expr: + expr '-' expr +| expr '*' expr +| expr '<' expr +| '(' expr ')' +@dots{} +; @end group @end example @@ -6615,10 +6701,11 @@ Now the precedence of @code{UMINUS} can be used in specific rules: @example @group -exp: @dots{} - | exp '-' exp - @dots{} - | '-' exp %prec UMINUS +exp: + @dots{} +| exp '-' exp + @dots{} +| '-' exp %prec UMINUS @end group @end example @@ -6683,18 +6770,20 @@ For example, here is an erroneous attempt to define a sequence of zero or more @code{word} groupings. 
@example -sequence: /* empty */ - @{ printf ("empty sequence\n"); @} - | maybeword - | sequence word - @{ printf ("added word %s\n", $2); @} - ; +@group +sequence: + /* empty */ @{ printf ("empty sequence\n"); @} +| maybeword +| sequence word @{ printf ("added word %s\n", $2); @} +; +@end group -maybeword: /* empty */ - @{ printf ("empty maybeword\n"); @} - | word - @{ printf ("single word %s\n", $1); @} - ; +@group +maybeword: + /* empty */ @{ printf ("empty maybeword\n"); @} +| word @{ printf ("single word %s\n", $1); @} +; +@end group @end example @noindent @@ -6721,28 +6810,30 @@ reduce/reduce conflict must be studied and usually eliminated. Here is the proper way to define @code{sequence}: @example -sequence: /* empty */ - @{ printf ("empty sequence\n"); @} - | sequence word - @{ printf ("added word %s\n", $2); @} - ; +sequence: + /* empty */ @{ printf ("empty sequence\n"); @} +| sequence word @{ printf ("added word %s\n", $2); @} +; @end example Here is another common error that yields a reduce/reduce conflict: @example -sequence: /* empty */ - | sequence words - | sequence redirects - ; +sequence: + /* empty */ +| sequence words +| sequence redirects +; -words: /* empty */ - | words word - ; +words: + /* empty */ +| words word +; -redirects:/* empty */ - | redirects redirect - ; +redirects: + /* empty */ +| redirects redirect +; @end example @noindent @@ -6761,28 +6852,38 @@ Here are two ways to correct these rules. 
First, to make it a single level of sequence: @example -sequence: /* empty */ - | sequence word - | sequence redirect - ; +sequence: + /* empty */ +| sequence word +| sequence redirect +; @end example Second, to prevent either a @code{words} or a @code{redirects} from being empty: @example -sequence: /* empty */ - | sequence words - | sequence redirects - ; +@group +sequence: + /* empty */ +| sequence words +| sequence redirects +; +@end group -words: word - | words word - ; +@group +words: + word +| words word +; +@end group -redirects:redirect - | redirects redirect - ; +@group +redirects: + redirect +| redirects redirect +; +@end group @end example @node Mysterious Conflicts @@ -6797,30 +6898,27 @@ Here is an example: %token ID %% -def: param_spec return_spec ',' - ; +def: param_spec return_spec ','; param_spec: - type - | name_list ':' type - ; + type +| name_list ':' type +; @end group @group return_spec: - type - | name ':' type - ; + type +| name ':' type +; @end group @group -type: ID - ; +type: ID; @end group @group -name: ID - ; +name: ID; name_list: - name - | name ',' name_list - ; + name +| name ',' name_list +; @end group @end example @@ -6869,11 +6967,10 @@ distinct. In the above example, adding one rule to %% @dots{} return_spec: - type - | name ':' type - /* This rule is never used. */ - | ID BOGUS - ; + type +| name ':' type +| ID BOGUS /* This rule is never used. */ +; @end group @end example @@ -6893,13 +6990,13 @@ rather than the one for @code{name}. @example param_spec: - type - | name_list ':' type - ; + type +| name_list ':' type +; return_spec: - type - | ID ':' type - ; + type +| ID ':' type +; @end example For a more detailed exposition of LALR(1) parsers and parser @@ -7474,10 +7571,11 @@ in the current context, the parse can continue. 
For example: @example -stmnts: /* empty string */ - | stmnts '\n' - | stmnts exp '\n' - | stmnts error '\n' +stmnts: + /* empty string */ +| stmnts '\n' +| stmnts exp '\n' +| stmnts error '\n' @end example The fourth rule in this example says that an error followed by a newline @@ -7518,10 +7616,11 @@ close-delimiter will probably appear to be unmatched, and generate another, spurious error message: @example -primary: '(' expr ')' - | '(' error ')' - @dots{} - ; +primary: + '(' expr ')' +| '(' error ')' +@dots{} +; @end example Error recovery strategies are necessarily guesses. When they guess wrong, @@ -7622,11 +7721,13 @@ earlier: @example typedef int foo, bar; int baz (void) +@group @{ static bar (bar); /* @r{redeclare @code{bar} as static variable} */ extern foo foo (foo); /* @r{redeclare @code{foo} as function} */ return foo (bar); @} +@end group @end example Unfortunately, the name being declared is separated from the declaration @@ -7639,17 +7740,19 @@ declaration in which that can't be done. Here is a part of the duplication, with actions omitted for brevity: @example +@group initdcl: - declarator maybeasm '=' - init - | declarator maybeasm - ; + declarator maybeasm '=' init +| declarator maybeasm +; +@end group +@group notype_initdcl: - notype_declarator maybeasm '=' - init - | notype_declarator maybeasm - ; + notype_declarator maybeasm '=' init +| notype_declarator maybeasm +; +@end group @end example @noindent @@ -7689,24 +7792,21 @@ as an identifier if it appears in that context. 
Here is how you can do it: @dots{} @end group @group -expr: IDENTIFIER - | constant - | HEX '(' - @{ hexflag = 1; @} - expr ')' - @{ hexflag = 0; - $$ = $4; @} - | expr '+' expr - @{ $$ = make_sum ($1, $3); @} - @dots{} - ; +expr: + IDENTIFIER +| constant +| HEX '(' @{ hexflag = 1; @} + expr ')' @{ hexflag = 0; $$ = $4; @} +| expr '+' expr @{ $$ = make_sum ($1, $3); @} +@dots{} +; @end group @group constant: - INTEGER - | STRING - ; + INTEGER +| STRING +; @end group @end example @@ -7732,12 +7832,12 @@ For example, in C-like languages, a typical error recovery rule is to skip tokens until the next semicolon, and then start a new statement, like this: @example -stmt: expr ';' - | IF '(' expr ')' stmt @{ @dots{} @} - @dots{} - error ';' - @{ hexflag = 0; @} - ; +stmt: + expr ';' +| IF '(' expr ')' stmt @{ @dots{} @} +@dots{} +| error ';' @{ hexflag = 0; @} +; @end example If there is a syntax error in the middle of a @samp{hex (@var{expr})} @@ -7754,11 +7854,11 @@ and skips to the close-parenthesis: @example @group -expr: @dots{} - | '(' expr ')' - @{ $$ = $2; @} - | '(' error ')' - @dots{} +expr: + @dots{} +| '(' expr ')' @{ $$ = $2; @} +| '(' error ')' +@dots{} @end group @end example @@ -7816,12 +7916,13 @@ The following grammar file, @file{calc.y}, will be used in the sequel: %left '+' '-' %left '*' %% -exp: exp '+' exp - | exp '-' exp - | exp '*' exp - | exp '/' exp - | NUM - ; +exp: + exp '+' exp +| exp '-' exp +| exp '*' exp +| exp '/' exp +| NUM +; useless: STR; %% @end example @@ -7904,6 +8005,7 @@ Grammar and reports the uses of the symbols: @example +@group Terminals, with rules where they appear $end (0) 0 @@ -7913,13 +8015,16 @@ $end (0) 0 '/' (47) 4 error (256) NUM (258) 5 +@end group +@group Nonterminals, with rules where they appear $accept (8) on left: 0 exp (9) on left: 1 2 3 4 5, on right: 0 1 2 3 4 +@end group @end example @noindent @@ -7927,9 +8032,9 @@ exp (9) @cindex pointed rule @cindex rule, pointed Bison then proceeds onto the automaton 
itself, describing each state -with it set of @dfn{items}, also known as @dfn{pointed rules}. Each -item is a production rule together with a point (marked by @samp{.}) -that the input cursor. +with its set of @dfn{items}, also known as @dfn{pointed rules}. Each +item is a production rule together with a point (@samp{.}) marking +the location of the input cursor. @example state 0 @@ -7946,7 +8051,7 @@ beginning of the parsing, in the initial rule, right before the start symbol (here, @code{exp}). When the parser returns to this state right after having reduced a rule that produced an @code{exp}, the control flow jumps to state 2. If there is no such transition on a nonterminal -symbol, and the lookahead is a @code{NUM}, then this token is shifted on +symbol, and the lookahead is a @code{NUM}, then this token is shifted onto the parse stack, and the control flow jumps to state 1. Any other lookahead triggers a syntax error.'' @@ -7959,8 +8064,7 @@ report lists @code{NUM} as a lookahead token because @code{NUM} can be at the beginning of any rule deriving an @code{exp}. By default Bison reports the so-called @dfn{core} or @dfn{kernel} of the item set, but if you want to see more detail you can invoke @command{bison} with -@option{--report=itemset} to list all the items, include those that can -be derived: +@option{--report=itemset} to list the derived items as well: @example state 0 @@ -8012,11 +8116,11 @@ state 2 @noindent In state 2, the automaton can only shift a symbol. For instance, -because of the item @samp{exp -> exp . '+' exp}, if the lookahead if -@samp{+}, it will be shifted on the parse stack, and the automaton -control will jump to state 4, corresponding to the item @samp{exp -> exp -'+' . exp}. Since there is no default action, any other token than -those listed above will trigger a syntax error. +because of the item @samp{exp -> exp . 
'+' exp}, if the lookahead is +@samp{+} it is shifted onto the parse stack, and the automaton +jumps to state 4, corresponding to the item @samp{exp -> exp '+' . exp}. +Since there is no default action, any lookahead not listed triggers a syntax +error. @cindex accepting state The state 3 is named the @dfn{final state}, or the @dfn{accepting @@ -8136,6 +8240,7 @@ state 8 The remaining states are similar: @example +@group state 9 exp -> exp . '+' exp (rule 1) @@ -8149,7 +8254,9 @@ state 9 '/' [reduce using rule 2 (exp)] $default reduce using rule 2 (exp) +@end group +@group state 10 exp -> exp . '+' exp (rule 1) @@ -8162,7 +8269,9 @@ state 10 '/' [reduce using rule 3 (exp)] $default reduce using rule 3 (exp) +@end group +@group state 11 exp -> exp . '+' exp (rule 1) @@ -8181,6 +8290,7 @@ state 11 '*' [reduce using rule 4 (exp)] '/' [reduce using rule 4 (exp)] $default reduce using rule 4 (exp) +@end group @end example @noindent @@ -8283,10 +8393,11 @@ value (from @code{yylval}). Here is an example of @code{YYPRINT} suitable for the multi-function calculator (@pxref{Mfcalc Declarations, ,Declarations for @code{mfcalc}}): -@smallexample +@example %@{ static void print_token_value (FILE *, int, YYSTYPE); - #define YYPRINT(file, type, value) print_token_value (file, type, value) + #define YYPRINT(file, type, value) \ + print_token_value (file, type, value) %@} @dots{} %% @dots{} %% @dots{} @@ -8299,7 +8410,7 @@ print_token_value (FILE *file, int type, YYSTYPE value) else if (type == NUM) fprintf (file, "%d", value.val); @} -@end smallexample +@end example @c ================================================= Invoking Bison @@ -8425,7 +8536,7 @@ Also warn about mid-rule values that are used but not set. 
For example, warn about unset @code{$$} in the mid-rule action in: @example - exp: '1' @{ $1 = 1; @} '+' exp @{ $$ = $2 + $4; @}; +exp: '1' @{ $1 = 1; @} '+' exp @{ $$ = $2 + $4; @}; @end example These warnings are not enabled by default since they sometimes prove to @@ -8845,9 +8956,9 @@ The types for semantics value and locations. @end defcv @defcv {Type} {parser} {token} -A structure that contains (only) the definition of the tokens as the -@code{yytokentype} enumeration. To refer to the token @code{FOO}, the -scanner should use @code{yy::parser::token::FOO}. The scanner can use +A structure that contains (only) the @code{yytokentype} enumeration, which +defines the tokens. To refer to the token @code{FOO}, +use @code{yy::parser::token::FOO}. The scanner can use @samp{typedef yy::parser::token token;} to ``import'' the token enumeration (@pxref{Calc++ Scanner}). @end defcv @@ -9092,7 +9203,7 @@ the grammar for. @comment file: calc++-parser.yy @example -%skeleton "lalr1.cc" /* -*- C++ -*- */ +%skeleton "lalr1.cc" /* -*- C++ -*- */ %require "@value{VERSION}" %defines %define parser_class_name "calcxx_parser" @@ -9220,8 +9331,9 @@ The grammar itself is straightforward. %start unit; unit: assignments exp @{ driver.result = $2; @}; -assignments: assignments assignment @{@} - | /* Nothing. */ @{@}; +assignments: + /* Nothing. */ @{@} +| assignments assignment @{@}; assignment: "identifier" ":=" exp @@ -9260,7 +9372,7 @@ parser's to get the set of defined tokens. @comment file: calc++-scanner.ll @example -%@{ /* -*- C++ -*- */ +%@{ /* -*- C++ -*- */ # include # include # include @@ -9314,9 +9426,11 @@ preceding tokens. Comments would be treated equally. @comment file: calc++-scanner.ll @example +@group %@{ # define YY_USER_ACTION yylloc->columns (yyleng); %@} +@end group %% %@{ yylloc->step (); @@ -9358,6 +9472,7 @@ on the scanner's data, it is simpler to implement them in this file. 
@comment file: calc++-scanner.ll @example +@group void calcxx_driver::scan_begin () @{ @@ -9366,16 +9481,19 @@ calcxx_driver::scan_begin () yyin = stdin; else if (!(yyin = fopen (file.c_str (), "r"))) @{ - error (std::string ("cannot open ") + file); - exit (1); + error ("cannot open " + file + ": " + strerror(errno)); + exit (EXIT_FAILURE); @} @} +@end group +@group void calcxx_driver::scan_end () @{ fclose (yyin); @} +@end group @end example @node Calc++ Top Level @@ -9388,6 +9506,7 @@ The top level file, @file{calc++.cc}, poses no problem. #include <iostream> #include "calc++-driver.hh" +@group int main (int argc, char *argv[]) @{ @@ -9400,6 +9519,7 @@ main (int argc, char *argv[]) else if (!driver.parse (*argv)) std::cout << driver.result << std::endl; @} +@end group @end example @node Java Parsers @@ -9997,10 +10117,10 @@ are addressed. @node Memory Exhausted @section Memory Exhausted -@display +@quotation My parser returns with error with a @samp{memory exhausted} message. What can I do? -@end display +@end quotation This question is already addressed elsewhere, @xref{Recursion, ,Recursive Rules}. @@ -10011,20 +10131,20 @@ This question is already addressed elsewhere, @xref{Recursion, The following phenomenon has several symptoms, resulting in the following typical questions: -@display +@quotation I invoke @code{yyparse} several times, and on correct input it works properly; but when a parse error is found, all the other calls fail too. How can I reset the error flag of @code{yyparse}? -@end display +@end quotation @noindent or -@display +@quotation My parser includes support for an @samp{#include}-like feature, in which case I run @code{yyparse} from @code{yyparse}. This fails -although I did specify @code{%define api.pure}. +although I did specify @samp{%define api.pure}. +@end quotation These problems typically come not from Bison itself, but from Lex-generated scanners. 
Because these scanners use large buffers for @@ -10032,43 +10152,57 @@ speed, they might not notice a change of input file. As a demonstration, consider the following source file, @file{first-line.l}: -@verbatim -%{ +@example +@group +%@{ #include <stdio.h> #include <stdlib.h> -%} +%@} +@end group %% .*\n ECHO; return 1; %% +@group int yyparse (char const *file) -{ +@{ yyin = fopen (file, "r"); if (!yyin) - exit (2); + @{ + perror ("fopen"); + exit (EXIT_FAILURE); + @} +@end group +@group /* One token only. */ yylex (); if (fclose (yyin) != 0) - exit (3); + @{ + perror ("fclose"); + exit (EXIT_FAILURE); + @} return 0; -} +@} +@end group +@group int main (void) -{ +@{ yyparse ("input"); yyparse ("input"); return 0; -} -@end verbatim +@} +@end group +@end example @noindent If the file @file{input} contains -@verbatim +@example input:1: Hello, input:2: World! -@end verbatim +@end example @noindent then instead of getting the first line twice, you get: @@ -10099,35 +10233,41 @@ start condition, through a call to @samp{BEGIN (0)}. @node Strings are Destroyed @section Strings are Destroyed -@display +@quotation My parser seems to destroy old strings, or maybe it loses track of them. Instead of reporting @samp{"foo", "bar"}, it reports @samp{"bar", "bar"}, or even @samp{"foo\nbar", "bar"}. -@end display +@end quotation This error is probably the single most frequent ``bug report'' sent to Bison lists, but is only concerned with a misunderstanding of the role of the scanner. Consider the following Lex code: -@verbatim -%{ +@example +@group +%@{ #include <stdio.h> char *yylval = NULL; -%} +%@} +@end group +@group %% .* yylval = yytext; return 1; \n /* IGNORE */ %% +@end group +@group int main () -{ +@{ /* Similar to using $1, $2 in a Bison action. 
*/ char *fst = (yylex (), yylval); char *snd = (yylex (), yylval); printf ("\"%s\", \"%s\"\n", fst, snd); return 0; -} -@end verbatim +@} +@end group +@end example If you compile and run this code, you get: @@ -10158,10 +10298,10 @@ $ @kbd{printf 'one\ntwo\n' | ./split-lines} @node Implementing Gotos/Loops @section Implementing Gotos/Loops -@display +@quotation My simple calculator supports variables, assignments, and functions, but how can I implement gotos, or loops? -@end display +@end quotation Although very pedagogical, the examples included in the document blur the distinction to make between the parser---whose job is to recover @@ -10188,11 +10328,11 @@ invited to consult the dedicated literature. @node Multiple start-symbols @section Multiple start-symbols -@display +@quotation I have several closely related grammars, and I would like to share their implementations. In fact, I could use a single grammar but with multiple entry points. -@end display +@end quotation Bison does not support multiple start-symbols, but there is a very simple means to simulate them. If @code{foo} and @code{bar} are the two @@ -10203,8 +10343,9 @@ real start-symbol: @example %token START_FOO START_BAR; %start start; -start: START_FOO foo - | START_BAR bar; +start: + START_FOO foo +| START_BAR bar; @end example These tokens prevents the introduction of new conflicts. As far as the @@ -10237,9 +10378,9 @@ available in the scanner (e.g., a global variable or using @node Secure? Conform? @section Secure? Conform? -@display +@quotation Is Bison secure? Does it conform to POSIX? -@end display +@end quotation If you're looking for a guarantee or certification, we don't provide it. However, Bison is intended to be a reliable program that conforms to the @@ -10249,11 +10390,11 @@ please send us a bug report. @node I can't build Bison @section I can't build Bison -@display +@quotation I can't build Bison because @command{make} complains that @code{msgfmt} is not found. What should I do? 
-@end display +@end quotation Like most GNU packages with internationalization support, that feature is turned on by default. If you have problems building in the @file{po} @@ -10267,9 +10408,9 @@ Bison. See the file @file{ABOUT-NLS} for more information. @node Where can I find help? @section Where can I find help? -@display +@quotation I'm having trouble using Bison. Where can I find help? -@end display +@end quotation First, read this fine manual. Beyond that, you can send mail to @email{help-bison@@gnu.org}. This mailing list is intended to be @@ -10284,9 +10425,9 @@ hearts. @node Bug Reports @section Bug Reports -@display +@quotation I found a bug. What should I include in the bug report? -@end display +@end quotation Before you send a bug report, make sure you are using the latest version. Check @url{ftp://ftp.gnu.org/pub/gnu/bison/} or one of its @@ -10315,10 +10456,10 @@ Send bug reports to @email{bug-bison@@gnu.org}. @node More Languages @section More Languages -@display +@quotation Will Bison ever have C++ and Java support? How about @var{insert your favorite language here}? -@end display +@end quotation C++ and Java support is there now, and is documented. We'd love to add other languages; contributions are welcome. @@ -10326,9 +10467,9 @@ languages; contributions are welcome. @node Beta Testing @section Beta Testing -@display +@quotation What is involved in being a beta tester? -@end display +@end quotation It's not terribly involved. Basically, you would download a test release, compile it, and use it to build and run a parser or two. After @@ -10346,9 +10487,9 @@ systems are especially welcome. @node Mailing Lists @section Mailing Lists -@display +@quotation How do I join the help-bison and bug-bison mailing lists? -@end display +@end quotation See @url{http://lists.gnu.org/}. @@ -11138,7 +11279,7 @@ London, Department of Computer Science, TR-00-12 (December 2000). 
@c LocalWords: NUM exp subsubsection kbd Ctrl ctype EOF getchar isdigit nonfree @c LocalWords: ungetc stdin scanf sc calc ulator ls lm cc NEG prec yyerrok rr @c LocalWords: longjmp fprintf stderr yylloc YYLTYPE cos ln Stallman Destructor -@c LocalWords: smallexample symrec val tptr FNCT fnctptr func struct sym enum +@c LocalWords: symrec val tptr FNCT fnctptr func struct sym enum @c LocalWords: fnct putsym getsym fname arith fncts atan ptr malloc sizeof Lex @c LocalWords: strlen strcpy fctn strcmp isalpha symbuf realloc isalnum DOTDOT @c LocalWords: ptypes itype YYPRINT trigraphs yytname expseq vindex dtype Unary