Here is a simple C function subdivided into tokens:
-@ifinfo
@example
int /* @r{keyword `int'} */
square (int x) /* @r{identifier, open-paren, keyword `int',}
@r{identifier, semicolon} */
@} /* @r{close-brace} */
@end example
-@end ifinfo
-@ifnotinfo
-@example
-int /* @r{keyword `int'} */
-square (int x) /* @r{identifier, open-paren, keyword `int', identifier, close-paren} */
-@{ /* @r{open-brace} */
- return x * x; /* @r{keyword `return', identifier, asterisk, identifier, semicolon} */
-@} /* @r{close-brace} */
-@end example
-@end ifnotinfo
The syntactic groupings of C include the expression, the statement, the
declaration, and the function definition. These are represented in the
used in every rule.
@example
-stmt: RETURN expr ';'
- ;
+stmt: RETURN expr ';' ;
@end example
@noindent
two subexpressions:
@example
-expr: expr '+' expr @{ $$ = $1 + $3; @}
- ;
+expr: expr '+' expr @{ $$ = $1 + $3; @} ;
@end example
@noindent
%%
@group
-type_decl : TYPE ID '=' type ';'
- ;
+type_decl: TYPE ID '=' type ';' ;
@end group
@group
-type : '(' id_list ')'
- | expr DOTDOT expr
- ;
+type:
+ '(' id_list ')'
+| expr DOTDOT expr
+;
@end group
@group
-id_list : ID
- | id_list ',' ID
- ;
+id_list:
+ ID
+| id_list ',' ID
+;
@end group
@group
-expr : '(' expr ')'
- | expr '+' expr
- | expr '-' expr
- | expr '*' expr
- | expr '/' expr
- | ID
- ;
+expr:
+ '(' expr ')'
+| expr '+' expr
+| expr '-' expr
+| expr '*' expr
+| expr '/' expr
+| ID
+;
@end group
@end example
%%
-prog :
- | prog stmt @{ printf ("\n"); @}
- ;
+prog:
+ /* Nothing. */
+| prog stmt @{ printf ("\n"); @}
+;
-stmt : expr ';' %dprec 1
- | decl %dprec 2
- ;
+stmt:
+ expr ';' %dprec 1
+| decl %dprec 2
+;
-expr : ID @{ printf ("%s ", $$); @}
- | TYPENAME '(' expr ')'
- @{ printf ("%s <cast> ", $1); @}
- | expr '+' expr @{ printf ("+ "); @}
- | expr '=' expr @{ printf ("= "); @}
- ;
+expr:
+ ID @{ printf ("%s ", $$); @}
+| TYPENAME '(' expr ')'
+ @{ printf ("%s <cast> ", $1); @}
+| expr '+' expr @{ printf ("+ "); @}
+| expr '=' expr @{ printf ("= "); @}
+;
-decl : TYPENAME declarator ';'
- @{ printf ("%s <declare> ", $1); @}
- | TYPENAME declarator '=' expr ';'
- @{ printf ("%s <init-declare> ", $1); @}
- ;
+decl:
+ TYPENAME declarator ';'
+ @{ printf ("%s <declare> ", $1); @}
+| TYPENAME declarator '=' expr ';'
+ @{ printf ("%s <init-declare> ", $1); @}
+;
-declarator : ID @{ printf ("\"%s\" ", $1); @}
- | '(' declarator ')'
- ;
+declarator:
+ ID @{ printf ("\"%s\" ", $1); @}
+| '(' declarator ')'
+;
@end example
@noindent
follows:
@example
-stmt : expr ';' %merge <stmtMerge>
- | decl %merge <stmtMerge>
- ;
+stmt:
+ expr ';' %merge <stmtMerge>
+| decl %merge <stmtMerge>
+;
@end example
@noindent
@example
%@{
- #if __STDC_VERSION__ < 199901 && ! defined __GNUC__ && ! defined inline
- #define inline
+ #if (__STDC_VERSION__ < 199901 && ! defined __GNUC__ \
+ && ! defined inline)
+ # define inline
#endif
%@}
@end example
@cindex simple examples
@cindex examples, simple
-Now we show and explain three sample programs written using Bison: a
+Now we show and explain several sample programs written using Bison: a
reverse polish notation calculator, an algebraic (infix) notation
-calculator, and a multi-function calculator. All three have been tested
-under BSD Unix 4.3; each produces a usable, though limited, interactive
-desk-top calculator.
+calculator --- later extended to track ``locations'' ---
+and a multi-function calculator. All
+produce usable, though limited, interactive desk-top calculators.
These examples are simple, but Bison grammars for real programming
languages are written the same way. You can copy these examples into a
Here are the grammar rules for the reverse polish notation calculator.
@example
-input: /* empty */
- | input line
+@group
+input:
+ /* empty */
+| input line
;
+@end group
-line: '\n'
- | exp '\n' @{ printf ("\t%.10g\n", $1); @}
+@group
+line:
+ '\n'
+| exp '\n' @{ printf ("%.10g\n", $1); @}
;
+@end group
-exp: NUM @{ $$ = $1; @}
- | exp exp '+' @{ $$ = $1 + $2; @}
- | exp exp '-' @{ $$ = $1 - $2; @}
- | exp exp '*' @{ $$ = $1 * $2; @}
- | exp exp '/' @{ $$ = $1 / $2; @}
- /* Exponentiation */
- | exp exp '^' @{ $$ = pow ($1, $2); @}
- /* Unary minus */
- | exp 'n' @{ $$ = -$1; @}
+@group
+exp:
+ NUM @{ $$ = $1; @}
+| exp exp '+' @{ $$ = $1 + $2; @}
+| exp exp '-' @{ $$ = $1 - $2; @}
+| exp exp '*' @{ $$ = $1 * $2; @}
+| exp exp '/' @{ $$ = $1 / $2; @}
+| exp exp '^' @{ $$ = pow ($1, $2); @} /* Exponentiation */
+| exp 'n' @{ $$ = -$1; @} /* Unary minus */
;
+@end group
%%
@end example
Consider the definition of @code{input}:
@example
-input: /* empty */
- | input line
+input:
+ /* empty */
+| input line
;
@end example
Now consider the definition of @code{line}:
@example
-line: '\n'
- | exp '\n' @{ printf ("\t%.10g\n", $1); @}
+line:
+ '\n'
+| exp '\n' @{ printf ("%.10g\n", $1); @}
;
@end example
followed by a plus-sign. The third handles subtraction, and so on.
@example
-exp: NUM
- | exp exp '+' @{ $$ = $1 + $2; @}
- | exp exp '-' @{ $$ = $1 - $2; @}
- @dots{}
- ;
+exp:
+ NUM
+| exp exp '+' @{ $$ = $1 + $2; @}
+| exp exp '-' @{ $$ = $1 - $2; @}
+@dots{}
+;
@end example
We have used @samp{|} to join all the rules for @code{exp}, but we could
equally well have written them separately:
@example
-exp: NUM ;
-exp: exp exp '+' @{ $$ = $1 + $2; @} ;
-exp: exp exp '-' @{ $$ = $1 - $2; @} ;
- @dots{}
+exp: NUM ;
+exp: exp exp '+' @{ $$ = $1 + $2; @};
+exp: exp exp '-' @{ $$ = $1 - $2; @};
+@dots{}
@end example
Most of the rules have actions that compute the value of the expression in
For example, this:
@example
-exp : NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{} ;
+exp: NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{} ;
@end example
@noindent
means the same thing as this:
@example
-exp: NUM
- | exp exp '+' @{ $$ = $1 + $2; @}
- | @dots{}
+exp:
+ NUM
+| exp exp '+' @{ $$ = $1 + $2; @}
+| @dots{}
;
@end example
/* Skip white space. */
while ((c = getchar ()) == ' ' || c == '\t')
- ;
+ continue;
@end group
@group
/* Process numbers. */
@example
@group
#include <stdio.h>
+@end group
+@group
/* Called by yyparse on error. */
void
yyerror (char const *s)
@example
/* Infix notation calculator. */
+@group
%@{
#define YYSTYPE double
#include <math.h>
int yylex (void);
void yyerror (char const *);
%@}
+@end group
+@group
/* Bison declarations. */
%token NUM
%left '-' '+'
%left '*' '/'
%left NEG /* negation--unary minus */
%right '^' /* exponentiation */
+@end group
%% /* The grammar follows. */
-input: /* empty */
- | input line
+@group
+input:
+ /* empty */
+| input line
;
+@end group
-line: '\n'
- | exp '\n' @{ printf ("\t%.10g\n", $1); @}
+@group
+line:
+ '\n'
+| exp '\n' @{ printf ("\t%.10g\n", $1); @}
;
+@end group
-exp: NUM @{ $$ = $1; @}
- | exp '+' exp @{ $$ = $1 + $3; @}
- | exp '-' exp @{ $$ = $1 - $3; @}
- | exp '*' exp @{ $$ = $1 * $3; @}
- | exp '/' exp @{ $$ = $1 / $3; @}
- | '-' exp %prec NEG @{ $$ = -$2; @}
- | exp '^' exp @{ $$ = pow ($1, $3); @}
- | '(' exp ')' @{ $$ = $2; @}
+@group
+exp:
+ NUM @{ $$ = $1; @}
+| exp '+' exp @{ $$ = $1 + $3; @}
+| exp '-' exp @{ $$ = $1 - $3; @}
+| exp '*' exp @{ $$ = $1 * $3; @}
+| exp '/' exp @{ $$ = $1 / $3; @}
+| '-' exp %prec NEG @{ $$ = -$2; @}
+| exp '^' exp @{ $$ = pow ($1, $3); @}
+| '(' exp ')' @{ $$ = $2; @}
;
+@end group
%%
@end example
@example
@group
-line: '\n'
- | exp '\n' @{ printf ("\t%.10g\n", $1); @}
- | error '\n' @{ yyerrok; @}
+line:
+ '\n'
+| exp '\n' @{ printf ("\t%.10g\n", $1); @}
+| error '\n' @{ yyerrok; @}
;
@end group
@end example
@example
@group
-input : /* empty */
- | input line
+input:
+ /* empty */
+| input line
;
@end group
@group
-line : '\n'
- | exp '\n' @{ printf ("%d\n", $1); @}
+line:
+ '\n'
+| exp '\n' @{ printf ("%d\n", $1); @}
;
@end group
@group
-exp : NUM @{ $$ = $1; @}
- | exp '+' exp @{ $$ = $1 + $3; @}
- | exp '-' exp @{ $$ = $1 - $3; @}
- | exp '*' exp @{ $$ = $1 * $3; @}
+exp:
+ NUM @{ $$ = $1; @}
+| exp '+' exp @{ $$ = $1 + $3; @}
+| exp '-' exp @{ $$ = $1 - $3; @}
+| exp '*' exp @{ $$ = $1 * $3; @}
@end group
@group
- | exp '/' exp
- @{
- if ($3)
- $$ = $1 / $3;
- else
- @{
- $$ = 1;
- fprintf (stderr, "%d.%d-%d.%d: division by zero",
- @@3.first_line, @@3.first_column,
- @@3.last_line, @@3.last_column);
- @}
- @}
+| exp '/' exp
+ @{
+ if ($3)
+ $$ = $1 / $3;
+ else
+ @{
+ $$ = 1;
+ fprintf (stderr, "%d.%d-%d.%d: division by zero",
+ @@3.first_line, @@3.first_column,
+ @@3.last_line, @@3.last_column);
+ @}
+ @}
@end group
@group
- | '-' exp %prec NEG @{ $$ = -$2; @}
- | exp '^' exp @{ $$ = pow ($1, $3); @}
- | '(' exp ')' @{ $$ = $2; @}
+| '-' exp %prec NEG @{ $$ = -$2; @}
+| exp '^' exp @{ $$ = pow ($1, $3); @}
+| '(' exp ')' @{ $$ = $2; @}
@end group
@end example
if (c == EOF)
return 0;
+@group
/* Return a single char, and update location. */
if (c == '\n')
@{
++yylloc.last_column;
return c;
@}
+@end group
@end example
Basically, the lexical analyzer performs the same processing as before:
Here are the C and Bison declarations for the multi-function calculator.
-@smallexample
+@comment file: mfcalc.y
+@example
@group
%@{
#include <math.h> /* For math functions, cos(), sin(), etc. */
%right '^' /* exponentiation */
@end group
%% /* The grammar follows. */
-@end smallexample
+@end example
The above grammar introduces only two new features of the Bison language.
These features allow semantic values to have various data types
Most of them are copied directly from @code{calc}; three rules,
those which mention @code{VAR} or @code{FNCT}, are new.
-@smallexample
+@comment file: mfcalc.y
+@example
@group
-input: /* empty */
- | input line
+input:
+ /* empty */
+| input line
;
@end group
@group
line:
- '\n'
- | exp '\n' @{ printf ("\t%.10g\n", $1); @}
- | error '\n' @{ yyerrok; @}
+ '\n'
+| exp '\n' @{ printf ("%.10g\n", $1); @}
+| error '\n' @{ yyerrok; @}
;
@end group
@group
-exp: NUM @{ $$ = $1; @}
- | VAR @{ $$ = $1->value.var; @}
- | VAR '=' exp @{ $$ = $3; $1->value.var = $3; @}
- | FNCT '(' exp ')' @{ $$ = (*($1->value.fnctptr))($3); @}
- | exp '+' exp @{ $$ = $1 + $3; @}
- | exp '-' exp @{ $$ = $1 - $3; @}
- | exp '*' exp @{ $$ = $1 * $3; @}
- | exp '/' exp @{ $$ = $1 / $3; @}
- | '-' exp %prec NEG @{ $$ = -$2; @}
- | exp '^' exp @{ $$ = pow ($1, $3); @}
- | '(' exp ')' @{ $$ = $2; @}
+exp:
+ NUM @{ $$ = $1; @}
+| VAR @{ $$ = $1->value.var; @}
+| VAR '=' exp @{ $$ = $3; $1->value.var = $3; @}
+| FNCT '(' exp ')' @{ $$ = (*($1->value.fnctptr))($3); @}
+| exp '+' exp @{ $$ = $1 + $3; @}
+| exp '-' exp @{ $$ = $1 - $3; @}
+| exp '*' exp @{ $$ = $1 * $3; @}
+| exp '/' exp @{ $$ = $1 / $3; @}
+| '-' exp %prec NEG @{ $$ = -$2; @}
+| exp '^' exp @{ $$ = pow ($1, $3); @}
+| '(' exp ')' @{ $$ = $2; @}
;
@end group
/* End of grammar. */
%%
-@end smallexample
+@end example
@node Mfcalc Symbol Table
@subsection The @code{mfcalc} Symbol Table
definition, which is kept in the header @file{calc.h}, is as follows. It
provides for either functions or variables to be placed in the table.
-@smallexample
+@comment file: calc.h
+@example
@group
/* Function type. */
typedef double (*func_t) (double);
symrec *putsym (char const *, int);
symrec *getsym (char const *);
@end group
-@end smallexample
+@end example
The new version of @code{main} includes a call to @code{init_table}, a
function that initializes the symbol table. Here it is, and
@code{init_table} as well:
-@smallexample
+@example
#include <stdio.h>
@group
init_table (void)
@{
int i;
- symrec *ptr;
for (i = 0; arith_fncts[i].fname != 0; i++)
@{
- ptr = putsym (arith_fncts[i].fname, FNCT);
+ symrec *ptr = putsym (arith_fncts[i].fname, FNCT);
ptr->value.fnctptr = arith_fncts[i].fnct;
@}
@}
return yyparse ();
@}
@end group
-@end smallexample
+@end example
By simply editing the initialization list and adding the necessary include
files, you can add additional functions to the calculator.
The function @code{getsym} is passed the name of the symbol to look up. If
found, a pointer to that symbol is returned; otherwise zero is returned.
-@smallexample
+@comment file: mfcalc.y
+@example
+#include <stdlib.h> /* malloc. */
+#include <string.h> /* strlen. */
+
+@group
symrec *
putsym (char const *sym_name, int sym_type)
@{
- symrec *ptr;
- ptr = (symrec *) malloc (sizeof (symrec));
+ symrec *ptr = (symrec *) malloc (sizeof (symrec));
ptr->name = (char *) malloc (strlen (sym_name) + 1);
strcpy (ptr->name,sym_name);
ptr->type = sym_type;
sym_table = ptr;
return ptr;
@}
+@end group
+@group
symrec *
getsym (char const *sym_name)
@{
return ptr;
return 0;
@}
-@end smallexample
+@end group
+@end example
The function @code{yylex} must now recognize variables, numeric values, and
the single-character arithmetic operators. Strings of alphanumeric
No change is needed in the handling of numeric values and arithmetic
operators in @code{yylex}.
-@smallexample
+@comment file: mfcalc.y
+@example
@group
#include <ctype.h>
@end group
int c;
/* Ignore white space, get first nonwhite character. */
- while ((c = getchar ()) == ' ' || c == '\t');
+ while ((c = getchar ()) == ' ' || c == '\t')
+ continue;
if (c == EOF)
return 0;
/* Char starts an identifier => read the name. */
if (isalpha (c))
@{
- symrec *s;
+ /* Initially make the buffer long enough
+ for a 40-character symbol name. */
+ static size_t length = 40;
static char *symbuf = 0;
- static int length = 0;
+ symrec *s;
int i;
@end group
-@group
- /* Initially make the buffer long enough
- for a 40-character symbol name. */
- if (length == 0)
- length = 40, symbuf = (char *)malloc (length + 1);
+ if (!symbuf)
+ symbuf = (char *) malloc (length + 1);
i = 0;
do
-@end group
@group
@{
/* If buffer is full, make it bigger. */
return c;
@}
@end group
-@end smallexample
+@end example
This program is both powerful and flexible. You may easily add new
functions, and it is a simple job to modify this code to install
can be done with two @var{Prologue} blocks, one before and one after the
@code{%union} declaration.
-@smallexample
+@example
%@{
#define _GNU_SOURCE
#include <stdio.h>
%@}
@dots{}
-@end smallexample
+@end example
When in doubt, it is usually safer to put prologue code before all
Bison declarations, rather than after. For example, any definitions
Look again at the example of the previous section:
-@smallexample
+@example
%@{
#define _GNU_SOURCE
#include <stdio.h>
%@}
@dots{}
-@end smallexample
+@end example
@noindent
Notice that there are two @var{Prologue} sections here, but there's a
Let's go ahead and add the new @code{YYLTYPE} definition and the
@code{trace_token} prototype at the same time:
-@smallexample
+@example
%code top @{
#define _GNU_SOURCE
#include <stdio.h>
@}
@dots{}
-@end smallexample
+@end example
@noindent
In this way, @code{%code top} and the unqualified @code{%code} achieve the same
definitions.
Thus, they belong in one or more @code{%code requires}:
-@smallexample
+@example
+@group
%code top @{
#define _GNU_SOURCE
#include <stdio.h>
@}
+@end group
+@group
%code requires @{
#include "ptypes.h"
@}
+@end group
+@group
%union @{
long int n;
tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
@}
+@end group
+@group
%code requires @{
#define YYLTYPE YYLTYPE
typedef struct YYLTYPE
char *filename;
@} YYLTYPE;
@}
+@end group
+@group
%code @{
static void print_token_value (FILE *, int, YYSTYPE);
#define YYPRINT(F, N, L) print_token_value (F, N, L)
static void trace_token (enum yytokentype token, YYLTYPE loc);
@}
+@end group
@dots{}
-@end smallexample
+@end example
@noindent
Now Bison will insert @code{#include "ptypes.h"} and the new
sufficient. Instead, move its prototype from the unqualified
@code{%code} to a @code{%code provides}:
-@smallexample
+@example
+@group
%code top @{
#define _GNU_SOURCE
#include <stdio.h>
@}
+@end group
+@group
%code requires @{
#include "ptypes.h"
@}
+@end group
+@group
%union @{
long int n;
tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
@}
+@end group
+@group
%code requires @{
#define YYLTYPE YYLTYPE
typedef struct YYLTYPE
char *filename;
@} YYLTYPE;
@}
+@end group
+@group
%code provides @{
void trace_token (enum yytokentype token, YYLTYPE loc);
@}
+@end group
+@group
%code @{
static void print_token_value (FILE *, int, YYSTYPE);
#define YYPRINT(F, N, L) print_token_value (F, N, L)
@}
+@end group
@dots{}
-@end smallexample
+@end example
@noindent
Bison will insert the @code{trace_token} prototype into both the
For example, you may organize semantic-type-related directives by semantic
type:
-@smallexample
+@example
+@group
%code requires @{ #include "type1.h" @}
%union @{ type1 field1; @}
%destructor @{ type1_free ($$); @} <field1>
%printer @{ type1_print ($$); @} <field1>
+@end group
+@group
%code requires @{ #include "type2.h" @}
%union @{ type2 field2; @}
%destructor @{ type2_free ($$); @} <field2>
%printer @{ type2_print ($$); @} <field2>
-@end smallexample
+@end group
+@end example
@noindent
You could even place each of the above directive groups in the rules section of
@example
@group
-@var{result}: @var{components}@dots{}
- ;
+@var{result}: @var{components}@dots{};
@end group
@end example
@example
@group
-exp: exp '+' exp
- ;
+exp: exp '+' exp;
@end group
@end example
@example
@group
-@var{result}: @var{rule1-components}@dots{}
- | @var{rule2-components}@dots{}
- @dots{}
- ;
+@var{result}:
+ @var{rule1-components}@dots{}
+| @var{rule2-components}@dots{}
+@dots{}
+;
@end group
@end example
@example
@group
-expseq: /* empty */
- | expseq1
- ;
+expseq:
+ /* empty */
+| expseq1
+;
@end group
@group
-expseq1: exp
- | expseq1 ',' exp
- ;
+expseq1:
+ exp
+| expseq1 ',' exp
+;
@end group
@end example
@example
@group
-expseq1: exp
- | expseq1 ',' exp
- ;
+expseq1:
+ exp
+| expseq1 ',' exp
+;
@end group
@end example
@example
@group
-expseq1: exp
- | exp ',' expseq1
- ;
+expseq1:
+ exp
+| exp ',' expseq1
+;
@end group
@end example
@example
@group
-expr: primary
- | primary '+' primary
- ;
+expr:
+ primary
+| primary '+' primary
+;
@end group
@group
-primary: constant
- | '(' expr ')'
- ;
+primary:
+ constant
+| '(' expr ')'
+;
@end group
@end example
@example
@group
-exp: @dots{}
- | exp '+' exp
- @{ $$ = $1 + $3; @}
+exp:
+@dots{}
+| exp '+' exp @{ $$ = $1 + $3; @}
@end group
@end example
@example
@group
-exp[result]: @dots{}
- | exp[left] '+' exp[right]
- @{ $result = $left + $right; @}
+exp[result]:
+@dots{}
+| exp[left] '+' exp[right] @{ $result = $left + $right; @}
@end group
@end example
@example
@group
-foo: expr bar '+' expr @{ @dots{} @}
- | expr bar '-' expr @{ @dots{} @}
- ;
+foo:
+ expr bar '+' expr @{ @dots{} @}
+| expr bar '-' expr @{ @dots{} @}
+;
@end group
@group
-bar: /* empty */
- @{ previous_expr = $0; @}
- ;
+bar:
+ /* empty */ @{ previous_expr = $0; @}
+;
@end group
@end example
@example
@group
-exp: @dots{}
- | exp '+' exp
- @{ $$ = $1 + $3; @}
+exp:
+ @dots{}
+| exp '+' exp @{ $$ = $1 + $3; @}
@end group
@end example
@example
@group
-stmt: LET '(' var ')'
- @{ $<context>$ = push_context ();
- declare_variable ($3); @}
- stmt @{ $$ = $6;
- pop_context ($<context>5); @}
+stmt:
+ LET '(' var ')'
+ @{ $<context>$ = push_context (); declare_variable ($3); @}
+ stmt
+ @{ $$ = $6; pop_context ($<context>5); @}
@end group
@end example
%%
-stmt: let stmt
- @{ $$ = $2;
- pop_context ($1); @}
- ;
+stmt:
+ let stmt
+ @{
+ $$ = $2;
+ pop_context ($1);
+ @};
-let: LET '(' var ')'
- @{ $$ = push_context ();
- declare_variable ($3); @}
- ;
+let:
+ LET '(' var ')'
+ @{
+ $$ = push_context ();
+ declare_variable ($3);
+ @};
@end group
@end example
@example
@group
-compound: '@{' declarations statements '@}'
- | '@{' statements '@}'
- ;
+compound:
+ '@{' declarations statements '@}'
+| '@{' statements '@}'
+;
@end group
@end example
@example
@group
-compound: @{ prepare_for_local_variables (); @}
- '@{' declarations statements '@}'
+compound:
+ @{ prepare_for_local_variables (); @}
+ '@{' declarations statements '@}'
@end group
@group
- | '@{' statements '@}'
- ;
+| '@{' statements '@}'
+;
@end group
@end example
@example
@group
-compound: @{ prepare_for_local_variables (); @}
- '@{' declarations statements '@}'
- | @{ prepare_for_local_variables (); @}
- '@{' statements '@}'
- ;
+compound:
+ @{ prepare_for_local_variables (); @}
+ '@{' declarations statements '@}'
+| @{ prepare_for_local_variables (); @}
+ '@{' statements '@}'
+;
@end group
@end example
@example
@group
-compound: '@{' @{ prepare_for_local_variables (); @}
- declarations statements '@}'
- | '@{' statements '@}'
- ;
+compound:
+ '@{' @{ prepare_for_local_variables (); @}
+ declarations statements '@}'
+| '@{' statements '@}'
+;
@end group
@end example
@example
@group
-subroutine: /* empty */
- @{ prepare_for_local_variables (); @}
- ;
-
+subroutine:
+ /* empty */ @{ prepare_for_local_variables (); @}
+;
@end group
@group
-compound: subroutine
- '@{' declarations statements '@}'
- | subroutine
- '@{' statements '@}'
- ;
+compound:
+ subroutine '@{' declarations statements '@}'
+| subroutine '@{' statements '@}'
+;
@end group
@end example
@example
@group
-exp: @dots{}
- | exp '/' exp
- @{
- @@$.first_column = @@1.first_column;
- @@$.first_line = @@1.first_line;
- @@$.last_column = @@3.last_column;
- @@$.last_line = @@3.last_line;
- if ($3)
- $$ = $1 / $3;
- else
- @{
- $$ = 1;
- fprintf (stderr,
- "Division by zero, l%d,c%d-l%d,c%d",
- @@3.first_line, @@3.first_column,
- @@3.last_line, @@3.last_column);
- @}
- @}
+exp:
+ @dots{}
+| exp '/' exp
+ @{
+ @@$.first_column = @@1.first_column;
+ @@$.first_line = @@1.first_line;
+ @@$.last_column = @@3.last_column;
+ @@$.last_line = @@3.last_line;
+ if ($3)
+ $$ = $1 / $3;
+ else
+ @{
+ $$ = 1;
+ fprintf (stderr,
+ "Division by zero, l%d,c%d-l%d,c%d",
+ @@3.first_line, @@3.first_column,
+ @@3.last_line, @@3.last_column);
+ @}
+ @}
@end group
@end example
@example
@group
-exp: @dots{}
- | exp '/' exp
- @{
- if ($3)
- $$ = $1 / $3;
- else
- @{
- $$ = 1;
- fprintf (stderr,
- "Division by zero, l%d,c%d-l%d,c%d",
- @@3.first_line, @@3.first_column,
- @@3.last_line, @@3.last_column);
- @}
- @}
+exp:
+ @dots{}
+| exp '/' exp
+ @{
+ if ($3)
+ $$ = $1 / $3;
+ else
+ @{
+ $$ = 1;
+ fprintf (stderr,
+ "Division by zero, l%d,c%d-l%d,c%d",
+ @@3.first_line, @@3.first_column,
+ @@3.last_line, @@3.last_column);
+ @}
+ @}
@end group
@end example
By default, @code{YYLLOC_DEFAULT} is defined this way:
-@smallexample
+@example
@group
-# define YYLLOC_DEFAULT(Current, Rhs, N) \
- do \
- if (N) \
- @{ \
- (Current).first_line = YYRHSLOC(Rhs, 1).first_line; \
- (Current).first_column = YYRHSLOC(Rhs, 1).first_column; \
- (Current).last_line = YYRHSLOC(Rhs, N).last_line; \
- (Current).last_column = YYRHSLOC(Rhs, N).last_column; \
- @} \
- else \
- @{ \
- (Current).first_line = (Current).last_line = \
- YYRHSLOC(Rhs, 0).last_line; \
- (Current).first_column = (Current).last_column = \
- YYRHSLOC(Rhs, 0).last_column; \
- @} \
- while (0)
+# define YYLLOC_DEFAULT(Cur, Rhs, N) \
+do \
+ if (N) \
+ @{ \
+ (Cur).first_line = YYRHSLOC(Rhs, 1).first_line; \
+ (Cur).first_column = YYRHSLOC(Rhs, 1).first_column; \
+ (Cur).last_line = YYRHSLOC(Rhs, N).last_line; \
+ (Cur).last_column = YYRHSLOC(Rhs, N).last_column; \
+ @} \
+ else \
+ @{ \
+ (Cur).first_line = (Cur).last_line = \
+ YYRHSLOC(Rhs, 0).last_line; \
+ (Cur).first_column = (Cur).last_column = \
+ YYRHSLOC(Rhs, 0).last_column; \
+ @} \
+while (0)
@end group
-@end smallexample
+@end example
+@noindent
where @code{YYRHSLOC (rhs, k)} is the location of the @var{k}th symbol
in @var{rhs} when @var{k} is positive, and the location of the symbol
just before the reduction when @var{k} and @var{n} are both zero.
@noindent
For example:
-@smallexample
+@example
%union @{ char *string; @}
%token <string> STRING1
%token <string> STRING2
%destructor @{ free ($$); @} <*>
%destructor @{ free ($$); printf ("%d", @@$.first_line); @} STRING1 string1
%destructor @{ printf ("Discarding tagless symbol.\n"); @} <>
-@end smallexample
+@end example
@noindent
guarantees that, when the parser discards any user-defined symbol that has a
However, it may invoke one of them for the end token (token 0) if you
redefine it from @code{$end} to, for example, @code{END}:
-@smallexample
+@example
%token END 0
-@end smallexample
+@end example
@cindex actions in mid-rule
@cindex mid-rule actions
Some of the accepted @var{variable}s are:
@itemize @bullet
+@c ================================================== api.pure
@item api.pure
@findex %define api.pure
occasionally it is necessary to insert code much nearer the top of the
parser implementation file. For example:
-@smallexample
+@example
%code top @{
#define _GNU_SOURCE
#include <stdio.h>
@}
-@end smallexample
+@end example
@item Location(s): Near the top of the parser implementation file.
@end itemize
@code{token_buffer}, and assuming that the token does not contain any
characters like @samp{"} that require escaping.
-@smallexample
+@example
for (i = 0; i < YYNTOKENS; i++)
@{
if (yytname[i] != 0
&& yytname[i][strlen (token_buffer) + 2] == 0)
break;
@}
-@end smallexample
+@end example
The @code{yytname} table is generated only if you use the
@code{%token-table} declaration. @xref{Decl Summary}.
@example
@group
-expr: term '+' expr
- | term
- ;
+expr:
+ term '+' expr
+| term
+;
@end group
@group
-term: '(' expr ')'
- | term '!'
- | NUMBER
- ;
+term:
+ '(' expr ')'
+| term '!'
+| NUMBER
+;
@end group
@end example
@example
@group
if_stmt:
- IF expr THEN stmt
- | IF expr THEN stmt ELSE stmt
- ;
+ IF expr THEN stmt
+| IF expr THEN stmt ELSE stmt
+;
@end group
@end example
%%
@end group
@group
-stmt: expr
- | if_stmt
- ;
+stmt:
+ expr
+| if_stmt
+;
@end group
@group
if_stmt:
- IF expr THEN stmt
- | IF expr THEN stmt ELSE stmt
- ;
+ IF expr THEN stmt
+| IF expr THEN stmt ELSE stmt
+;
@end group
-expr: variable
- ;
+expr:
+ variable
+;
@end example
@node Precedence
@example
@group
-expr: expr '-' expr
- | expr '*' expr
- | expr '<' expr
- | '(' expr ')'
- @dots{}
- ;
+expr:
+ expr '-' expr
+| expr '*' expr
+| expr '<' expr
+| '(' expr ')'
+@dots{}
+;
@end group
@end example
@example
@group
-exp: @dots{}
- | exp '-' exp
- @dots{}
- | '-' exp %prec UMINUS
+exp:
+ @dots{}
+| exp '-' exp
+ @dots{}
+| '-' exp %prec UMINUS
@end group
@end example
of zero or more @code{word} groupings.
@example
-sequence: /* empty */
- @{ printf ("empty sequence\n"); @}
- | maybeword
- | sequence word
- @{ printf ("added word %s\n", $2); @}
- ;
+@group
+sequence:
+ /* empty */ @{ printf ("empty sequence\n"); @}
+| maybeword
+| sequence word @{ printf ("added word %s\n", $2); @}
+;
+@end group
-maybeword: /* empty */
- @{ printf ("empty maybeword\n"); @}
- | word
- @{ printf ("single word %s\n", $1); @}
- ;
+@group
+maybeword:
+ /* empty */ @{ printf ("empty maybeword\n"); @}
+| word @{ printf ("single word %s\n", $1); @}
+;
+@end group
@end example
@noindent
proper way to define @code{sequence}:
@example
-sequence: /* empty */
- @{ printf ("empty sequence\n"); @}
- | sequence word
- @{ printf ("added word %s\n", $2); @}
- ;
+sequence:
+ /* empty */ @{ printf ("empty sequence\n"); @}
+| sequence word @{ printf ("added word %s\n", $2); @}
+;
@end example
Here is another common error that yields a reduce/reduce conflict:
@example
-sequence: /* empty */
- | sequence words
- | sequence redirects
- ;
+sequence:
+ /* empty */
+| sequence words
+| sequence redirects
+;
-words: /* empty */
- | words word
- ;
+words:
+ /* empty */
+| words word
+;
-redirects:/* empty */
- | redirects redirect
- ;
+redirects:
+ /* empty */
+| redirects redirect
+;
@end example
@noindent
of sequence:
@example
-sequence: /* empty */
- | sequence word
- | sequence redirect
- ;
+sequence:
+ /* empty */
+| sequence word
+| sequence redirect
+;
@end example
Second, to prevent either a @code{words} or a @code{redirects}
from being empty:
@example
-sequence: /* empty */
- | sequence words
- | sequence redirects
- ;
+@group
+sequence:
+ /* empty */
+| sequence words
+| sequence redirects
+;
+@end group
-words: word
- | words word
- ;
+@group
+words:
+ word
+| words word
+;
+@end group
-redirects:redirect
- | redirects redirect
- ;
+@group
+redirects:
+ redirect
+| redirects redirect
+;
+@end group
@end example
@node Mysterious Conflicts
%token ID
%%
-def: param_spec return_spec ','
- ;
+def: param_spec return_spec ',';
param_spec:
- type
- | name_list ':' type
- ;
+ type
+| name_list ':' type
+;
@end group
@group
return_spec:
- type
- | name ':' type
- ;
+ type
+| name ':' type
+;
@end group
@group
-type: ID
- ;
+type: ID;
@end group
@group
-name: ID
- ;
+name: ID;
name_list:
- name
- | name ',' name_list
- ;
+ name
+| name ',' name_list
+;
@end group
@end example
%%
@dots{}
return_spec:
- type
- | name ':' type
- /* This rule is never used. */
- | ID BOGUS
- ;
+ type
+| name ':' type
+| ID BOGUS /* This rule is never used. */
+;
@end group
@end example
@example
param_spec:
- type
- | name_list ':' type
- ;
+ type
+| name_list ':' type
+;
return_spec:
- type
- | ID ':' type
- ;
+ type
+| ID ':' type
+;
@end example
For a more detailed exposition of LALR(1) parsers and parser
parse. Finally, the base of the temporary stack used during an exploratory
parse is a pointer into the normal parser state stack so that the stack is
never physically copied. In our experience, the performance penalty of LAC
-has proven insignificant for practical grammars.
+has proved insignificant for practical grammars.
@end itemize
While the LAC algorithm shares techniques that have been recognized in the
For example:
@example
-stmnts: /* empty string */
- | stmnts '\n'
- | stmnts exp '\n'
- | stmnts error '\n'
+stmts:
+ /* empty string */
+| stmts '\n'
+| stmts exp '\n'
+| stmts error '\n'
@end example
The fourth rule in this example says that an error followed by a newline
-makes a valid addition to any @code{stmnts}.
+makes a valid addition to any @code{stmts}.
What happens if a syntax error occurs in the middle of an @code{exp}? The
error recovery rule, interpreted strictly, applies to the precise sequence
-of a @code{stmnts}, an @code{error} and a newline. If an error occurs in
+of a @code{stmts}, an @code{error} and a newline. If an error occurs in
the middle of an @code{exp}, there will probably be some additional tokens
-and subexpressions on the stack after the last @code{stmnts}, and there
+and subexpressions on the stack after the last @code{stmts}, and there
will be tokens to read before the next newline. So the rule is not
applicable in the ordinary way.
the semantic context and part of the input. First it discards states
and objects from the stack until it gets back to a state in which the
@code{error} token is acceptable. (This means that the subexpressions
-already parsed are discarded, back to the last complete @code{stmnts}.)
+already parsed are discarded, back to the last complete @code{stmts}.)
At this point the @code{error} token can be shifted. Then, if the old
lookahead token is not acceptable to be shifted next, the parser reads
tokens and discards them until it finds a token which is acceptable. In
the current input line or current statement if an error is detected:
@example
-stmnt: error ';' /* On error, skip until ';' is read. */
+stmt: error ';' /* On error, skip until ';' is read. */
@end example
It is also useful to recover to the matching close-delimiter of an
spurious error message:
@example
-primary: '(' expr ')'
- | '(' error ')'
- @dots{}
- ;
+primary:
+ '(' expr ')'
+| '(' error ')'
+@dots{}
+;
@end example
Error recovery strategies are necessarily guesses. When they guess wrong,
one syntax error often leads to another. In the above example, the error
recovery rule guesses that an error is due to bad input within one
-@code{stmnt}. Suppose that instead a spurious semicolon is inserted in the
-middle of a valid @code{stmnt}. After the error recovery rule recovers
+@code{stmt}. Suppose that instead a spurious semicolon is inserted in the
+middle of a valid @code{stmt}. After the error recovery rule recovers
from the first error, another syntax error will be found straightaway,
since the text following the spurious semicolon is also an invalid
-@code{stmnt}.
+@code{stmt}.
To prevent an outpouring of error messages, the parser will output no error
message for another syntax error that happens shortly after the first; only
@example
typedef int foo, bar;
int baz (void)
+@group
@{
static bar (bar); /* @r{redeclare @code{bar} as static variable} */
extern foo foo (foo); /* @r{redeclare @code{foo} as function} */
return foo (bar);
@}
+@end group
@end example
Unfortunately, the name being declared is separated from the declaration
duplication, with actions omitted for brevity:
@example
+@group
initdcl:
- declarator maybeasm '='
- init
- | declarator maybeasm
- ;
+ declarator maybeasm '=' init
+| declarator maybeasm
+;
+@end group
+@group
notype_initdcl:
- notype_declarator maybeasm '='
- init
- | notype_declarator maybeasm
- ;
+ notype_declarator maybeasm '=' init
+| notype_declarator maybeasm
+;
+@end group
@end example
@noindent
@dots{}
@end group
@group
-expr: IDENTIFIER
- | constant
- | HEX '('
- @{ hexflag = 1; @}
- expr ')'
- @{ hexflag = 0;
- $$ = $4; @}
- | expr '+' expr
- @{ $$ = make_sum ($1, $3); @}
- @dots{}
- ;
+expr:
+ IDENTIFIER
+| constant
+| HEX '(' @{ hexflag = 1; @}
+ expr ')' @{ hexflag = 0; $$ = $4; @}
+| expr '+' expr @{ $$ = make_sum ($1, $3); @}
+@dots{}
+;
@end group
@group
constant:
- INTEGER
- | STRING
- ;
+ INTEGER
+| STRING
+;
@end group
@end example
tokens until the next semicolon, and then start a new statement, like this:
@example
-stmt: expr ';'
- | IF '(' expr ')' stmt @{ @dots{} @}
- @dots{}
- error ';'
- @{ hexflag = 0; @}
- ;
+stmt:
+ expr ';'
+| IF '(' expr ')' stmt @{ @dots{} @}
+@dots{}
+| error ';' @{ hexflag = 0; @}
+;
@end example
If there is a syntax error in the middle of a @samp{hex (@var{expr})}
@example
@group
-expr: @dots{}
- | '(' expr ')'
- @{ $$ = $2; @}
- | '(' error ')'
- @dots{}
+expr:
+ @dots{}
+| '(' expr ')' @{ $$ = $2; @}
+| '(' error ')'
+@dots{}
@end group
@end example
%left '+' '-'
%left '*'
%%
-exp: exp '+' exp
- | exp '-' exp
- | exp '*' exp
- | exp '/' exp
- | NUM
- ;
+exp:
+ exp '+' exp
+| exp '-' exp
+| exp '*' exp
+| exp '/' exp
+| NUM
+;
useless: STR;
%%
@end example
order of the output and the exact presentation might vary, but the
interpretation is the same.
-The first section includes details on conflicts that were solved thanks
-to precedence and/or associativity:
-
-@example
-Conflict in state 8 between rule 2 and token '+' resolved as reduce.
-Conflict in state 8 between rule 2 and token '-' resolved as reduce.
-Conflict in state 8 between rule 2 and token '*' resolved as shift.
-@exdent @dots{}
-@end example
-
-@noindent
-The next section lists states that still have conflicts.
-
-@example
-State 8 conflicts: 1 shift/reduce
-State 9 conflicts: 1 shift/reduce
-State 10 conflicts: 1 shift/reduce
-State 11 conflicts: 4 shift/reduce
-@end example
-
@noindent
@cindex token, useless
@cindex useless token
@cindex useless nonterminal
@cindex rule, useless
@cindex useless rule
-The next section reports useless tokens, nonterminal and rules. Useless
-nonterminals and rules are removed in order to produce a smaller parser,
-but useless tokens are preserved, since they might be used by the
-scanner (note the difference between ``useless'' and ``unused''
-below):
+The first section reports useless tokens, nonterminals and rules. Useless
+nonterminals and rules are removed in order to produce a smaller parser, but
+useless tokens are preserved, since they might be used by the scanner (note
+the difference between ``useless'' and ``unused'' below):
@example
-Nonterminals useless in grammar:
+Nonterminals useless in grammar
useless
-Terminals unused in grammar:
+Terminals unused in grammar
STR
-Rules useless in grammar:
-#6 useless: STR;
+Rules useless in grammar
+ 6 useless: STR
@end example
@noindent
-The next section reproduces the exact grammar that Bison used:
+The next section lists states that still have conflicts.
+
+@example
+State 8 conflicts: 1 shift/reduce
+State 9 conflicts: 1 shift/reduce
+State 10 conflicts: 1 shift/reduce
+State 11 conflicts: 4 shift/reduce
+@end example
+
+@noindent
+Then Bison reproduces the exact grammar it used:
@example
Grammar
- Number, Line, Rule
- 0 5 $accept -> exp $end
- 1 5 exp -> exp '+' exp
- 2 6 exp -> exp '-' exp
- 3 7 exp -> exp '*' exp
- 4 8 exp -> exp '/' exp
- 5 9 exp -> NUM
+ 0 $accept: exp $end
+
+ 1 exp: exp '+' exp
+ 2 | exp '-' exp
+ 3 | exp '*' exp
+ 4 | exp '/' exp
+ 5 | NUM
@end example
@noindent
and reports the uses of the symbols:
@example
+@group
Terminals, with rules where they appear
$end (0) 0
'/' (47) 4
error (256)
NUM (258) 5
+STR (259)
+@end group
+@group
Nonterminals, with rules where they appear
-$accept (8)
+$accept (9)
on left: 0
-exp (9)
+exp (10)
on left: 1 2 3 4 5, on right: 0 1 2 3 4
+@end group
@end example
@noindent
@cindex pointed rule
@cindex rule, pointed
Bison then proceeds onto the automaton itself, describing each state
-with it set of @dfn{items}, also known as @dfn{pointed rules}. Each
-item is a production rule together with a point (marked by @samp{.})
-that the input cursor.
+with its set of @dfn{items}, also known as @dfn{pointed rules}. Each
+item is a production rule together with a point (@samp{.}) marking
+the location of the input cursor.
@example
state 0
- $accept -> . exp $ (rule 0)
+ 0 $accept: . exp $end
- NUM shift, and go to state 1
+ NUM shift, and go to state 1
- exp go to state 2
+ exp go to state 2
@end example
This reads as follows: ``state 0 corresponds to being at the very
symbol (here, @code{exp}). When the parser returns to this state right
after having reduced a rule that produced an @code{exp}, the control
flow jumps to state 2. If there is no such transition on a nonterminal
-symbol, and the lookahead is a @code{NUM}, then this token is shifted on
+symbol, and the lookahead is a @code{NUM}, then this token is shifted onto
the parse stack, and the control flow jumps to state 1. Any other
lookahead triggers a syntax error.''
at the beginning of any rule deriving an @code{exp}. By default Bison
reports the so-called @dfn{core} or @dfn{kernel} of the item set, but if
you want to see more detail you can invoke @command{bison} with
-@option{--report=itemset} to list all the items, include those that can
-be derived:
+@option{--report=itemset} to list the derived items as well:
@example
state 0
- $accept -> . exp $ (rule 0)
- exp -> . exp '+' exp (rule 1)
- exp -> . exp '-' exp (rule 2)
- exp -> . exp '*' exp (rule 3)
- exp -> . exp '/' exp (rule 4)
- exp -> . NUM (rule 5)
+ 0 $accept: . exp $end
+ 1 exp: . exp '+' exp
+ 2 | . exp '-' exp
+ 3 | . exp '*' exp
+ 4 | . exp '/' exp
+ 5 | . NUM
- NUM shift, and go to state 1
+ NUM shift, and go to state 1
- exp go to state 2
+ exp go to state 2
@end example
@noindent
-In the state 1...
+In the state 1@dots{}
@example
state 1
- exp -> NUM . (rule 5)
+ 5 exp: NUM .
- $default reduce using rule 5 (exp)
+ $default reduce using rule 5 (exp)
@end example
@noindent
@example
state 2
- $accept -> exp . $ (rule 0)
- exp -> exp . '+' exp (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 0 $accept: exp . $end
+ 1 exp: exp . '+' exp
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
- $ shift, and go to state 3
- '+' shift, and go to state 4
- '-' shift, and go to state 5
- '*' shift, and go to state 6
- '/' shift, and go to state 7
+ $end shift, and go to state 3
+ '+' shift, and go to state 4
+ '-' shift, and go to state 5
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
@end example
@noindent
In state 2, the automaton can only shift a symbol. For instance,
-because of the item @samp{exp -> exp . '+' exp}, if the lookahead if
-@samp{+}, it will be shifted on the parse stack, and the automaton
-control will jump to state 4, corresponding to the item @samp{exp -> exp
-'+' . exp}. Since there is no default action, any other token than
-those listed above will trigger a syntax error.
+because of the item @samp{exp: exp . '+' exp}, if the lookahead is
+@samp{+} it is shifted onto the parse stack, and the automaton
+jumps to state 4, corresponding to the item @samp{exp: exp '+' . exp}.
+Since there is no default action, any lookahead not listed triggers a syntax
+error.
@cindex accepting state
The state 3 is named the @dfn{final state}, or the @dfn{accepting
@example
state 3
- $accept -> exp $ . (rule 0)
+ 0 $accept: exp $end .
- $default accept
+ $default accept
@end example
@noindent
-the initial rule is completed (the start symbol and the end
-of input were read), the parsing exits successfully.
+the initial rule is completed (the start symbol and the end-of-input were
+read), the parsing exits successfully.
The interpretation of states 4 to 7 is straightforward, and is left to
the reader.
@example
state 4
- exp -> exp '+' . exp (rule 1)
+ 1 exp: exp '+' . exp
- NUM shift, and go to state 1
+ NUM shift, and go to state 1
+
+ exp go to state 8
- exp go to state 8
state 5
- exp -> exp '-' . exp (rule 2)
+ 2 exp: exp '-' . exp
+
+ NUM shift, and go to state 1
- NUM shift, and go to state 1
+ exp go to state 9
- exp go to state 9
state 6
- exp -> exp '*' . exp (rule 3)
+ 3 exp: exp '*' . exp
+
+ NUM shift, and go to state 1
- NUM shift, and go to state 1
+ exp go to state 10
- exp go to state 10
state 7
- exp -> exp '/' . exp (rule 4)
+ 4 exp: exp '/' . exp
- NUM shift, and go to state 1
+ NUM shift, and go to state 1
- exp go to state 11
+ exp go to state 11
@end example
As was announced in beginning of the report, @samp{State 8 conflicts:
@example
state 8
- exp -> exp . '+' exp (rule 1)
- exp -> exp '+' exp . (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 1 exp: exp . '+' exp
+ 1 | exp '+' exp .
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
- '*' shift, and go to state 6
- '/' shift, and go to state 7
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
- '/' [reduce using rule 1 (exp)]
- $default reduce using rule 1 (exp)
+ '/' [reduce using rule 1 (exp)]
+ $default reduce using rule 1 (exp)
@end example
Indeed, there are two actions associated to the lookahead @samp{/}:
Because in deterministic parsing a single decision can be made, Bison
arbitrarily chose to disable the reduction, see @ref{Shift/Reduce, ,
-Shift/Reduce Conflicts}. Discarded actions are reported in between
+Shift/Reduce Conflicts}. Discarded actions are reported between
square brackets.
Note that all the previous states had a single possible action: either
@example
state 8
- exp -> exp . '+' exp (rule 1)
- exp -> exp '+' exp . [$, '+', '-', '/'] (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 1 exp: exp . '+' exp
+ 1 | exp '+' exp . [$end, '+', '-', '/']
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
- '*' shift, and go to state 6
- '/' shift, and go to state 7
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
- '/' [reduce using rule 1 (exp)]
- $default reduce using rule 1 (exp)
+ '/' [reduce using rule 1 (exp)]
+ $default reduce using rule 1 (exp)
@end example
+Note however that while @samp{NUM + NUM / NUM} is ambiguous (which results in
+the conflicts on @samp{/}), @samp{NUM + NUM * NUM} is not: the conflict was
+solved thanks to associativity and precedence directives. If invoked with
+@option{--report=solved}, Bison includes information about the solved
+conflicts in the report:
+
+@example
+Conflict between rule 1 and token '+' resolved as reduce (%left '+').
+Conflict between rule 1 and token '-' resolved as reduce (%left '-').
+Conflict between rule 1 and token '*' resolved as shift ('+' < '*').
+@end example
+
+
The remaining states are similar:
@example
+@group
state 9
- exp -> exp . '+' exp (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp '-' exp . (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 1 exp: exp . '+' exp
+ 2 | exp . '-' exp
+ 2 | exp '-' exp .
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
- '*' shift, and go to state 6
- '/' shift, and go to state 7
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
- '/' [reduce using rule 2 (exp)]
- $default reduce using rule 2 (exp)
+ '/' [reduce using rule 2 (exp)]
+ $default reduce using rule 2 (exp)
+@end group
+@group
state 10
- exp -> exp . '+' exp (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp '*' exp . (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 1 exp: exp . '+' exp
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 3 | exp '*' exp .
+ 4 | exp . '/' exp
- '/' shift, and go to state 7
+ '/' shift, and go to state 7
- '/' [reduce using rule 3 (exp)]
- $default reduce using rule 3 (exp)
+ '/' [reduce using rule 3 (exp)]
+ $default reduce using rule 3 (exp)
+@end group
+@group
state 11
- exp -> exp . '+' exp (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
- exp -> exp '/' exp . (rule 4)
-
- '+' shift, and go to state 4
- '-' shift, and go to state 5
- '*' shift, and go to state 6
- '/' shift, and go to state 7
-
- '+' [reduce using rule 4 (exp)]
- '-' [reduce using rule 4 (exp)]
- '*' [reduce using rule 4 (exp)]
- '/' [reduce using rule 4 (exp)]
- $default reduce using rule 4 (exp)
+ 1 exp: exp . '+' exp
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
+ 4 | exp '/' exp .
+
+ '+' shift, and go to state 4
+ '-' shift, and go to state 5
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
+
+ '+' [reduce using rule 4 (exp)]
+ '-' [reduce using rule 4 (exp)]
+ '*' [reduce using rule 4 (exp)]
+ '/' [reduce using rule 4 (exp)]
+ $default reduce using rule 4 (exp)
+@end group
@end example
@noindent
Here is an example of @code{YYPRINT} suitable for the multi-function
calculator (@pxref{Mfcalc Declarations, ,Declarations for @code{mfcalc}}):
-@smallexample
+@example
%@{
static void print_token_value (FILE *, int, YYSTYPE);
- #define YYPRINT(file, type, value) print_token_value (file, type, value)
+ #define YYPRINT(file, type, value) \
+ print_token_value (file, type, value)
%@}
@dots{} %% @dots{} %% @dots{}
else if (type == NUM)
fprintf (file, "%d", value.val);
@}
-@end smallexample
+@end example
@c ================================================= Invoking Bison
For example, warn about unset @code{$$} in the mid-rule action in:
@example
- exp: '1' @{ $1 = 1; @} '+' exp @{ $$ = $2 + $4; @};
+exp: '1' @{ $1 = 1; @} '+' exp @{ $$ = $2 + $4; @};
@end example
These warnings are not enabled by default since they sometimes prove to
@end defcv
@defcv {Type} {parser} {token}
-A structure that contains (only) the definition of the tokens as the
-@code{yytokentype} enumeration. To refer to the token @code{FOO}, the
-scanner should use @code{yy::parser::token::FOO}. The scanner can use
+A structure that contains (only) the @code{yytokentype} enumeration, which
+defines the tokens. To refer to the token @code{FOO},
+use @code{yy::parser::token::FOO}. The scanner can use
@samp{typedef yy::parser::token token;} to ``import'' the token enumeration
(@pxref{Calc++ Scanner}).
@end defcv
@comment file: calc++-parser.yy
@example
-%skeleton "lalr1.cc" /* -*- C++ -*- */
+%skeleton "lalr1.cc" /* -*- C++ -*- */
%require "@value{VERSION}"
%defines
%define parser_class_name "calcxx_parser"
%start unit;
unit: assignments exp @{ driver.result = $2; @};
-assignments: assignments assignment @{@}
- | /* Nothing. */ @{@};
+assignments:
+ /* Nothing. */ @{@}
+| assignments assignment @{@};
assignment:
"identifier" ":=" exp
@comment file: calc++-scanner.ll
@example
-%@{ /* -*- C++ -*- */
+%@{ /* -*- C++ -*- */
# include <cstdlib>
# include <cerrno>
# include <climits>
@comment file: calc++-scanner.ll
@example
+@group
%@{
# define YY_USER_ACTION yylloc->columns (yyleng);
%@}
+@end group
%%
%@{
yylloc->step ();
@comment file: calc++-scanner.ll
@example
+@group
void
calcxx_driver::scan_begin ()
@{
yyin = stdin;
else if (!(yyin = fopen (file.c_str (), "r")))
@{
- error (std::string ("cannot open ") + file + ": " + strerror(errno));
+ error ("cannot open " + file + ": " + strerror(errno));
exit (EXIT_FAILURE);
@}
@}
+@end group
+@group
void
calcxx_driver::scan_end ()
@{
fclose (yyin);
@}
+@end group
@end example
@node Calc++ Top Level
#include <iostream>
#include "calc++-driver.hh"
+@group
int
main (int argc, char *argv[])
@{
else if (!driver.parse (*argv))
std::cout << driver.result << std::endl;
@}
+@end group
@end example
@node Java Parsers
@node Memory Exhausted
@section Memory Exhausted
-@display
+@quotation
My parser returns with error with a @samp{memory exhausted}
message. What can I do?
-@end display
+@end quotation
This question is already addressed elsewhere, @xref{Recursion,
,Recursive Rules}.
The following phenomenon has several symptoms, resulting in the
following typical questions:
-@display
+@quotation
I invoke @code{yyparse} several times, and on correct input it works
properly; but when a parse error is found, all the other calls fail
too. How can I reset the error flag of @code{yyparse}?
-@end display
+@end quotation
@noindent
or
-@display
+@quotation
My parser includes support for an @samp{#include}-like feature, in
which case I run @code{yyparse} from @code{yyparse}. This fails
-although I did specify @code{%define api.pure}.
-@end display
+although I did specify @samp{%define api.pure}.
+@end quotation
These problems typically come not from Bison itself, but from
Lex-generated scanners. Because these scanners use large buffers for
demonstration, consider the following source file,
@file{first-line.l}:
-@verbatim
-%{
+@example
+@group
+%@{
#include <stdio.h>
#include <stdlib.h>
-%}
+%@}
+@end group
%%
.*\n ECHO; return 1;
%%
+@group
int
yyparse (char const *file)
-{
+@{
yyin = fopen (file, "r");
if (!yyin)
- {
- perror ("fopen");
- exit (EXIT_FAILURE);
- }
+ @{
+ perror ("fopen");
+ exit (EXIT_FAILURE);
+ @}
+@end group
+@group
/* One token only. */
yylex ();
if (fclose (yyin) != 0)
- {
- perror ("fclose");
- exit (EXIT_FAILURE);
- }
+ @{
+ perror ("fclose");
+ exit (EXIT_FAILURE);
+ @}
return 0;
-}
+@}
+@end group
+@group
int
main (void)
-{
+@{
yyparse ("input");
yyparse ("input");
return 0;
-}
-@end verbatim
+@}
+@end group
+@end example
@noindent
If the file @file{input} contains
-@verbatim
+@example
input:1: Hello,
input:2: World!
-@end verbatim
+@end example
@noindent
then instead of getting the first line twice, you get:
@node Strings are Destroyed
@section Strings are Destroyed
-@display
+@quotation
My parser seems to destroy old strings, or maybe it loses track of
them. Instead of reporting @samp{"foo", "bar"}, it reports
@samp{"bar", "bar"}, or even @samp{"foo\nbar", "bar"}.
-@end display
+@end quotation
This error is probably the single most frequent ``bug report'' sent to
Bison lists, but is only concerned with a misunderstanding of the role
of the scanner. Consider the following Lex code:
-@verbatim
-%{
+@example
+@group
+%@{
#include <stdio.h>
char *yylval = NULL;
-%}
+%@}
+@end group
+@group
%%
.* yylval = yytext; return 1;
\n /* IGNORE */
%%
+@end group
+@group
int
main ()
-{
+@{
/* Similar to using $1, $2 in a Bison action. */
char *fst = (yylex (), yylval);
char *snd = (yylex (), yylval);
printf ("\"%s\", \"%s\"\n", fst, snd);
return 0;
-}
-@end verbatim
+@}
+@end group
+@end example
If you compile and run this code, you get:
@node Implementing Gotos/Loops
@section Implementing Gotos/Loops
-@display
+@quotation
My simple calculator supports variables, assignments, and functions,
but how can I implement gotos, or loops?
-@end display
+@end quotation
Although very pedagogical, the examples included in the document blur
the distinction to make between the parser---whose job is to recover
@node Multiple start-symbols
@section Multiple start-symbols
-@display
+@quotation
I have several closely related grammars, and I would like to share their
implementations. In fact, I could use a single grammar but with
multiple entry points.
-@end display
+@end quotation
Bison does not support multiple start-symbols, but there is a very
simple means to simulate them. If @code{foo} and @code{bar} are the two
@example
%token START_FOO START_BAR;
%start start;
-start: START_FOO foo
- | START_BAR bar;
+start:
+ START_FOO foo
+| START_BAR bar;
@end example
These tokens prevents the introduction of new conflicts. As far as the
@node Secure? Conform?
@section Secure? Conform?
-@display
+@quotation
Is Bison secure? Does it conform to POSIX?
-@end display
+@end quotation
If you're looking for a guarantee or certification, we don't provide it.
However, Bison is intended to be a reliable program that conforms to the
@node I can't build Bison
@section I can't build Bison
-@display
+@quotation
I can't build Bison because @command{make} complains that
@code{msgfmt} is not found.
What should I do?
-@end display
+@end quotation
Like most GNU packages with internationalization support, that feature
is turned on by default. If you have problems building in the @file{po}
@node Where can I find help?
@section Where can I find help?
-@display
+@quotation
I'm having trouble using Bison. Where can I find help?
-@end display
+@end quotation
First, read this fine manual. Beyond that, you can send mail to
@email{help-bison@@gnu.org}. This mailing list is intended to be
@node Bug Reports
@section Bug Reports
-@display
+@quotation
I found a bug. What should I include in the bug report?
-@end display
+@end quotation
Before you send a bug report, make sure you are using the latest
version. Check @url{ftp://ftp.gnu.org/pub/gnu/bison/} or one of its
@node More Languages
@section More Languages
-@display
+@quotation
Will Bison ever have C++ and Java support? How about @var{insert your
favorite language here}?
-@end display
+@end quotation
C++ and Java support is there now, and is documented. We'd love to add other
languages; contributions are welcome.
@node Beta Testing
@section Beta Testing
-@display
+@quotation
What is involved in being a beta tester?
-@end display
+@end quotation
It's not terribly involved. Basically, you would download a test
release, compile it, and use it to build and run a parser or two. After
@node Mailing Lists
@section Mailing Lists
-@display
+@quotation
How do I join the help-bison and bug-bison mailing lists?
-@end display
+@end quotation
See @url{http://lists.gnu.org/}.
@c LocalWords: NUM exp subsubsection kbd Ctrl ctype EOF getchar isdigit nonfree
@c LocalWords: ungetc stdin scanf sc calc ulator ls lm cc NEG prec yyerrok rr
@c LocalWords: longjmp fprintf stderr yylloc YYLTYPE cos ln Stallman Destructor
-@c LocalWords: smallexample symrec val tptr FNCT fnctptr func struct sym enum
+@c LocalWords: symrec val tptr FNCT fnctptr func struct sym enum IEC syntaxes
@c LocalWords: fnct putsym getsym fname arith fncts atan ptr malloc sizeof Lex
@c LocalWords: strlen strcpy fctn strcmp isalpha symbuf realloc isalnum DOTDOT
@c LocalWords: ptypes itype YYPRINT trigraphs yytname expseq vindex dtype Unary
@c LocalWords: strncmp intval tindex lvalp locp llocp typealt YYBACKUP subrange
@c LocalWords: YYEMPTY YYEOF YYRECOVERING yyclearin GE def UMINUS maybeword loc
@c LocalWords: Johnstone Shamsa Sadaf Hussain Tomita TR uref YYMAXDEPTH inline
-@c LocalWords: YYINITDEPTH stmnts ref stmnt initdcl maybeasm notype Lookahead
+@c LocalWords: YYINITDEPTH stmts ref initdcl maybeasm notype Lookahead yyoutput
@c LocalWords: hexflag STR exdent itemset asis DYYDEBUG YYFPRINTF args Autoconf
@c LocalWords: infile ypp yxx outfile itemx tex leaderfill Troubleshouting sqrt
@c LocalWords: hbox hss hfill tt ly yyin fopen fclose ofirst gcc ll lookahead
@c LocalWords: nbar yytext fst snd osplit ntwo strdup AST Troublereporting th
@c LocalWords: YYSTACK DVI fdl printindex IELR nondeterministic nonterminals ps
@c LocalWords: subexpressions declarator nondeferred config libintl postfix LAC
-@c LocalWords: preprocessor nonpositive unary nonnumeric typedef extern rhs
-@c LocalWords: yytokentype destructor multicharacter nonnull EBCDIC
+@c LocalWords: preprocessor nonpositive unary nonnumeric typedef extern rhs sr
+@c LocalWords: yytokentype destructor multicharacter nonnull EBCDIC nterm LR's
@c LocalWords: lvalue nonnegative XNUM CHR chr TAGLESS tagless stdout api TOK
-@c LocalWords: destructors Reentrancy nonreentrant subgrammar nonassociative
+@c LocalWords: destructors Reentrancy nonreentrant subgrammar nonassociative Ph
@c LocalWords: deffnx namespace xml goto lalr ielr runtime lex yacc yyps env
@c LocalWords: yystate variadic Unshift NLS gettext po UTF Automake LOCALEDIR
@c LocalWords: YYENABLE bindtextdomain Makefile DEFS CPPFLAGS DBISON DeRemer
-@c LocalWords: autoreconf Pennello multisets nondeterminism Generalised baz
+@c LocalWords: autoreconf Pennello multisets nondeterminism Generalised baz ACM
@c LocalWords: redeclare automata Dparse localedir datadir XSLT midrule Wno
-@c LocalWords: Graphviz multitable headitem hh basename Doxygen fno
+@c LocalWords: Graphviz multitable headitem hh basename Doxygen fno filename
@c LocalWords: doxygen ival sval deftypemethod deallocate pos deftypemethodx
@c LocalWords: Ctor defcv defcvx arg accessors arithmetics CPP ifndef CALCXX
@c LocalWords: lexer's calcxx bool LPAREN RPAREN deallocation cerrno climits
@c LocalWords: cstdlib Debian undef yywrap unput noyywrap nounput zA yyleng
-@c LocalWords: errno strtol ERANGE str strerror iostream argc argv Javadoc
+@c LocalWords: errno strtol ERANGE str strerror iostream argc argv Javadoc PSLR
@c LocalWords: bytecode initializers superclass stype ASTNode autoboxing nls
@c LocalWords: toString deftypeivar deftypeivarx deftypeop YYParser strictfp
@c LocalWords: superclasses boolean getErrorVerbose setErrorVerbose deftypecv
@c LocalWords: getDebugStream setDebugStream getDebugLevel setDebugLevel url
@c LocalWords: bisonVersion deftypecvx bisonSkeleton getStartPos getEndPos
-@c LocalWords: getLVal defvar deftypefn deftypefnx gotos msgfmt Corbett
-@c LocalWords: subdirectory Solaris nonassociativity
+@c LocalWords: getLVal defvar deftypefn deftypefnx gotos msgfmt Corbett LALR's
+@c LocalWords: subdirectory Solaris nonassociativity perror schemas Malloy
+@c LocalWords: Scannerless ispell american
@c Local Variables:
@c ispell-dictionary: "american"