This manual (@value{UPDATED}) is for GNU Bison (version
@value{VERSION}), the GNU parser generator.
-Copyright @copyright{} 1988-1993, 1995, 1998-2012 Free Software
+Copyright @copyright{} 1988-1993, 1995, 1998-2013 Free Software
Foundation, Inc.
@quotation
* Grammar Outline:: Overall layout of the grammar file.
* Symbols:: Terminal and nonterminal symbols.
* Rules:: How to write grammar rules.
-* Recursion:: Writing recursive rules.
* Semantics:: Semantic values and actions.
* Tracking Locations:: Locations and actions.
* Named References:: Using named references in actions.
* Grammar Rules:: Syntax and usage of the grammar rules section.
* Epilogue:: Syntax and usage of the epilogue.
+Grammar Rules
+
+* Rules Syntax:: Syntax of the rules.
+* Empty Rules:: Symbols that can match the empty string.
+* Recursion:: Writing recursive rules.
+
+
Defining Language Semantics
* Value Type:: Specifying one data type for all semantic values.
* Multiple Types:: Specifying several alternative data types.
+* Type Generation:: Generating the semantic value type.
+* Union Decl:: Declaring the set of all semantic value types.
+* Structured Value Type:: Providing a structured semantic value type.
* Actions:: An action is the semantic definition of a grammar rule.
* Action Types:: Specifying data types for actions to operate on.
* Mid-Rule Actions:: Most actions go at the end of a rule.
This says when, why and how to use the exceptional
action in the middle of a rule.
+Actions in Mid-Rule
+
+* Using Mid-Rule Actions:: Putting an action in the middle of a rule.
+* Mid-Rule Action Translation:: How mid-rule actions are actually processed.
+* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts.
+
Tracking Locations
* Location Type:: Specifying a data type for locations.
* Require Decl:: Requiring a Bison version.
* Token Decl:: Declaring terminal symbols.
* Precedence Decl:: Declaring terminals with precedence and associativity.
-* Union Decl:: Declaring the set of all semantic value types.
* Type Decl:: Declaring the choice of type for a nonterminal symbol.
* Initial Action Decl:: Code run before parsing starts.
* Destructor Decl:: Declaring how symbols are freed.
* Java Parser Interface:: Instantiating and running the parser
* Java Scanner Interface:: Specifying the scanner for the parser
* Java Action Features:: Special features for use in actions
+* Java Push Parser Interface:: Instantiating and running a push parser
* Java Differences:: Differences between C/C++ and Java Grammars
* Java Declarations Summary:: List of Bison declarations used with Java
%%
prog:
- /* Nothing. */
+ %empty
| prog stmt @{ printf ("\n"); @}
;
@example
/* Reverse polish notation calculator. */
+@group
%@{
- #define YYSTYPE double
#include <stdio.h>
#include <math.h>
int yylex (void);
void yyerror (char const *);
%@}
+@end group
+%define api.value.type @{double@}
%token NUM
%% /* Grammar rules and actions follow. */
The declarations section (@pxref{Prologue, , The prologue}) contains two
preprocessor directives and two forward declarations.
-The @code{#define} directive defines the macro @code{YYSTYPE}, thus
-specifying the C data type for semantic values of both tokens and
-groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The
-Bison parser will use whatever type @code{YYSTYPE} is defined as; if you
-don't define it, @code{int} is the default. Because we specify
-@code{double}, each token and each expression has an associated value,
-which is a floating point number.
-
The @code{#include} directive is used to declare the exponentiation
function @code{pow}.
epilogue, but the parser calls them so they must be declared in the
prologue.
-The second section, Bison declarations, provides information to Bison
-about the token types (@pxref{Bison Declarations, ,The Bison
-Declarations Section}). Each terminal symbol that is not a
-single-character literal must be declared here. (Single-character
-literals normally don't need to be declared.) In this example, all the
-arithmetic operators are designated by single-character literals, so the
-only terminal symbol that needs to be declared is @code{NUM}, the token
-type for numeric constants.
+The second section, Bison declarations, provides information to Bison about
+the tokens and their types (@pxref{Bison Declarations, ,The Bison
+Declarations Section}).
+
+The @code{%define} directive defines the variable @code{api.value.type},
+thus specifying the C data type for semantic values of both tokens and
+groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The Bison
+parser will use whatever type @code{api.value.type} is defined as; if you
+don't define it, @code{int} is the default. Because we specify
+@samp{@{double@}}, each token and each expression has an associated value,
+which is a floating point number. C code can use @code{YYSTYPE} to refer to
+the value of @code{api.value.type}.
+
+Each terminal symbol that is not a single-character literal must be
+declared. (Single-character literals normally don't need to be declared.)
+In this example, all the arithmetic operators are designated by
+single-character literals, so the only terminal symbol that needs to be
+declared is @code{NUM}, the token type for numeric constants.
@node Rpcalc Rules
@subsection Grammar Rules for @code{rpcalc}
@example
@group
input:
- /* empty */
+ %empty
| input line
;
@end group
@example
input:
- /* empty */
+ %empty
| input line
;
@end example
colon and the first @samp{|}; this means that @code{input} can match an
empty string of input (no tokens). We write the rules this way because it
is legitimate to type @kbd{Ctrl-d} right after you start the calculator.
-It's conventional to put an empty alternative first and write the comment
-@samp{/* empty */} in it.
+It's conventional to put an empty alternative first and to use the
+(optional) @code{%empty} directive, or to write the comment @samp{/* empty
+*/} in it (@pxref{Empty Rules}).
The second alternate rule (@code{input line}) handles all nontrivial input.
It means, ``After reading any number of lines, read one more line if
The semantic value of the token (if it has one) is stored into the
global variable @code{yylval}, which is where the Bison parser will look
-for it. (The C data type of @code{yylval} is @code{YYSTYPE}, which was
-defined at the beginning of the grammar; @pxref{Rpcalc Declarations,
-,Declarations for @code{rpcalc}}.)
+for it. (The C data type of @code{yylval} is @code{YYSTYPE}, whose value
+was defined at the beginning of the grammar via @samp{%define api.value.type
+@{double@}}; @pxref{Rpcalc Declarations,,Declarations for @code{rpcalc}}.)
A token type code of zero is returned if the end-of-input is encountered.
(Bison recognizes any nonpositive value as indicating end-of-input.)
@group
%@{
- #define YYSTYPE double
#include <math.h>
#include <stdio.h>
int yylex (void);
@group
/* Bison declarations. */
+%define api.value.type @{double@}
%token NUM
%left '-' '+'
%left '*' '/'
%% /* The grammar follows. */
@group
input:
- /* empty */
+ %empty
| input line
;
@end group
/* Location tracking calculator. */
%@{
- #define YYSTYPE int
#include <math.h>
int yylex (void);
void yyerror (char const *);
%@}
/* Bison declarations. */
+%define api.value.type @{int@}
%token NUM
%left '-' '+'
@example
@group
input:
- /* empty */
+ %empty
| input line
;
@end group
%@{
#include <stdio.h> /* For printf, etc. */
#include <math.h> /* For pow, used in the grammar. */
- #include "calc.h" /* Contains definition of `symrec'. */
+ #include "calc.h" /* Contains definition of 'symrec'. */
int yylex (void);
void yyerror (char const *);
%@}
@end group
-@group
-%union @{
- double val; /* For returning numbers. */
- symrec *tptr; /* For returning symbol-table pointers. */
-@}
-@end group
-%token <val> NUM /* Simple double precision number. */
-%token <tptr> VAR FNCT /* Variable and function. */
-%type <val> exp
+%define api.value.type union /* Generate YYSTYPE from these types: */
+%token <double> NUM /* Simple double precision number. */
+%token <symrec*> VAR FNCT /* Symbol table pointer: variable and function. */
+%type <double> exp
@group
-%right '='
+%precedence '='
%left '-' '+'
%left '*' '/'
%precedence NEG /* negation--unary minus */
These features allow semantic values to have various data types
(@pxref{Multiple Types, ,More Than One Value Type}).
-The @code{%union} declaration specifies the entire list of possible types;
-this is instead of defining @code{YYSTYPE}. The allowable types are now
-double-floats (for @code{exp} and @code{NUM}) and pointers to entries in
-the symbol table. @xref{Union Decl, ,The Collection of Value Types}.
-
-Since values can now have various types, it is necessary to associate a
-type with each grammar symbol whose semantic value is used. These symbols
-are @code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their
-declarations are augmented with information about their data type (placed
-between angle brackets).
-
-The Bison construct @code{%type} is used for declaring nonterminal
-symbols, just as @code{%token} is used for declaring token types. We
-have not used @code{%type} before because nonterminal symbols are
-normally declared implicitly by the rules that define them. But
-@code{exp} must be declared explicitly so we can specify its value type.
-@xref{Type Decl, ,Nonterminal Symbols}.
+The special @code{union} value assigned to the @code{%define} variable
+@code{api.value.type} specifies that the symbols are defined with their data
+types. Bison will generate an appropriate definition of @code{YYSTYPE} to
+store these values.
+
+Since values can now have various types, it is necessary to associate a type
+with each grammar symbol whose semantic value is used. These symbols are
+@code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their declarations are
+augmented with their data type (placed between angle brackets). For
+instance, values of @code{NUM} are stored in @code{double}.
+
+The Bison construct @code{%type} is used for declaring nonterminal symbols,
+just as @code{%token} is used for declaring token types. Previously we did
+not use @code{%type} because nonterminal symbols are normally
+declared implicitly by the rules that define them. But @code{exp} must be
+declared explicitly so we can specify its value type. @xref{Type Decl,
+,Nonterminal Symbols}.
@node Mfcalc Rules
@subsection Grammar Rules for @code{mfcalc}
%% /* The grammar follows. */
@group
input:
- /* empty */
+ %empty
| input line
;
@end group
@group
typedef struct symrec symrec;
-/* The symbol table: a chain of `struct symrec'. */
+/* The symbol table: a chain of 'struct symrec'. */
extern symrec *sym_table;
symrec *putsym (char const *, int);
@end group
@group
-/* The symbol table: a chain of `struct symrec'. */
+/* The symbol table: a chain of 'struct symrec'. */
symrec *sym_table;
@end group
if (c == '.' || isdigit (c))
@{
ungetc (c, stdin);
- scanf ("%lf", &yylval.val);
+ scanf ("%lf", &yylval.NUM);
return NUM;
@}
@end group
+@end example
+
+@noindent
+Bison generated a definition of @code{YYSTYPE} with a member named
+@code{NUM} to store the value of @code{NUM} symbols.
+@comment file: mfcalc.y: 3
+@example
@group
/* Char starts an identifier => read the name. */
if (isalpha (c))
s = getsym (symbuf);
if (s == 0)
s = putsym (symbuf, VAR);
- yylval.tptr = s;
+ *((symrec**) &yylval) = s;
return s->type;
@}
* Grammar Outline:: Overall layout of the grammar file.
* Symbols:: Terminal and nonterminal symbols.
* Rules:: How to write grammar rules.
-* Recursion:: Writing recursive rules.
* Semantics:: Semantic values and actions.
* Tracking Locations:: Locations and actions.
* Named References:: Using named references in actions.
@node Grammar Outline
@section Outline of a Bison Grammar
+@cindex comment
+@findex // @dots{}
+@findex /* @dots{} */
A Bison grammar file has four main sections, shown here with the
appropriate delimiters:
@end example
Comments enclosed in @samp{/* @dots{} */} may appear in any of the sections.
-As a GNU extension, @samp{//} introduces a comment that
-continues until end of line.
+As a GNU extension, @samp{//} introduces a comment that continues until end
+of line.
@menu
* Prologue:: Syntax and usage of the prologue.
@code{%union} declaration.
@example
+@group
%@{
#define _GNU_SOURCE
#include <stdio.h>
#include "ptypes.h"
%@}
+@end group
+@group
%union @{
long int n;
tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
@}
+@end group
+@group
%@{
static void print_token_value (FILE *, int, YYSTYPE);
#define YYPRINT(F, N, L) print_token_value (F, N, L)
%@}
+@end group
@dots{}
@end example
Look again at the example of the previous section:
@example
+@group
%@{
#define _GNU_SOURCE
#include <stdio.h>
#include "ptypes.h"
%@}
+@end group
+@group
%union @{
long int n;
tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
@}
+@end group
+@group
%@{
static void print_token_value (FILE *, int, YYSTYPE);
#define YYPRINT(F, N, L) print_token_value (F, N, L)
%@}
+@end group
@dots{}
@end example
#include <stdio.h>
/* WARNING: The following code really belongs
- * in a `%code requires'; see below. */
+ * in a '%code requires'; see below. */
#include "ptypes.h"
#define YYLTYPE YYLTYPE
@} YYLTYPE;
@}
+@group
%union @{
long int n;
tree t; /* @r{@code{tree} is defined in @file{ptypes.h}.} */
@}
+@end group
+@group
%code @{
static void print_token_value (FILE *, int, YYSTYPE);
#define YYPRINT(F, N, L) print_token_value (F, N, L)
static void trace_token (enum yytokentype token, YYLTYPE loc);
@}
+@end group
@dots{}
@end example
one of your tokens with a @code{%token} declaration.
@node Rules
-@section Syntax of Grammar Rules
+@section Grammar Rules
+
+A Bison grammar is a list of rules.
+
+@menu
+* Rules Syntax:: Syntax of the rules.
+* Empty Rules:: Symbols that can match the empty string.
+* Recursion:: Writing recursive rules.
+@end menu
+
+@node Rules Syntax
+@subsection Syntax of Grammar Rules
@cindex rule syntax
@cindex grammar rule syntax
@cindex syntax of grammar rules
@noindent
They are still considered distinct rules even when joined in this way.
-If @var{components} in a rule is empty, it means that @var{result} can
-match the empty string. For example, here is how to define a
-comma-separated sequence of zero or more @code{exp} groupings:
+@node Empty Rules
+@subsection Empty Rules
+@cindex empty rule
+@cindex rule, empty
+@findex %empty
+
+A rule is said to be @dfn{empty} if its right-hand side (@var{components})
+is empty. It means that @var{result} can match the empty string. For
+example, here is how to define an optional semicolon:
+
+@example
+semicolon.opt: | ";";
+@end example
+
+@noindent
+It is easy to overlook an empty rule, especially when @code{|} is used. The
+@code{%empty} directive lets you make explicit that a rule is empty on
+purpose:
@example
@group
-expseq:
- /* empty */
-| expseq1
+semicolon.opt:
+ %empty
+| ";"
;
@end group
+@end example
+
+Flagging a non-empty rule with @code{%empty} is an error. If run with
+@option{-Wempty-rule}, @command{bison} will report empty rules without
+@code{%empty}. Using @code{%empty} enables this warning, unless
+@option{-Wno-empty-rule} was specified.
+
+The @code{%empty} directive is a Bison extension; it does not work with
+Yacc. To remain compatible with POSIX Yacc, it is customary to write a
+comment @samp{/* empty */} in each rule with no components:
+@example
@group
-expseq1:
- exp
-| expseq1 ',' exp
+semicolon.opt:
+ /* empty */
+| ";"
;
@end group
@end example
-@noindent
-It is customary to write a comment @samp{/* empty */} in each rule
-with no components.
@node Recursion
-@section Recursive Rules
+@subsection Recursive Rules
@cindex recursive rule
+@cindex rule, recursive
A rule is called @dfn{recursive} when its @var{result} nonterminal
appears also on its right hand side. Nearly all Bison grammars need to
@menu
* Value Type:: Specifying one data type for all semantic values.
* Multiple Types:: Specifying several alternative data types.
+* Type Generation:: Generating the semantic value type.
+* Union Decl:: Declaring the set of all semantic value types.
+* Structured Value Type:: Providing a structured semantic value type.
* Actions:: An action is the semantic definition of a grammar rule.
* Action Types:: Specifying data types for actions to operate on.
* Mid-Rule Actions:: Most actions go at the end of a rule.
Bison normally uses the type @code{int} for semantic values if your
program uses the same data type for all language constructs. To
-specify some other type, define @code{YYSTYPE} as a macro, like this:
+specify some other type, define the @code{%define} variable
+@code{api.value.type} like this:
+
+@example
+%define api.value.type @{double@}
+@end example
+
+@noindent
+or
+
+@example
+%define api.value.type @{struct semantic_type@}
+@end example
+
+The value of @code{api.value.type} should be a type name that does not
+contain parentheses or square brackets.
+
+Alternatively, instead of relying on Bison's @code{%define} support, you may
+rely on the C/C++ preprocessor and define @code{YYSTYPE} as a macro, like
+this:
@example
#define YYSTYPE double
@end example
@noindent
-@code{YYSTYPE}'s replacement list should be a type name
-that does not contain parentheses or square brackets.
This macro definition must go in the prologue of the grammar file
-(@pxref{Grammar Outline, ,Outline of a Bison Grammar}).
+(@pxref{Grammar Outline, ,Outline of a Bison Grammar}). If compatibility
+with POSIX Yacc matters to you, use this. Note however that Bison cannot
+know @code{YYSTYPE}'s value, not even whether it is defined, so there are
+services it cannot provide. Besides, this works only for languages that have
+a preprocessor.
@node Multiple Types
@subsection More Than One Value Type
@itemize @bullet
@item
-Specify the entire collection of possible data types, either by using the
-@code{%union} Bison declaration (@pxref{Union Decl, ,The Collection of
-Value Types}), or by using a @code{typedef} or a @code{#define} to
-define @code{YYSTYPE} to be a union type whose member names are
-the type tags.
+Specify the entire collection of possible data types. There are several
+options:
+@itemize @bullet
+@item
+let Bison compute the union type from the tags you assign to symbols;
+
+@item
+use the @code{%union} Bison declaration (@pxref{Union Decl, ,The Union
+Declaration});
+
+@item
+define the @code{%define} variable @code{api.value.type} to be a union type
+whose members are the type tags (@pxref{Structured Value Type,, Providing a
+Structured Semantic Value Type});
+
+@item
+use a @code{typedef} or a @code{#define} to define @code{YYSTYPE} to be a
+union type whose member names are the type tags.
+@end itemize
@item
Choose one of those types for each symbol (terminal or nonterminal) for
Decl, ,Nonterminal Symbols}).
@end itemize
+@node Type Generation
+@subsection Generating the Semantic Value Type
+@cindex declaring value types
+@cindex value types, declaring
+@findex %define api.value.type union
+
+The special value @code{union} of the @code{%define} variable
+@code{api.value.type} instructs Bison that the tags used with the
+@code{%token} and @code{%type} directives are genuine types, not names of
+members of @code{YYSTYPE}.
+
+For example:
+
+@example
+%define api.value.type union
+%token <int> INT "integer"
+%token <int> 'n'
+%type <int> expr
+%token <char const *> ID "identifier"
+@end example
+
+@noindent
+generates an appropriate definition of @code{YYSTYPE} to support each symbol
+type. The name of the member of @code{YYSTYPE} for tokens that have a
+declared identifier @var{id} (such as @code{INT} and @code{ID} above, but
+not @code{'n'}) is @code{@var{id}}. The other symbols have unspecified
+names on which you should not depend; instead, rely on C casts to access
+the semantic value with the appropriate type:
+
+@example
+/* For an "integer". */
+yylval.INT = 42;
+return INT;
+
+/* For an 'n', also declared as int. */
+*((int*)&yylval) = 42;
+return 'n';
+
+/* For an "identifier". */
+yylval.ID = "42";
+return ID;
+@end example
+
+If the @code{%define} variable @code{api.token.prefix} is defined
+(@pxref{%define Summary,,api.token.prefix}), then it is also used to prefix
+the union member names. For instance, with @samp{%define api.token.prefix
+@{TOK_@}}:
+
+@example
+/* For an "integer". */
+yylval.TOK_INT = 42;
+return TOK_INT;
+@end example
+
+This Bison extension cannot work if @code{%yacc} (or
+@option{-y}/@option{--yacc}) is enabled, as POSIX mandates that Yacc
+generate tokens as macros (e.g., @samp{#define INT 258}, or @samp{#define
+TOK_INT 258}).
+
+This feature is new, and user feedback would be most welcome.
+
+A similar feature is provided for C++ that in addition overcomes C++
+limitations (that forbid non-trivial objects to be part of a @code{union}):
+@samp{%define api.value.type variant}, see @ref{C++ Variants}.
+
+@node Union Decl
+@subsection The Union Declaration
+@cindex declaring value types
+@cindex value types, declaring
+@findex %union
+
+The @code{%union} declaration specifies the entire collection of possible
+data types for semantic values. The keyword @code{%union} is followed by
+braced code containing the same thing that goes inside a @code{union} in C@.
+
+For example:
+
+@example
+@group
+%union @{
+ double val;
+ symrec *tptr;
+@}
+@end group
+@end example
+
+@noindent
+This says that the two alternative types are @code{double} and @code{symrec
+*}. They are given names @code{val} and @code{tptr}; these names are used
+in the @code{%token} and @code{%type} declarations to pick one of the types
+for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
+
+As an extension to POSIX, a tag is allowed after the @code{%union}. For
+example:
+
+@example
+@group
+%union value @{
+ double val;
+ symrec *tptr;
+@}
+@end group
+@end example
+
+@noindent
+specifies the union tag @code{value}, so the corresponding C type is
+@code{union value}. If you do not specify a tag, it defaults to
+@code{YYSTYPE}.
+
+As another extension to POSIX, you may specify multiple @code{%union}
+declarations; their contents are concatenated. However, only the first
+@code{%union} declaration can specify a tag.
+
+Note that, unlike making a @code{union} declaration in C, you need not write
+a semicolon after the closing brace.
+
+@node Structured Value Type
+@subsection Providing a Structured Semantic Value Type
+@cindex declaring value types
+@cindex value types, declaring
+@findex %union
+
+Instead of @code{%union}, you can define and use your own union type
+@code{YYSTYPE} if your grammar contains at least one @samp{<@var{type}>}
+tag. For example, you can put the following into a header file
+@file{parser.h}:
+
+@example
+@group
+union YYSTYPE @{
+ double val;
+ symrec *tptr;
+@};
+@end group
+@end example
+
+@noindent
+and then your grammar can use the following instead of @code{%union}:
+
+@example
+@group
+%@{
+#include "parser.h"
+%@}
+%define api.value.type @{union YYSTYPE@}
+%type <val> expr
+%token <tptr> ID
+@end group
+@end example
+
+Actually, you may also provide a @code{struct} rather than a @code{union},
+which may be handy if you want to track information for every symbol (such
+as preceding comments).
+
+The type you provide may even be structured and include pointers, in which
+case the type tags you provide may be composite, with @samp{.} and @samp{->}
+operators.
+
@node Actions
@subsection Actions
@cindex action
@group
bar:
- /* empty */ @{ previous_expr = $0; @}
+ %empty @{ previous_expr = $0; @}
;
@end group
@end example
These actions are written just like usual end-of-rule actions, but they
are executed before the parser even recognizes the following components.
+@menu
+* Using Mid-Rule Actions:: Putting an action in the middle of a rule.
+* Mid-Rule Action Translation:: How mid-rule actions are actually processed.
+* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts.
+@end menu
+
+@node Using Mid-Rule Actions
+@subsubsection Using Mid-Rule Actions
+
A mid-rule action may refer to the components preceding it using
@code{$@var{n}}, but it may not refer to subsequent components because
it is run before they are parsed.
@example
@group
stmt:
- LET '(' var ')'
- @{ $<context>$ = push_context (); declare_variable ($3); @}
+ "let" '(' var ')'
+ @{
+ $<context>$ = push_context ();
+ declare_variable ($3);
+ @}
stmt
- @{ $$ = $6; pop_context ($<context>5); @}
+ @{
+ $$ = $6;
+ pop_context ($<context>5);
+ @}
@end group
@end example
@code{context} in the data-type union. Then it calls
@code{declare_variable} to add the new variable to that list. Once the
first action is finished, the embedded statement @code{stmt} can be
-parsed. Note that the mid-rule action is component number 5, so the
-@samp{stmt} is component number 6.
+parsed.
+
+Note that the mid-rule action is component number 5, so the @samp{stmt} is
+component number 6. Named references can be used to improve the readability
+and maintainability (@pxref{Named References}):
+
+@example
+@group
+stmt:
+ "let" '(' var ')'
+ @{
+ $<context>let = push_context ();
+ declare_variable ($3);
+ @}[let]
+ stmt
+ @{
+ $$ = $6;
+ pop_context ($<context>let);
+ @}
+@end group
+@end example
After the embedded statement is parsed, its semantic value becomes the
value of the entire @code{let}-statement. Then the semantic value from the
@group
%type <context> let
%destructor @{ pop_context ($$); @} let
+@end group
%%
+@group
stmt:
let stmt
@{
$$ = $2;
- pop_context ($1);
+ pop_context ($let);
@};
+@end group
+@group
let:
- LET '(' var ')'
+ "let" '(' var ')'
@{
- $$ = push_context ();
+ $let = push_context ();
declare_variable ($3);
@};
Any mid-rule action can be converted to an end-of-rule action in this way, and
this is what Bison actually does to implement mid-rule actions.
+@node Mid-Rule Action Translation
+@subsubsection Mid-Rule Action Translation
+@vindex $@@@var{n}
+@vindex @@@var{n}
+
+As hinted earlier, mid-rule actions are actually transformed into regular
+rules and actions. The various reports generated by Bison (textual,
+graphical, etc., see @ref{Understanding, , Understanding Your Parser})
+reveal this translation, best explained by means of an example. The
+following rule:
+
+@example
+exp: @{ a(); @} "b" @{ c(); @} @{ d(); @} "e" @{ f(); @};
+@end example
+
+@noindent
+is translated into:
+
+@example
+$@@1: %empty @{ a(); @};
+$@@2: %empty @{ c(); @};
+$@@3: %empty @{ d(); @};
+exp: $@@1 "b" $@@2 $@@3 "e" @{ f(); @};
+@end example
+
+@noindent
+with new nonterminal symbols @code{$@@@var{n}}, where @var{n} is a number.
+
+A mid-rule action is expected to generate a value if it uses @code{$$}, or
+the (final) action uses @code{$@var{n}} where @var{n} denotes the mid-rule
+action. In that case its nonterminal is rather named @code{@@@var{n}}:
+
+@example
+exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @};
+@end example
+
+@noindent
+is translated into
+
+@example
+@@1: %empty @{ a(); @};
+@@2: %empty @{ $$ = c(); @};
+$@@3: %empty @{ d(); @};
+exp: @@1 "b" @@2 $@@3 "e" @{ f = $1; @};
+@end example
+
+There are probably two errors in the above example: the first mid-rule
+action does not generate a value (it does not set @code{$$} although the
+final action uses @code{$1}), and the value of the second one is not used (the
+final action does not use @code{$3}). Bison reports these errors when the
+@code{midrule-value} warnings are enabled (@pxref{Invocation, ,Invoking
+Bison}):
+
+@example
+$ bison -fcaret -Wmidrule-value mid.y
+@group
+mid.y:2.6-13: warning: unset value: $$
+ exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @};
+ ^^^^^^^^
+@end group
+@group
+mid.y:2.19-31: warning: unused value: $3
+ exp: @{ a(); @} "b" @{ $$ = c(); @} @{ d(); @} "e" @{ f = $1; @};
+ ^^^^^^^^^^^^^
+@end group
+@end example
+
+
+@node Mid-Rule Conflicts
+@subsubsection Conflicts due to Mid-Rule Actions
Taking action before a rule is completely recognized often leads to
conflicts since the parser must commit to a parse in order to execute the
action. For example, the following two rules, without mid-rule actions,
@example
@group
subroutine:
- /* empty */ @{ prepare_for_local_variables (); @}
+ %empty @{ prepare_for_local_variables (); @}
;
@end group
Now Bison can execute the action in the rule for @code{subroutine} without
deciding which rule for @code{compound} it will eventually use.
+
@node Tracking Locations
@section Tracking Locations
@cindex location
else
@{
$$ = 1;
- fprintf (stderr,
- "Division by zero, l%d,c%d-l%d,c%d",
+ fprintf (stderr, "%d.%d-%d.%d: division by zero",
@@3.first_line, @@3.first_column,
@@3.last_line, @@3.last_column);
@}
else
@{
$$ = 1;
- fprintf (stderr,
- "Division by zero, l%d,c%d-l%d,c%d",
+ fprintf (stderr, "%d.%d-%d.%d: division by zero",
@@3.first_line, @@3.first_column,
@@3.last_line, @@3.last_column);
@}
* Require Decl:: Requiring a Bison version.
* Token Decl:: Declaring terminal symbols.
* Precedence Decl:: Declaring terminals with precedence and associativity.
-* Union Decl:: Declaring the set of all semantic value types.
* Type Decl:: Declaring the choice of type for a nonterminal symbol.
* Initial Action Decl:: Code run before parsing starts.
* Destructor Decl:: Declaring how symbols are freed.
%left OR 134 "<=" 135 // Declares 134 for OR and 135 for "<=".
@end example
-@node Union Decl
-@subsection The Collection of Value Types
-@cindex declaring value types
-@cindex value types, declaring
-@findex %union
-
-The @code{%union} declaration specifies the entire collection of
-possible data types for semantic values. The keyword @code{%union} is
-followed by braced code containing the same thing that goes inside a
-@code{union} in C@.
-
-For example:
-
-@example
-@group
-%union @{
- double val;
- symrec *tptr;
-@}
-@end group
-@end example
-
-@noindent
-This says that the two alternative types are @code{double} and @code{symrec
-*}. They are given names @code{val} and @code{tptr}; these names are used
-in the @code{%token} and @code{%type} declarations to pick one of the types
-for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
-
-As an extension to POSIX, a tag is allowed after the
-@code{union}. For example:
-
-@example
-@group
-%union value @{
- double val;
- symrec *tptr;
-@}
-@end group
-@end example
-
-@noindent
-specifies the union tag @code{value}, so the corresponding C type is
-@code{union value}. If you do not specify a tag, it defaults to
-@code{YYSTYPE}.
-
-As another extension to POSIX, you may specify multiple
-@code{%union} declarations; their contents are concatenated. However,
-only the first @code{%union} declaration can specify a tag.
-
-Note that, unlike making a @code{union} declaration in C, you need not write
-a semicolon after the closing brace.
-
-Instead of @code{%union}, you can define and use your own union type
-@code{YYSTYPE} if your grammar contains at least one
-@samp{<@var{type}>} tag. For example, you can put the following into
-a header file @file{parser.h}:
-
-@example
-@group
-union YYSTYPE @{
- double val;
- symrec *tptr;
-@};
-typedef union YYSTYPE YYSTYPE;
-@end group
-@end example
-
-@noindent
-and then your grammar can use the following
-instead of @code{%union}:
-
-@example
-@group
-%@{
-#include "parser.h"
-%@}
-%type <val> expr
-%token <tptr> ID
-@end group
-@end example
-
@node Type Decl
@subsection Nonterminal Symbols
@cindex declaring value types, nonterminals
@noindent
Here @var{nonterminal} is the name of a nonterminal symbol, and
@var{type} is the name given in the @code{%union} to the alternative
-that you want (@pxref{Union Decl, ,The Collection of Value Types}). You
+that you want (@pxref{Union Decl, ,The Union Declaration}). You
can give any number of nonterminal symbols in the same @code{%type}
declaration, if they have the same value type. Use spaces to separate
the symbol names.
@example
%union @{ char *string; @}
-%token <string> STRING1
-%token <string> STRING2
-%type <string> string1
-%type <string> string2
+%token <string> STRING1 STRING2
+%type <string> string1 string2
%union @{ char character; @}
%token <character> CHR
%type <character> chr
@example
%union @{ char *string; @}
-%token <string> STRING1
-%token <string> STRING2
-%type <string> string1
-%type <string> string2
+%token <string> STRING1 STRING2
+%type <string> string1 string2
%union @{ char character; @}
%token <character> CHR
%type <character> chr
reentrant. It looks like this:
@example
-%define api.pure
+%define api.pure full
@end example
The result is that the communication variables @code{yylval} and
@code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure
Parsers}, for the details of this. The variable @code{yynerrs}
becomes local in @code{yyparse} in pull mode but it becomes a member
-of yypstate in push mode. (@pxref{Error Reporting, ,The Error
+of @code{yypstate} in push mode. (@pxref{Error Reporting, ,The Error
Reporting Function @code{yyerror}}). The convention for calling
@code{yyparse} itself is unchanged.
what you are doing, your declarations should look like this:
@example
-%define api.pure
+%define api.pure full
%define api.push-pull push
@end example
@deffn {Directive} %union
Declare the collection of data types that semantic values may have
-(@pxref{Union Decl, ,The Collection of Value Types}).
+(@pxref{Union Decl, ,The Union Declaration}).
@end deffn
@deffn {Directive} %token
@deffn {Directive} %define @var{variable}
@deffnx {Directive} %define @var{variable} @var{value}
+@deffnx {Directive} %define @var{variable} @{@var{value}@}
@deffnx {Directive} %define @var{variable} "@var{value}"
Define a variable to adjust Bison's behavior. @xref{%define Summary}.
@end deffn
uppercase, with each series of non alphanumerical characters converted to a
single underscore.
-For instance with @samp{%define api.prefix "calc"} and @samp{%defines
+For instance with @samp{%define api.prefix @{calc@}} and @samp{%defines
"lib/parse.h"}, the header will be guarded as follows.
@example
#ifndef YY_CALC_LIB_PARSE_H_INCLUDED
@end deffn
@deffn {Directive} %defines @var{defines-file}
-Same as above, but save in the file @var{defines-file}.
+Same as above, but save in the file @file{@var{defines-file}}.
@end deffn
@deffn {Directive} %destructor
supported languages include C, C++, and Java.
@var{language} is case-insensitive.
-This directive is experimental and its effect may be modified in future
-releases.
@end deffn
@deffn {Directive} %locations
@end deffn
@deffn {Directive} %output "@var{file}"
-Specify @var{file} for the parser implementation file.
+Generate the parser implementation in @file{@var{file}}.
@end deffn
@deffn {Directive} %pure-parser
@deffn {Directive} %define @var{variable}
@deffnx {Directive} %define @var{variable} @var{value}
+@deffnx {Directive} %define @var{variable} @{@var{value}@}
@deffnx {Directive} %define @var{variable} "@var{value}"
Define @var{variable} to @var{value}.
-@var{value} must be placed in quotation marks if it contains any
-character other than a letter, underscore, period, or non-initial dash
-or digit. Omitting @code{"@var{value}"} entirely is always equivalent
-to specifying @code{""}.
+The type of the values depends on the syntax. Braces denote a value in the
+target language (e.g., a namespace, a type, etc.). Keyword values (no
+delimiters) denote a finite choice (e.g., a variation of a feature). String
+values denote the remaining cases (e.g., a file name).
-It is an error if a @var{variable} is defined by @code{%define}
-multiple times, but see @ref{Bison Options,,-D
-@var{name}[=@var{value}]}.
+It is an error if a @var{variable} is defined by @code{%define} multiple
+times, but see @ref{Bison Options,,-D @var{name}[=@var{value}]}.
@end deffn
The rest of this section summarizes variables and values that
skeleton (@pxref{Decl Summary,,%language}, @pxref{Decl
Summary,,%skeleton}).
Unaccepted @var{variable}s produce an error.
-Some of the accepted @var{variable}s are:
+Some of the accepted @var{variable}s are described below.
-@table @code
@c ================================================== api.namespace
-@item api.namespace
-@findex %define api.namespace
+@deffn Directive {%define api.namespace} @{@var{namespace}@}
@itemize
@item Languages(s): C++
For example, if you specify:
@example
-%define api.namespace "foo::bar"
+%define api.namespace @{foo::bar@}
@end example
Bison uses @code{foo::bar} verbatim in references such as:
lexical analyzer function. For example, if you specify:
@example
-%define api.namespace "foo"
+%define api.namespace @{foo@}
%name-prefix "bar::"
@end example
The parser namespace is @code{foo} and @code{yylex} is referenced as
@code{bar::lex}.
@end itemize
-@c namespace
+@end deffn
+@c api.namespace
@c ================================================== api.location.type
-@item @code{api.location.type}
-@findex %define api.location.type
+@deffn {Directive} {%define api.location.type} @{@var{type}@}
@itemize @bullet
@item Language(s): C++, Java
@item Default Value: none
-@item History: introduced in Bison 2.7
+@item History:
+Introduced in Bison 2.7 for C, C++ and Java. Introduced under the name
+@code{location_type} for C++ in Bison 2.5 and for Java in Bison 2.4.
@end itemize
+@end deffn
@c ================================================== api.prefix
-@item api.prefix
-@findex %define api.prefix
+@deffn {Directive} {%define api.prefix} @{@var{prefix}@}
@itemize @bullet
@item Language(s): All
@item History: introduced in Bison 2.6
@end itemize
+@end deffn
@c ================================================== api.pure
-@item api.pure
-@findex %define api.pure
+@deffn Directive {%define api.pure} @var{purity}
@itemize @bullet
@item Language(s): C
@item Purpose: Request a pure (reentrant) parser program.
@xref{Pure Decl, ,A Pure (Reentrant) Parser}.
-@item Accepted Values: Boolean
+@item Accepted Values: @code{true}, @code{false}, @code{full}
+
+The value may be omitted: this is equivalent to specifying @code{true}, as is
+the case for Boolean values.
+
+When @code{%define api.pure full} is used, the parser is made reentrant. This
+changes the signature for @code{yylex} (@pxref{Pure Calling}), and also that of
+@code{yyerror} when the tracking of locations has been activated, as shown
+below.
+
+The @code{true} value is very similar to the @code{full} value; the only
+difference is in the signature of @code{yyerror} on Yacc parsers without
+@code{%parse-param}, for historical reasons.
+
+I.e., if @samp{%locations %define api.pure} is passed then the prototypes for
+@code{yyerror} are:
+
+@example
+void yyerror (char const *msg); // Yacc parsers.
+void yyerror (YYLTYPE *locp, char const *msg); // GLR parsers.
+@end example
+
+But if @samp{%locations %define api.pure %parse-param @{int *nastiness@}} is
+used, then both parsers have the same signature:
+
+@example
+void yyerror (YYLTYPE *llocp, int *nastiness, char const *msg);
+@end example
+
+(@pxref{Error Reporting, ,The Error
+Reporting Function @code{yyerror}})
@item Default Value: @code{false}
+
+@item History:
+the @code{full} value was introduced in Bison 2.7
@end itemize
+@end deffn
@c api.pure
@c ================================================== api.push-pull
-@item api.push-pull
-@findex %define api.push-pull
+@deffn Directive {%define api.push-pull} @var{kind}
@itemize @bullet
@item Language(s): C (deterministic parsers only)
@item Default Value: @code{pull}
@end itemize
+@end deffn
@c api.push-pull
@c ================================================== api.token.constructor
-@item api.token.constructor
-@findex %define api.token.constructor
+@deffn Directive {%define api.token.constructor}
@itemize @bullet
@item Language(s):
@item Default Value:
@code{false}
@item History:
-introduced in Bison 2.8
+introduced in Bison 3.0
@end itemize
+@end deffn
@c api.token.constructor
@c ================================================== api.token.prefix
-@item api.token.prefix
-@findex %define api.token.prefix
+@deffn Directive {%define api.token.prefix} @{@var{prefix}@}
@itemize
@item Languages(s): all
@example
%token FILE for ERROR
-%define api.token.prefix "TOK_"
+%define api.token.prefix @{TOK_@}
%%
start: FILE for ERROR;
@end example
scanner must use these prefixed token names, while the grammar itself
may still use the short names (as in the sample rule given above). The
generated informational files (@file{*.output}, @file{*.xml},
-@file{*.dot}) are not modified by this prefix. See @ref{Calc++ Parser}
-and @ref{Calc++ Scanner}, for a complete example.
+@file{*.dot}) are not modified by this prefix.
+
+Bison also prefixes the generated member names of the semantic value union.
+@xref{Type Generation,, Generating the Semantic Value Type}, for more
+details.
+
+See @ref{Calc++ Parser} and @ref{Calc++ Scanner}, for a complete example.
@item Accepted Values:
Any string. Should be a valid identifier prefix in the target language,
@item Default Value:
empty
@item History:
-introduced in Bison 2.8
+introduced in Bison 3.0
@end itemize
+@end deffn
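As a rough illustration of the scanner side of @samp{%define api.token.prefix @{TOK_@}}, the sketch below returns the prefixed names for the tokens @code{FILE}, @code{for} and @code{ERROR} from the example above. Unprefixed, @code{FILE} would clash with the @code{<stdio.h>} type and @code{for} is a C keyword. The enumeration is a hand-written stand-in for what the generated header would declare; its numeric values and the helper @code{classify_word} are assumptions made for this sketch, not Bison output.

```c
#include <assert.h>
#include <string.h>

/* Hand-written stand-in for the token enumeration of the generated
   header.  The numeric values are assumptions; Bison chooses the real
   ones.  */
enum yytokentype
{
  TOK_FILE = 258,
  TOK_for = 259,
  TOK_ERROR = 260
};

/* Hypothetical scanner helper: map a word to a prefixed token name.
   The grammar itself would still refer to FILE, for and ERROR.  */
int
classify_word (const char *word)
{
  assert (word != NULL);
  if (strcmp (word, "file") == 0)
    return TOK_FILE;
  if (strcmp (word, "for") == 0)
    return TOK_for;
  return TOK_ERROR;
}
```

A real @code{yylex} would include the generated header rather than declare the enumeration itself.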
@c api.token.prefix
+@c ================================================== api.value.type
+@deffn Directive {%define api.value.type} @var{support}
+@deffnx Directive {%define api.value.type} @{@var{type}@}
+@itemize @bullet
+@item Language(s):
+all
+
+@item Purpose:
+The type for semantic values.
+
+@item Accepted Values:
+@table @asis
+@item @samp{@{@}}
+This grammar has no semantic value at all. This is not properly supported
+yet.
+@item @samp{union-directive} (C, C++)
+The type is defined thanks to the @code{%union} directive. You don't have
+to define @code{api.value.type} in that case, using @code{%union} suffices.
+@xref{Union Decl, ,The Union Declaration}.
+For instance:
+@example
+%define api.value.type union-directive
+%union
+@{
+ int ival;
+ char *sval;
+@}
+%token <ival> INT "integer"
+%token <sval> STR "string"
+@end example
+
+@item @samp{union} (C, C++)
+The symbols are defined with type names, from which Bison will generate a
+@code{union}. For instance:
+@example
+%define api.value.type union
+%token <int> INT "integer"
+%token <char *> STR "string"
+@end example
+This feature needs user feedback to stabilize. Note that most C++ objects
+cannot be stored in a @code{union}.
+
+@item @samp{variant} (C++)
+This is similar to @code{union}, but special storage techniques are used to
+allow any kind of C++ object to be used. For instance:
+@example
+%define api.value.type variant
+%token <int> INT "integer"
+%token <std::string> STR "string"
+@end example
+This feature needs user feedback to stabilize.
+@xref{C++ Variants}.
+
+@item @samp{@{@var{type}@}}
+Use this @var{type} as semantic value.
+@example
+%code requires
+@{
+ struct my_value
+ @{
+ enum
+ @{
+ is_int, is_str
+ @} kind;
+ union
+ @{
+ int ival;
+ char *sval;
+ @} u;
+ @};
+@}
+%define api.value.type @{struct my_value@}
+%token <u.ival> INT "integer"
+%token <u.sval> STR "string"
+@end example
+@end table
+
+@item Default Value:
+@itemize @minus
+@item
+@code{%union} if @code{%union} is used, otherwise @dots{}
+@item
+@code{int} if type tags are used (i.e., @samp{%token <@var{type}>@dots{}} or
+@samp{%type <@var{type}>@dots{}} is used), otherwise @dots{}
+@item
+@code{""}
+@end itemize
+
+@item History:
+introduced in Bison 3.0. Was introduced for Java only in 2.3b as
+@code{stype}.
+@end itemize
+@end deffn
+@c api.value.type
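The @samp{@{struct my_value@}} case above can be exercised in plain C. The following sketch repeats the structure from the example and adds three helpers, @code{make_int}, @code{make_str} and @code{format_value}. These helpers are hypothetical and only illustrate how a scanner or an action might fill and read the @code{u.ival} and @code{u.sval} members selected by the @samp{<u.ival>} and @samp{<u.sval>} tags.

```c
#include <assert.h>
#include <stdio.h>

/* Same layout as the structure given in %code requires above.  */
struct my_value
{
  enum { is_int, is_str } kind;
  union { int ival; char *sval; } u;
};

/* Hypothetical helper: what a scanner might do before returning INT.  */
static struct my_value
make_int (int i)
{
  struct my_value v;
  v.kind = is_int;
  v.u.ival = i;
  return v;
}

/* Hypothetical helper: what a scanner might do before returning STR.  */
static struct my_value
make_str (char *s)
{
  struct my_value v;
  v.kind = is_str;
  v.u.sval = s;
  return v;
}

/* Hypothetical helper: render a value into BUF using the kind
   discriminator; returns the result of snprintf.  */
static int
format_value (const struct my_value *v, char *buf, size_t size)
{
  assert (v != NULL && buf != NULL);
  if (v->kind == is_int)
    return snprintf (buf, size, "%d", v->u.ival);
  return snprintf (buf, size, "%s", v->u.sval);
}
```

Keeping the @code{kind} member next to the union lets code that receives a value check which member is valid, something a bare @code{%union} does not record.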
+
+
+@c ================================================== location_type
+@deffn Directive {%define location_type}
+Obsoleted by @code{api.location.type} since Bison 2.7.
+@end deffn
+
+
@c ================================================== lr.default-reduction
-@item lr.default-reduction
-@findex %define lr.default-reduction
+@deffn Directive {%define lr.default-reduction} @var{when}
@itemize @bullet
@item Language(s): all
@item @code{most} otherwise.
@end itemize
@item History:
-introduced as @code{lr.default-reduction} in 2.5, renamed as
-@code{lr.default-reduction} in 2.8.
+introduced as @code{lr.default-reductions} in 2.5, renamed as
+@code{lr.default-reduction} in 3.0.
@end itemize
+@end deffn
@c ============================================ lr.keep-unreachable-state
-@item lr.keep-unreachable-state
-@findex %define lr.keep-unreachable-state
+@deffn Directive {%define lr.keep-unreachable-state}
@itemize @bullet
@item Language(s): all
remain in the parser tables. @xref{Unreachable States}.
@item Accepted Values: Boolean
@item Default Value: @code{false}
-@end itemize
+@item History:
introduced as @code{lr.keep_unreachable_states} in 2.3b, renamed as
-@code{lr.keep-unreachable-state} in 2.5, and as
-@code{lr.keep-unreachable-state} in 2.8.
+@code{lr.keep-unreachable-states} in 2.5, and as
+@code{lr.keep-unreachable-state} in 3.0.
+@end itemize
+@end deffn
@c lr.keep-unreachable-state
@c ================================================== lr.type
-@item lr.type
-@findex %define lr.type
+@deffn Directive {%define lr.type} @var{type}
@itemize @bullet
@item Language(s): all
@item Default Value: @code{lalr}
@end itemize
-
+@end deffn
@c ================================================== namespace
-@item namespace
-@findex %define namespace
+@deffn Directive {%define namespace} @{@var{namespace}@}
Obsoleted by @code{api.namespace}
@c namespace
-
+@end deffn
@c ================================================== parse.assert
-@item parse.assert
-@findex %define parse.assert
+@deffn Directive {%define parse.assert}
@itemize
@item Languages(s): C++
@item Default Value: @code{false}
@end itemize
+@end deffn
@c parse.assert
@c ================================================== parse.error
-@item parse.error
-@findex %define parse.error
+@deffn Directive {%define parse.error} @var{verbosity}
@itemize
@item Languages(s):
all
@item Default Value:
@code{simple}
@end itemize
+@end deffn
@c parse.error
@c ================================================== parse.lac
-@item parse.lac
-@findex %define parse.lac
+@deffn Directive {%define parse.lac} @var{when}
@itemize
@item Languages(s): C (deterministic parsers only)
@item Accepted Values: @code{none}, @code{full}
@item Default Value: @code{none}
@end itemize
+@end deffn
@c parse.lac
@c ================================================== parse.trace
-@item parse.trace
-@findex %define parse.trace
+@deffn Directive {%define parse.trace}
@itemize
@item Languages(s): C, C++, Java
@xref{Tracing, ,Tracing Your Parser}.
In C/C++, define the macro @code{YYDEBUG} (or @code{@var{prefix}DEBUG} with
-@samp{%define api.prefix @var{prefix}}), see @ref{Multiple Parsers,
+@samp{%define api.prefix @{@var{prefix}@}}), see @ref{Multiple Parsers,
,Multiple Parsers in the Same Program}) to 1 in the parser implementation
file if it is not already defined, so that the debugging facilities are
compiled.
@item Default Value: @code{false}
@end itemize
+@end deffn
@c parse.trace
-@c ================================================== variant
-@item variant
-@findex %define variant
-
-@itemize @bullet
-@item Language(s):
-C++
-
-@item Purpose:
-Request variant-based semantic values.
-@xref{C++ Variants}.
-
-@item Accepted Values:
-Boolean.
-
-@item Default Value:
-@code{false}
-@end itemize
-@c variant
-@end table
-
-
@node %code Summary
@subsection %code Summary
@findex %code
@item Language(s): C, C++
@item Purpose: This is the best place to write dependency code required for
-@code{YYSTYPE} and @code{YYLTYPE}.
-In other words, it's the best place to define types referenced in @code{%union}
-directives, and it's the best place to override Bison's default @code{YYSTYPE}
-and @code{YYLTYPE} definitions.
+@code{YYSTYPE} and @code{YYLTYPE}. In other words, it's the best place to
+define types referenced in @code{%union} directives. If you use
+@code{#define} to override Bison's default @code{YYSTYPE} and @code{YYLTYPE}
+definitions, then it is also the best place for those overrides. However, you
+should rather @code{%define} @code{api.value.type} and @code{api.location.type}.
@item Location(s): The parser header file and the parser implementation file
before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE}
@code{api.prefix}. With different @code{api.prefix}s it is guaranteed that
headers do not conflict when included together, and that compiled objects
can be linked together too. Specifying @samp{%define api.prefix
-@var{prefix}} (or passing the option @samp{-Dapi.prefix=@var{prefix}}, see
+@{@var{prefix}@}} (or passing the option @samp{-Dapi.prefix=@{@var{prefix}@}}, see
@ref{Invocation, ,Invoking Bison}) renames the interface functions and
variables of the Bison parser to start with @var{prefix} instead of
@samp{yy}, and all the macros to start by @var{PREFIX} (i.e., @var{prefix}
@code{YYSTYPE}, @code{YYLTYPE}, and @code{YYDEBUG}, which is treated
specifically --- more about this below.
-For example, if you use @samp{%define api.prefix c}, the names become
+For example, if you use @samp{%define api.prefix @{c@}}, the names become
@code{cparse}, @code{clex}, @dots{}, @code{CSTYPE}, @code{CLTYPE}, and so
on.
@code{YYDEBUG} (not renamed) is used as a default value:
@example
-/* Enabling traces. */
+/* Debug traces. */
#ifndef CDEBUG
# if defined YYDEBUG
# if YYDEBUG
exp: @dots{} @{ @dots{}; *randomness += 1; @dots{} @}
@end example
+@noindent
+Using the following:
+@example
+%parse-param @{int *randomness@}
+@end example
+
+Results in these signatures:
+@example
+void yyerror (int *randomness, const char *msg);
+int yyparse (int *randomness);
+@end example
+
+@noindent
+Or, if both @code{%define api.pure full} (or just @code{%define api.pure})
+and @code{%locations} are used:
+
+@example
+void yyerror (YYLTYPE *llocp, int *randomness, const char *msg);
+int yyparse (int *randomness);
+@end example
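A user-supplied @code{yyerror} matching the last signature might look like the sketch below. @code{YYLTYPE} is declared by hand here with the four members of Bison's default location type, so that the fragment compiles on its own; in a real parser it comes from the generated code. Incrementing @code{*randomness} on each error is an arbitrary choice for illustration, mirroring the @samp{*randomness += 1} action shown earlier.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the location type generated with %locations.  */
typedef struct YYLTYPE
{
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} YYLTYPE;

/* Signature obtained with %define api.pure full, %locations and
   %parse-param {int *randomness}: the location comes first, then the
   extra parse parameter, then the message.  */
void
yyerror (YYLTYPE *llocp, int *randomness, const char *msg)
{
  assert (llocp != NULL && randomness != NULL && msg != NULL);
  fprintf (stderr, "%d.%d: %s\n",
           llocp->first_line, llocp->first_column, msg);
  /* Illustrative only: record that an error was reported.  */
  *randomness += 1;
}
```

Because everything the function needs arrives through its arguments, it uses no global parser state, which is the point of a pure parser.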
+
@node Push Parser Function
@section The Push Parser Function @code{yypush_parse}
@findex yypush_parse
@samp{%define api.push-pull both} declaration is used.
@xref{Push Decl, ,A Push Parser}.
-@deftypefun int yypush_parse (yypstate *yyps)
+@deftypefun int yypush_parse (yypstate *@var{yyps})
The value returned by @code{yypush_parse} is the same as for yyparse with
the following exception: it returns @code{YYPUSH_MORE} if more input is
required to finish parsing the grammar.
declaration is used.
@xref{Push Decl, ,A Push Parser}.
-@deftypefun int yypull_parse (yypstate *yyps)
+@deftypefun int yypull_parse (yypstate *@var{yyps})
The value returned by @code{yypull_parse} is the same as for @code{yyparse}.
@end deftypefun
@samp{%define api.push-pull both} declaration is used.
@xref{Push Decl, ,A Push Parser}.
-@deftypefun void yypstate_delete (yypstate *yyps)
+@deftypefun void yypstate_delete (yypstate *@var{yyps})
This function will reclaim the memory associated with a parser instance.
After this call, you should no longer attempt to use the parser instance.
@end deftypefun
return 0;
@dots{}
if (c == '+' || c == '-')
- return c; /* Assume token type for `+' is '+'. */
+ return c; /* Assume token type for '+' is '+'. */
@dots{}
return INT; /* Return the type of the token. */
@dots{}
When you are using multiple data types, @code{yylval}'s type is a union
made from the @code{%union} declaration (@pxref{Union Decl, ,The
-Collection of Value Types}). So when you store a token's value, you
+Union Declaration}). So when you store a token's value, you
must use the proper member of the union. If the @code{%union}
declaration looks like this:
@node Pure Calling
@subsection Calling Conventions for Pure Parsers
-When you use the Bison declaration @samp{%define api.pure} to request a
+When you use the Bison declaration @code{%define api.pure full} to request a
pure, reentrant parser, the global communication variables @code{yylval}
and @code{yylloc} cannot be used. (@xref{Pure Decl, ,A Pure (Reentrant)
Parser}.) In such parsers the two global variables are replaced by
declarations, which is equivalent to repeating @code{%param}.
@end deffn
+@noindent
For instance:
@example
int yyparse (parser_mode *mode, environment_type *env);
@end example
-If @samp{%define api.pure} is added:
+If @samp{%define api.pure full} is added:
@example
int yylex (YYSTYPE *lvalp, scanner_mode *mode, environment_type *env);
@end example
@noindent
-and finally, if both @samp{%define api.pure} and @code{%locations} are used:
+and finally, if both @samp{%define api.pure full} and @code{%locations} are
+used:
@example
int yylex (YYSTYPE *lvalp, YYLTYPE *llocp,
immediately return 1.
Obviously, in location tracking pure parsers, @code{yyerror} should have
-an access to the current location.
-This is indeed the case for the GLR
-parsers, but not for the Yacc parser, for historical reasons. I.e., if
-@samp{%locations %define api.pure} is passed then the prototypes for
-@code{yyerror} are:
-
-@example
-void yyerror (char const *msg); /* Yacc parsers. */
-void yyerror (YYLTYPE *locp, char const *msg); /* GLR parsers. */
-@end example
-
-If @samp{%parse-param @{int *nastiness@}} is used, then:
-
-@example
-void yyerror (int *nastiness, char const *msg); /* Yacc parsers. */
-void yyerror (int *nastiness, char const *msg); /* GLR parsers. */
-@end example
+an access to the current location. With @code{%define api.pure}, this is
+indeed the case for the GLR parsers, but not for the Yacc parser, for
+historical reasons, and this is why @code{%define api.pure full} should be
+preferred over @code{%define api.pure}.
-Finally, GLR and Yacc parsers share the same @code{yyerror} calling
-convention for absolutely pure parsers, i.e., when the calling
-convention of @code{yylex} @emph{and} the calling convention of
-@samp{%define api.pure} are pure.
-I.e.:
+When @code{%locations %define api.pure full} is used, @code{yyerror} has the
+following signature:
@example
-/* Location tracking. */
-%locations
-/* Pure yylex. */
-%define api.pure
-%lex-param @{int *nastiness@}
-/* Pure yyparse. */
-%parse-param @{int *nastiness@}
-%parse-param @{int *randomness@}
-@end example
-
-@noindent
-results in the following signatures for all the parser kinds:
-
-@example
-int yylex (YYSTYPE *lvalp, YYLTYPE *llocp, int *nastiness);
-int yyparse (int *nastiness, int *randomness);
-void yyerror (YYLTYPE *locp,
- int *nastiness, int *randomness,
- char const *msg);
+void yyerror (YYLTYPE *locp, char const *msg);
@end example
@noindent
@end deffn
@deffn {Value} @@$
-@findex @@$
Acts like a structure variable containing information on the textual
location of the grouping made by the current rule. @xref{Tracking
Locations}.
@item
@cindex bison-i18n.m4
Into the directory containing the GNU Autoconf macros used
-by the package---often called @file{m4}---copy the
+by the package ---often called @file{m4}--- copy the
@file{bison-i18n.m4} file installed by Bison under
@samp{share/aclocal/bison-i18n.m4} in Bison's installation directory.
For example:
@example
@group
sequence:
- /* empty */ @{ printf ("empty sequence\n"); @}
+ %empty @{ printf ("empty sequence\n"); @}
| maybeword
| sequence word @{ printf ("added word %s\n", $2); @}
;
@group
maybeword:
- /* empty */ @{ printf ("empty maybeword\n"); @}
-| word @{ printf ("single word %s\n", $1); @}
+ %empty @{ printf ("empty maybeword\n"); @}
+| word @{ printf ("single word %s\n", $1); @}
;
@end group
@end example
@example
@group
sequence:
- /* empty */ @{ printf ("empty sequence\n"); @}
+ %empty @{ printf ("empty sequence\n"); @}
| sequence word @{ printf ("added word %s\n", $2); @}
;
@end group
@example
@group
sequence:
- /* empty */
+ %empty
| sequence words
| sequence redirects
;
@group
words:
- /* empty */
+ %empty
| words word
;
@end group
@group
redirects:
- /* empty */
+ %empty
| redirects redirect
;
@end group
@example
sequence:
- /* empty */
+ %empty
| sequence word
| sequence redirect
;
@example
@group
sequence:
- /* empty */
+ %empty
| sequence words
| sequence redirects
;
%%
@group
sequence:
- /* empty */
+ %empty
| sequence word %prec "sequence"
| sequence redirect %prec "sequence"
;
%%
@group
sequence:
- /* empty */
+ %empty
| sequence word %prec "word"
| sequence redirect %prec "redirect"
;
parser table construction algorithm by using the @code{%define lr.type}
directive.
-@deffn {Directive} {%define lr.type @var{TYPE}}
+@deffn {Directive} {%define lr.type} @var{type}
Specify the type of parser tables within the LR(1) family. The accepted
-values for @var{TYPE} are:
+values for @var{type} are:
@itemize
@item @code{lalr} (default)
To adjust which states have default reductions enabled, use the
@code{%define lr.default-reduction} directive.
-@deffn {Directive} {%define lr.default-reduction @var{WHERE}}
+@deffn {Directive} {%define lr.default-reduction} @var{where}
Specify the kind of states that are permitted to contain default reductions.
-The accepted values of @var{WHERE} are:
+The accepted values of @var{where} are:
@itemize
@item @code{most} (default for LALR and IELR)
@item @code{consistent}
sacrificing @code{%nonassoc}, default reductions, or state merging. You can
enable LAC with the @code{%define parse.lac} directive.
-@deffn {Directive} {%define parse.lac @var{VALUE}}
+@deffn {Directive} {%define parse.lac} @var{value}
Enable LAC to improve syntax error handling.
@itemize
@item @code{none} (default)
keeping unreachable states is sometimes useful when trying to understand the
relationship between the parser and the grammar.
-@deffn {Directive} {%define lr.keep-unreachable-state @var{VALUE}}
+@deffn {Directive} {%define lr.keep-unreachable-state} @var{value}
Request that Bison allow unreachable states to remain in the parser tables.
-@var{VALUE} must be a Boolean. The default is @code{false}.
+@var{value} must be a Boolean. The default is @code{false}.
@end deffn
There are a few caveats to consider:
@example
stmts:
- /* empty string */
+ %empty
| stmts '\n'
| stmts exp '\n'
| stmts error '\n'
Developing a parser can be a challenge, especially if you don't understand
the algorithm (@pxref{Algorithm, ,The Bison Parser Algorithm}). This
-chapter explains how to generate and read the detailed description of the
-automaton, and how to enable and understand the parser run-time traces.
+chapter explains how to understand and debug a parser.
+
+The first sections focus on the static part of the parser: its structure.
+They explain how to generate and read the detailed description of the
+automaton. There are several formats available:
+@itemize @minus
+@item
+as text, see @ref{Understanding, , Understanding Your Parser};
+
+@item
+as a graph, see @ref{Graphviz,, Visualizing Your Parser};
+
+@item
+or as a markup report that can be turned, for instance, into HTML, see
+@ref{Xml,, Visualizing your parser in multiple formats}.
+@end itemize
+
+The last section focuses on the dynamic part of the parser: how to enable
+and understand the parser run-time traces (@pxref{Tracing, ,Tracing Your
+Parser}).
@menu
* Understanding:: Understanding the structure of your parser.
As documented elsewhere (@pxref{Algorithm, ,The Bison Parser Algorithm})
Bison parsers are @dfn{shift/reduce automata}. In some cases (much more
frequent than one would hope), looking at this automaton is required to
-tune or simply fix a parser. Bison provides two different
-representation of it, either textually or graphically (as a DOT file).
+tune or simply fix a parser.
The textual file is generated when the options @option{--report} or
@option{--verbose} are specified, see @ref{Invocation, , Invoking
@example
%token NUM STR
+@group
%left '+' '-'
%left '*'
+@end group
%%
+@group
exp:
exp '+' exp
| exp '-' exp
| exp '/' exp
| NUM
;
+@end group
useless: STR;
%%
@end example
@example
calc.y: warning: 1 nonterminal useless in grammar
calc.y: warning: 1 rule useless in grammar
-calc.y:11.1-7: warning: nonterminal useless in grammar: useless
-calc.y:11.10-12: warning: rule useless in grammar: useless: STR
+calc.y:12.1-7: warning: nonterminal useless in grammar: useless
+calc.y:12.10-12: warning: rule useless in grammar: useless: STR
calc.y: conflicts: 7 shift/reduce
@end example
the location of the input cursor.
@example
-state 0
+State 0
0 $accept: . exp $end
@option{--report=itemset} to list the derived items as well:
@example
-state 0
+State 0
0 $accept: . exp $end
1 exp: . exp '+' exp
In the state 1@dots{}
@example
-state 1
+State 1
5 exp: NUM .
@noindent
the rule 5, @samp{exp: NUM;}, is completed. Whatever the lookahead token
(@samp{$default}), the parser will reduce it. If it was coming from
-state 0, then, after this reduction it will return to state 0, and will
+State 0, then, after this reduction it will return to state 0, and will
jump to state 2 (@samp{exp: go to state 2}).
@example
-state 2
+State 2
0 $accept: exp . $end
1 exp: exp . '+' exp
state}:
@example
-state 3
+State 3
0 $accept: exp $end .
the reader.
@example
-state 4
+State 4
1 exp: exp '+' . exp
exp go to state 8
-state 5
+State 5
2 exp: exp '-' . exp
exp go to state 9
-state 6
+State 6
3 exp: exp '*' . exp
exp go to state 10
-state 7
+State 7
4 exp: exp '/' . exp
1 shift/reduce}:
@example
-state 8
+State 8
1 exp: exp . '+' exp
1 | exp '+' exp .
@option{--report=lookahead}, Bison specifies these lookahead tokens:
@example
-state 8
+State 8
1 exp: exp . '+' exp
1 | exp '+' exp . [$end, '+', '-', '/']
@example
@group
-state 9
+State 9
1 exp: exp . '+' exp
2 | exp . '-' exp
@end group
@group
-state 10
+State 10
1 exp: exp . '+' exp
2 | exp . '-' exp
@end group
@group
-state 11
+State 11
1 exp: exp . '+' exp
2 | exp . '-' exp
@noindent
Observe that state 11 contains conflicts not only due to the lack of
-precedence of @samp{/} with respect to @samp{+}, @samp{-}, and
-@samp{*}, but also because the
-associativity of @samp{/} is not specified.
+precedence of @samp{/} with respect to @samp{+}, @samp{-}, and @samp{*}, but
+also because the associativity of @samp{/} is not specified.
-Note that Bison may also produce an HTML version of this output, via an XML
-file and XSLT processing (@pxref{Xml}).
+Bison may also produce an HTML version of this output, via an XML file and
+XSLT processing (@pxref{Xml,,Visualizing your parser in multiple formats}).
@c ================================================= Graphical Representation
(@pxref{Invocation, , Invoking Bison}). Its name is made by removing
@samp{.tab.c} or @samp{.c} from the parser implementation file name, and
adding @samp{.dot} instead. If the grammar file is @file{foo.y}, the
-Graphviz output file is called @file{foo.dot}.
+Graphviz output file is called @file{foo.dot}. A DOT file may also be
+produced via an XML file and XSLT processing (@pxref{Xml,,Visualizing your
+parser in multiple formats}).
+
The following grammar file, @file{rr.y}, will be used in the sequel:
@end group
@end example
-The graphical output is very similar to the textual one, and as such it is
-easier understood by making direct comparisons between them. See
-@ref{Debugging, , Debugging Your Parser} for a detailled analysis of the
-textual report.
+The graphical output
+@ifnotinfo
+(see @ref{fig:graph})
+@end ifnotinfo
+is very similar to the textual one, and as such it is easier understood by
+making direct comparisons between them. @xref{Debugging, , Debugging Your
+Parser}, for a detailed analysis of the textual report.
+
+@ifnotinfo
+@float Figure,fig:graph
+@image{figs/example, 430pt}
+@caption{A graphical rendering of the parser.}
+@end float
+@end ifnotinfo
@subheading Graphical Representation of States
@example
@group
-state 3
+State 3
1 exp: a . ";"
This is how reductions are represented in the verbose file @file{rr.output}:
@example
-state 1
+State 1
3 a: "0" . [";"]
4 b: "0" . ["."]
are distinguished by a red filling color on these nodes, just like how they are
reported between square brackets in the verbose file.
-The reduction corresponding to the rule number 0 is the acceptation state. It
-is shown as a blue diamond, labelled "Acc".
+The reduction corresponding to the rule number 0 is the acceptation
+state. It is shown as a blue diamond, labelled ``Acc''.
@subheading Graphical representation of go tos
The @samp{go to} jump transitions are represented as dotted lines bearing
the name of the rule being jumped to.
-Note that a DOT file may also be produced via an XML file and XSLT
-processing (@pxref{Xml}).
-
@c ================================================= XML
@node Xml
@cindex xml
Bison supports two major report formats: textual output
-(@pxref{Understanding}) when invoked with option @option{--verbose}, and DOT
-(@pxref{Graphviz}) when invoked with option @option{--graph}. However,
+(@pxref{Understanding, ,Understanding Your Parser}) when invoked
+with option @option{--verbose}, and DOT
+(@pxref{Graphviz,, Visualizing Your Parser}) when invoked with
+option @option{--graph}. However,
another alternative is to output an XML file that may then be, with
@command{xsltproc}, rendered as either a raw text format equivalent to the
verbose file, or as an HTML version of the same file, with clickable
XSLT have no difference whatsoever with those obtained by invoking
@command{bison} with options @option{--verbose} or @option{--graph}.
-The textual file is generated when the options @option{-x} or
+The XML file is generated when the options @option{-x} or
@option{--xml[=FILE]} are specified, see @ref{Invocation,,Invoking Bison}.
If not specified, its name is made by removing @samp{.tab.c} or @samp{.c}
from the parser implementation file name, and adding @samp{.xml} instead.
@item xml2dot.xsl
Used to output a copy of the DOT visualization of the automaton.
@item xml2text.xsl
-Used to output a copy of the .output file.
+Used to output a copy of the @samp{.output} file.
@item xml2xhtml.xsl
-Used to output an xhtml enhancement of the .output file.
+Used to output an xhtml enhancement of the @samp{.output} file.
@end table
-Sample usage (requires @code{xsltproc}):
+Sample usage (requires @command{xsltproc}):
@example
-$ bison -x input.y
+$ bison -x gr.y
@group
$ bison --print-datadir
/usr/local/share/bison
@end group
-$ xsltproc /usr/local/share/bison/xslt/xml2xhtml.xsl input.xml > input.html
+$ xsltproc /usr/local/share/bison/xslt/xml2xhtml.xsl gr.xml >gr.html
@end example
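+
+The other stylesheets listed above can presumably be applied the same way.
+As a sketch, assuming the same installation directory as above and that
+Graphviz's @command{dot} is installed, the automaton could be rendered as
+an image as follows:
+
+@example
+$ xsltproc /usr/local/share/bison/xslt/xml2dot.xsl gr.xml >gr.dot
+$ dot -Tpng gr.dot >gr.png
+@end example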
@c ================================================= Tracing
@item the option @option{-t} (POSIX Yacc compliant)
@itemx the option @option{--debug} (Bison extension)
Use the @samp{-t} option when you run Bison (@pxref{Invocation, ,Invoking
-Bison}). With @samp{%define api.prefix c}, it defines @code{CDEBUG} to 1,
+Bison}). With @samp{%define api.prefix @{c@}}, it defines @code{CDEBUG} to 1,
otherwise it defines @code{YYDEBUG} to 1.
@item the directive @samp{%debug}
/* Formatting semantic values. */
%printer @{ fprintf (yyoutput, "%s", $$->name); @} VAR;
%printer @{ fprintf (yyoutput, "%s()", $$->name); @} FNCT;
-%printer @{ fprintf (yyoutput, "%g", $$); @} <val>;
+%printer @{ fprintf (yyoutput, "%g", $$); @} <double>;
@end example
The @code{%define} directive instructs Bison to generate run-time trace
The set of @code{%printer} directives demonstrates how to format the
semantic value in the traces. Note that the specification can be done
either on the symbol type (e.g., @code{VAR} or @code{FNCT}), or on the type
-tag: since @code{<val>} is the type for both @code{NUM} and @code{exp}, this
-printer will be used for them.
+tag: since @code{<double>} is the type for both @code{NUM} and @code{exp},
+this printer will be used for them.
Here is a sample of the information provided by run-time traces. The traces
are sent onto standard error.
@noindent
The previous reduction demonstrates the @code{%printer} directive for
-@code{<val>}: both the token @code{NUM} and the resulting non-terminal
+@code{<double>}: both the token @code{NUM} and the resulting nonterminal
@code{exp} have @samp{1} as value.
@example
short option. It is followed by a cross key alphabetized by long
option.
-@c Please, keep this ordered as in `bison --help'.
+@c Please, keep this ordered as in 'bison --help'.
@noindent
Operations modes:
@table @option
Deprecated constructs whose support will be removed in future versions of
Bison.
+@item empty-rule
+Empty rules without @code{%empty}. @xref{Empty Rules}. Disabled by
+default, but enabled by uses of @code{%empty}, unless
+@option{-Wno-empty-rule} was specified.
+
+@item precedence
+Useless precedence and associativity directives. Disabled by default.
+
+Consider for instance the following grammar:
+
+@example
+@group
+%nonassoc "="
+%left "+"
+%left "*"
+%precedence "("
+@end group
+%%
+@group
+stmt:
+ exp
+| "var" "=" exp
+;
+@end group
+
+@group
+exp:
+ exp "+" exp
+| exp "*" "num"
+| "(" exp ")"
+| "num"
+;
+@end group
+@end example
+
+Bison reports:
+
+@c cannot leave the location and the [-Wprecedence] for lack of
+@c width in PDF.
+@example
+@group
+warning: useless precedence and associativity for "="
+ %nonassoc "="
+ ^^^
+@end group
+@group
+warning: useless associativity for "*", use %precedence
+ %left "*"
+ ^^^
+@end group
+@group
+warning: useless precedence for "("
+ %precedence "("
+ ^^^
+@end group
+@end example
+
+One would get the exact same parser with the following directives instead:
+
+@example
+@group
+%left "+"
+%precedence "*"
+@end group
+@end example
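+
+Since this category is disabled by default, these diagnostics have to be
+requested explicitly.  A minimal sketch, assuming the grammar above is saved
+under the hypothetical name @file{prec.y}:
+
+@example
+$ bison -Wprecedence prec.y
+@end example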
+
@item other
All warnings not categorized above. These warnings are enabled by default.
categories.
@item all
-All the warnings.
+All the warnings except @code{yacc}.
+
@item none
Turn off all the warnings.
+
@item error
See @option{-Werror}, below.
@end table
instance, @option{-Wno-yacc} will hide the warnings about
POSIX Yacc incompatibilities.
-@item -Werror[=@var{category}]
-@itemx -Wno-error[=@var{category}]
-Enable warnings falling in @var{category}, and treat them as errors. If no
-@var{category} is given, it defaults to making all enabled warnings into errors.
+@item -Werror
+Turn enabled warnings for every @var{category} into errors, unless they are
+explicitly disabled by @option{-Wno-error=@var{category}}.
+
+@item -Werror=@var{category}
+Enable warnings falling in @var{category}, and treat them as errors.
@var{category} is the same as for @option{--warnings}, with the exception that
it may not be prefixed with @samp{no-} (see above).
-Prefixed with @samp{no}, it deactivates the error treatment for this
-@var{category}. However, the warning itself won't be disabled, or enabled, by
-this option.
-
Note that the precedence of the @samp{=} and @samp{,} operators is such that
the following commands are @emph{not} equivalent, as the first will not treat
S/R conflicts as errors.
$ bison -Werror=yacc,conflicts-sr input.y
$ bison -Werror=yacc,error=conflicts-sr input.y
@end example
+
+@item -Wno-error
+Do not turn enabled warnings for every @var{category} into errors, unless
+they are explicitly enabled by @option{-Werror=@var{category}}.
+
+@item -Wno-error=@var{category}
+Deactivate the error treatment for this @var{category}. However, the warning
+itself won't be disabled, or enabled, by this option.
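+
+For instance, assuming a grammar file @file{input.y}, the following
+invocation should treat every enabled warning as an error except S/R
+conflicts, which remain plain warnings:
+
+@example
+$ bison -Werror -Wno-error=conflicts-sr input.y
+@end example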
+
+@item -f [@var{feature}]
+@itemx --feature[=@var{feature}]
+Activate miscellaneous @var{feature}. @var{feature} can be one of:
+@table @code
+@item caret
+@itemx diagnostics-show-caret
+Show caret errors, in a manner similar to GCC's
+@option{-fdiagnostics-show-caret}, or Clang's @option{-fcaret-diagnostics}.  The
+location provided with the message is used to quote the corresponding line of
+the source file, underlining the important part of it with carets (^). Here is
+an example, using the following file @file{in.y}:
+
+@example
+%type <ival> exp
+%%
+exp: exp '+' exp @{ $exp = $1 + $2; @};
+@end example
+
+When invoked with @option{-fcaret} (or nothing), Bison will report:
+
+@example
+@group
+in.y:3.20-23: error: ambiguous reference: '$exp'
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^^^
+@end group
+@group
+in.y:3.1-3: refers to: $exp at $$
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^^
+@end group
+@group
+in.y:3.6-8: refers to: $exp at $1
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^^
+@end group
+@group
+in.y:3.14-16: refers to: $exp at $3
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^^
+@end group
+@group
+in.y:3.32-33: error: $2 of 'exp' has no declared type
+ exp: exp '+' exp @{ $exp = $1 + $2; @};
+ ^^
+@end group
+@end example
+
+Whereas, when invoked with @option{-fno-caret}, Bison will only report:
+
+@example
+@group
+in.y:3.20-23: error: ambiguous reference: '$exp'
+in.y:3.1-3:       refers to: $exp at $$
+in.y:3.6-8:       refers to: $exp at $1
+in.y:3.14-16:     refers to: $exp at $3
+in.y:3.32-33: error: $2 of 'exp' has no declared type
+@end group
+@end example
+
+This option is activated by default.
+
+@end table
@end table
@noindent
Summary}). Currently supported languages include C, C++, and Java.
@var{language} is case-insensitive.
-This option is experimental and its effect may be modified in future
-releases.
-
@item --locations
Pretend that @code{%locations} was specified. @xref{Decl Summary}.
@subsubsection C++ Unions
The @code{%union} directive works as for C, see @ref{Union Decl, ,The
-Collection of Value Types}. In particular it produces a genuine
+Union Declaration}. In particular it produces a genuine
@code{union}, which have a few specific features in C++.
@itemize @minus
@item
@node C++ Variants
@subsubsection C++ Variants
-Starting with version 2.6, Bison provides a @emph{variant} based
-implementation of semantic values for C++. This alleviates all the
-limitations reported in the previous section, and in particular, object
-types can be used without pointers.
+Bison provides a @emph{variant} based implementation of semantic values for
+C++. This alleviates all the limitations reported in the previous section,
+and in particular, object types can be used without pointers.
To enable variant-based semantic values, set @code{%define} variable
@code{variant} (@pxref{%define Summary,, variant}). Once this defined,
type on all platforms, alignments are enforced for @code{double} whatever
types are actually used. This may waste space in some cases.
-@item
-Our implementation is not conforming with strict aliasing rules. Alias
-analysis is a technique used in optimizing compilers to detect when two
-pointers are disjoint (they cannot ``meet''). Our implementation breaks
-some of the rules that G++ 4.4 uses in its alias analysis, so @emph{strict
-alias analysis must be disabled}. Use the option
-@option{-fno-strict-aliasing} to compile the generated parser.
-
@item
There might be portability issues we are not aware of.
@end itemize
The line, starting at 1.
@end deftypeivar
-@deftypemethod {position} {uint} lines (int @var{height} = 1)
-Advance by @var{height} lines, resetting the column number.
+@deftypemethod {position} {void} lines (int @var{height} = 1)
+If @var{height} is nonzero, advance by @var{height} lines, resetting the
+column number. The resulting line number cannot be less than 1.
@end deftypemethod
@deftypeivar {position} {uint} column
The column, starting at 1.
@end deftypeivar
-@deftypemethod {position} {uint} columns (int @var{width} = 1)
-Advance by @var{width} columns, without changing the line number.
+@deftypemethod {position} {void} columns (int @var{width} = 1)
+Advance by @var{width} columns, without changing the line number. The
+resulting column number cannot be less than 1.
@end deftypemethod
@deftypemethod {position} {position&} operator+= (int @var{width})
The first, inclusive, position of the range, and the first beyond.
@end deftypeivar
-@deftypemethod {location} {uint} columns (int @var{width} = 1)
-@deftypemethodx {location} {uint} lines (int @var{height} = 1)
-Advance the @code{end} position.
+@deftypemethod {location} {void} columns (int @var{width} = 1)
+@deftypemethodx {location} {void} lines (int @var{height} = 1)
+Forwarded to the @code{end} position.
@end deftypemethod
@deftypemethod {location} {location} operator+ (const location& @var{end})
@deftypemethodx {location} {location} operator+ (int @var{width})
@deftypemethodx {location} {location} operator+= (int @var{width})
+@deftypemethodx {location} {location} operator- (int @var{width})
+@deftypemethodx {location} {location} operator-= (int @var{width})
Various forms of syntactic sugar.
@end deftypemethod
@code{api.location.type} to specify your own type:
@example
-%define api.location.type @var{LocationType}
+%define api.location.type @{@var{LocationType}@}
@end example
The requirements over your @var{LocationType} are:
@example
%defines
%locations
-%define namespace "master::"
+%define api.namespace @{master::@}
@end example
@noindent
files, reused by other parsers as follows:
@example
-%define api.location.type "master::location"
+%define api.location.type @{master::location@}
%code requires @{ #include <master/location.hh> @}
@end example
The output files @file{@var{output}.hh} and @file{@var{output}.cc}
declare and define the parser class in the namespace @code{yy}. The
class name defaults to @code{parser}, but may be changed using
-@samp{%define parser_class_name "@var{name}"}. The interface of
+@samp{%define parser_class_name @{@var{name}@}}. The interface of
this class is detailed below. It can be extended using the
@code{%parse-param} feature: its semantics is slightly changed since
it describes an additional member of the parser class, and an
@node Split Symbols
@subsubsection Split Symbols
-Therefore the interface is as follows.
+The interface is as follows.
@deftypemethod {parser} {int} yylex (semantic_type* @var{yylval}, location_type* @var{yylloc}, @var{type1} @var{arg1}, ...)
@deftypemethodx {parser} {int} yylex (semantic_type* @var{yylval}, @var{type1} @var{arg1}, ...)
@node Complete Symbols
@subsubsection Complete Symbols
-If you specified both @code{%define variant} and
+If you specified both @code{%define api.value.type variant} and
@code{%define api.token.constructor},
the @code{parser} class also defines the class @code{parser::symbol_type}
which defines a @emph{complete} symbol, aggregating its type (i.e., the
For instance, given the following declarations:
@example
-%define api.token.prefix "TOK_"
+%define api.token.prefix @{TOK_@}
%token <std::string> IDENTIFIER;
%token <int> INTEGER;
%token COLON;
%skeleton "lalr1.cc" /* -*- C++ -*- */
%require "@value{VERSION}"
%defines
-%define parser_class_name "calcxx_parser"
+%define parser_class_name @{calcxx_parser@}
@end example
@noindent
@findex %define api.token.constructor
-@findex %define variant
+@findex %define api.value.type variant
This example will use genuine C++ objects as semantic values, therefore, we
require the variant-based interface. To make sure we properly use it, we
enable assertions. To fully benefit from type-safety and more natural
@comment file: calc++-parser.yy
@example
%define api.token.constructor
+%define api.value.type variant
%define parse.assert
-%define variant
@end example
@noindent
@comment file: calc++-parser.yy
@example
-%define api.token.prefix "TOK_"
+%define api.token.prefix @{TOK_@}
%token
END 0 "end of file"
ASSIGN ":="
unit: assignments exp @{ driver.result = $2; @};
assignments:
- /* Nothing. */ @{@}
+ %empty @{@}
| assignments assignment @{@};
assignment:
@comment file: calc++-scanner.ll
@example
-%option noyywrap nounput batch debug
+%option noyywrap nounput batch debug noinput
@end example
@noindent
* Java Parser Interface:: Instantiating and running the parser
* Java Scanner Interface:: Specifying the scanner for the parser
* Java Action Features:: Special features for use in actions
+* Java Push Parser Interface:: Instantiating and running a push parser
* Java Differences:: Differences between C/C++ and Java Grammars
* Java Declarations Summary:: List of Bison declarations used with Java
@end menu
Contrary to C parsers, Java parsers do not use global variables; the
state of the parser is always local to an instance of the parser class.
Therefore, all Java parsers are ``pure'', and the @code{%pure-parser}
-and @samp{%define api.pure} directives does not do anything when used in
-Java.
+and @code{%define api.pure} directives do nothing when used in Java.
Push parsers are currently unsupported in Java and @code{%define
api.push-pull} have no effect.
By default, the semantic stack is declared to have @code{Object} members,
which means that the class types you specify can be of any class.
To improve the type safety of the parser, you can declare the common
-superclass of all the semantic values using the @samp{%define stype}
+superclass of all the semantic values using the @samp{%define api.value.type}
directive. For example, after the following declaration:
@example
-%define stype "ASTNode"
+%define api.value.type @{ASTNode@}
@end example
@noindent
defines a class representing a @dfn{location}, a range composed of a pair of
positions (possibly spanning several files). The location class is an inner
class of the parser; the name is @code{Location} by default, and may also be
-renamed using @code{%define api.location.type "@var{class-name}"}.
+renamed using @code{%define api.location.type @{@var{class-name}@}}.
The location class treats the position as a completely opaque value.
By default, the class name is @code{Position}, but this can be changed
-with @code{%define api.position.type "@var{class-name}"}. This class must
+with @code{%define api.position.type @{@var{class-name}@}}. This class must
be supplied by the user.
The name of the generated parser class defaults to @code{YYParser}. The
@code{YY} prefix may be changed using the @code{%name-prefix} directive
or the @option{-p}/@option{--name-prefix} option. Alternatively, use
-@samp{%define parser_class_name "@var{name}"} to give a custom name to
+@samp{%define parser_class_name @{@var{name}@}} to give a custom name to
the class. The interface of this class is detailed below.
By default, the parser class has package visibility. A declaration
file should match the name of the class in this case. Similarly, you can
use @code{abstract}, @code{final} and @code{strictfp} with the
@code{%define} declaration to add other modifiers to the parser class.
-A single @samp{%define annotations "@var{annotations}"} directive can
+A single @samp{%define annotations @{@var{annotations}@}} directive can
be used to add any number of annotations to the parser class.
The Java package name of the parser class can be specified using the
@deftypemethod {Lexer} {void} yyerror (Location @var{loc}, String @var{msg})
This method is defined by the user to emit an error message. The first
parameter is omitted if location tracking is not active. Its type can be
-changed using @code{%define api.location.type "@var{class-name}".}
+changed using @code{%define api.location.type @{@var{class-name}@}}.
@end deftypemethod
@deftypemethod {Lexer} {int} yylex ()
methods are not needed unless location tracking is active.
The return type can be changed using @code{%define api.position.type
-"@var{class-name}".}
+@{@var{class-name}@}}.
@end deftypemethod
@deftypemethod {Lexer} {Object} getLVal ()
Return the semantic value of the last token that yylex returned.
-The return type can be changed using @samp{%define stype
-"@var{class-name}".}
+The return type can be changed using @samp{%define api.value.type
+@{@var{class-name}@}}.
@end deftypemethod
-
@node Java Action Features
@subsection Special Features for Use in Java Actions
@defvar $$
The semantic value for the grouping made by the current rule. As a
value, this is in the base type (@code{Object} or as specified by
-@samp{%define stype}) as in not cast to the declared subtype because
+@samp{%define api.value.type}) and is not cast to the declared subtype because
casts are not allowed on the left-hand side of Java assignments.
Use an explicit Java cast if the correct subtype is needed.
@xref{Java Semantic Values}.
available only if location tracking is active.
@end deftypefn
+@node Java Push Parser Interface
+@subsection Java Push Parser Interface
+@c - define push_parse
+@findex %define api.push-pull
+
+(The current push parsing interface is experimental and may evolve. More
+user feedback will help to stabilize it.)
+
+Normally, Bison generates a pull parser for Java.
+The following Bison declaration says that you want the parser to be a push
+parser (@pxref{%define Summary,,api.push-pull}):
+
+@example
+%define api.push-pull push
+@end example
+
+Most of the discussion about the Java pull parser interface (@pxref{Java
+Parser Interface}) applies to the push parser interface as well.
+
+When generating a push parser, the method @code{push_parse} is created with
+one of the following signatures (depending on whether locations are enabled).
+
+@deftypemethod {YYParser} {int} push_parse ({int} @var{token}, {Object} @var{yylval})
+@deftypemethodx {YYParser} {int} push_parse ({int} @var{token}, {Object} @var{yylval}, {Location} @var{yyloc})
+@deftypemethodx {YYParser} {int} push_parse ({int} @var{token}, {Object} @var{yylval}, {Position} @var{yypos})
+@end deftypemethod
+
+The primary difference with respect to a pull parser is that the parser
+method @code{push_parse} is invoked repeatedly to parse each token. This
+method is available if either the @samp{%define api.push-pull push} or
+@samp{%define api.push-pull both} declaration is used (@pxref{%define
+Summary,,api.push-pull}). The @code{Location} and @code{Position}
+parameters are available only if location tracking is active.
+
+The value returned by the @code{push_parse} method is one of the following
+four constants: @code{YYABORT}, @code{YYACCEPT}, @code{YYERROR}, or
+@code{YYPUSH_MORE}. This new value, @code{YYPUSH_MORE}, may be returned if
+more input is required to finish parsing the grammar.
+
+If @code{api.push-pull} is declared as @code{both}, then the generated parser class
+will also implement the @code{parse} method. This method's body is a loop
+that repeatedly invokes the scanner and then passes the values obtained from
+the scanner to the @code{push_parse} method.
+
+There is one additional complication. Technically, the push parser does not
+need to know about the scanner (i.e., an object implementing the
+@code{YYParser.Lexer} interface), but it does need access to the
+@code{yyerror} method. Currently, the @code{yyerror} method is defined in
+the @code{YYParser.Lexer} interface. Hence, an implementation of that
+interface is still required in order to provide an implementation of
+@code{yyerror}. The current approach (and subject to change) is to require
+the @code{YYParser} constructor to be given an object implementing the
+@code{YYParser.Lexer} interface. This object need only implement the
+@code{yyerror} method; the other methods can be stubbed since they will
+never be invoked. The simplest way to do this is to add a trivial scanner
+implementation to your grammar file using whatever implementation of
+@code{yyerror} is desired. The following code sample shows a simple way to
+accomplish this.
+
+@example
+%code lexer
+@{
+ public Object getLVal () @{return null;@}
+ public int yylex () @{return 0;@}
+ public void yyerror (String s) @{System.err.println(s);@}
+@}
+@end example
@node Java Differences
@subsection Differences between C/C++ and Java Grammars
@item
Java lacks unions, so @code{%union} has no effect. Instead, semantic
values have a common base type: @code{Object} or as specified by
-@samp{%define stype}. Angle brackets on @code{%token}, @code{type},
+@samp{%define api.value.type}. Angle brackets on @code{%token}, @code{type},
@code{$@var{n}} and @code{$$} specify subtypes rather than fields of
an union. The type of @code{$$}, even with angle brackets, is the base
type since Java casts are not allow on the left-hand side of assignments.
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define annotations} "@var{annotations}"
+@deffn {Directive} {%define annotations} @{@var{annotations}@}
The Java annotations for the parser class. Default is none.
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define extends} "@var{superclass}"
+@deffn {Directive} {%define extends} @{@var{superclass}@}
The superclass of the parser class. Default is none.
@xref{Java Bison Interface}.
@end deffn
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define implements} "@var{interfaces}"
+@deffn {Directive} {%define implements} @{@var{interfaces}@}
The implemented interfaces of the parser class, a comma-separated list.
Default is none.
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define init_throws} "@var{exceptions}"
+@deffn {Directive} {%define init_throws} @{@var{exceptions}@}
The exceptions thrown by @code{%code init} from the parser class
constructor. Default is none.
@xref{Java Parser Interface}.
@end deffn
-@deffn {Directive} {%define lex_throws} "@var{exceptions}"
+@deffn {Directive} {%define lex_throws} @{@var{exceptions}@}
The exceptions thrown by the @code{yylex} method of the lexer, a
comma-separated list. Default is @code{java.io.IOException}.
@xref{Java Scanner Interface}.
@end deffn
-@deffn {Directive} {%define api.location.type} "@var{class}"
+@deffn {Directive} {%define api.location.type} @{@var{class}@}
The name of the class used for locations (a range between two
positions). This class is generated as an inner class of the parser
class by @command{bison}. Default is @code{Location}.
@xref{Java Location Values}.
@end deffn
-@deffn {Directive} {%define package} "@var{package}"
+@deffn {Directive} {%define package} @{@var{package}@}
The package to put the parser class in. Default is none.
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define parser_class_name} "@var{name}"
+@deffn {Directive} {%define parser_class_name} @{@var{name}@}
The name of the parser class. Default is @code{YYParser} or
@code{@var{name-prefix}Parser}.
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define api.position.type} "@var{class}"
+@deffn {Directive} {%define api.position.type} @{@var{class}@}
The name of the class used for positions. This class must be supplied by
the user. Default is @code{Position}.
Formerly named @code{position_type}.
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define stype} "@var{class}"
+@deffn {Directive} {%define api.value.type} @{@var{class}@}
The base type of semantic values. Default is @code{Object}.
@xref{Java Semantic Values}.
@end deffn
@xref{Java Bison Interface}.
@end deffn
-@deffn {Directive} {%define throws} "@var{exceptions}"
+@deffn {Directive} {%define throws} @{@var{exceptions}@}
The exceptions thrown by user-supplied parser actions and
@code{%initial-action}, a comma-separated list. Default is none.
@xref{Java Parser Interface}.
@quotation
My parser includes support for an @samp{#include}-like feature, in
which case I run @code{yyparse} from @code{yyparse}. This fails
-although I did specify @samp{%define api.pure}.
+although I did specify @samp{%define api.pure full}.
@end quotation
These problems typically come not from Bison itself, but from
version. If you have trouble compiling, you should also include a
transcript of the build session, starting with the invocation of
`configure'. Depending on the nature of the bug, you may be asked to
-send additional files as well (such as `config.h' or `config.cache').
+send additional files as well (such as @file{config.h} or @file{config.cache}).
Patches are most welcome, but not required. That is, do not hesitate to
send a bug report just because you cannot provide a fix.
@end deffn
@deffn {Variable} @@@var{n}
+@deffnx {Symbol} @@@var{n}
In an action, the location of the @var{n}-th symbol of the right-hand side
of the rule. @xref{Tracking Locations}.
+
+In a grammar, the Bison-generated nonterminal symbol for a mid-rule action
+with a semantic value.  @xref{Mid-Rule Action Translation}.
@end deffn
@deffn {Variable} @@@var{name}
-In an action, the location of a symbol addressed by name. @xref{Tracking
-Locations}.
+@deffnx {Variable} @@[@var{name}]
+In an action, the location of a symbol addressed by @var{name}.
+@xref{Tracking Locations}.
@end deffn
-@deffn {Variable} @@[@var{name}]
-In an action, the location of a symbol addressed by name. @xref{Tracking
-Locations}.
+@deffn {Symbol} $@@@var{n}
+In a grammar, the Bison-generated nonterminal symbol for a mid-rule action
+with no semantic value.  @xref{Mid-Rule Action Translation}.
@end deffn
@deffn {Variable} $$
@end deffn
@deffn {Variable} $@var{name}
-In an action, the semantic value of a symbol addressed by name.
-@xref{Actions}.
-@end deffn
-
-@deffn {Variable} $[@var{name}]
-In an action, the semantic value of a symbol addressed by name.
+@deffnx {Variable} $[@var{name}]
+In an action, the semantic value of a symbol addressed by @var{name}.
@xref{Actions}.
@end deffn
feature.
@end deffn
-@deffn {Construct} /*@dots{}*/
-Comment delimiters, as in C.
+@deffn {Construct} /* @dots{} */
+@deffnx {Construct} // @dots{}
+Comments, as in C/C++.
@end deffn
@deffn {Delimiter} :
@deffn {Directive} %define @var{variable}
@deffnx {Directive} %define @var{variable} @var{value}
+@deffnx {Directive} %define @var{variable} @{@var{value}@}
@deffnx {Directive} %define @var{variable} "@var{value}"
Define a variable to adjust Bison's behavior. @xref{%define Summary}.
@end deffn
GLR Parsers}.
@end deffn
+@deffn {Directive} %empty
+Bison declaration to make explicit that a rule has an empty
+right-hand side. @xref{Empty Rules}.
+@end deffn
+
@deffn {Symbol} $end
The predefined token marking the end of the token stream. It cannot be
used in the grammar.
@code{yypstate_new} and @code{yypstate_delete} will also be renamed. For
example, if you use @samp{%name-prefix "c_"}, the names become
@code{c_parse}, @code{c_lex}, and so on. For C++ parsers, see the
-@code{%define namespace} documentation in this section.
+@code{%define api.namespace} documentation in this section.
@end deffn
@deffn {Directive} %union
Bison declaration to specify several possible data types for semantic
-values. @xref{Union Decl, ,The Collection of Value Types}.
+values. @xref{Union Decl, ,The Union Declaration}.
@end deffn
@deffn {Macro} YYABORT
@code{yylex}}.
@end deffn
-@deffn {Macro} YYLEX_PARAM
-An obsolete macro for specifying an extra argument (or list of extra
-arguments) for @code{yyparse} to pass to @code{yylex}. The use of this
-macro is deprecated, and is supported only for Yacc like parsers.
-@xref{Pure Calling,, Calling Conventions for Pure Parsers}.
-@end deffn
-
@deffn {Variable} yylloc
External variable in which @code{yylex} should place the line and column
numbers associated with a token. (In a pure parser, it is a local
@deffn {Variable} yynerrs
Global variable which Bison increments each time it reports a syntax error.
(In a pure parser, it is a local variable within @code{yyparse}. In a
-pure push parser, it is a member of yypstate.)
+pure push parser, it is a member of @code{yypstate}.)
@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
@end deffn
@end deffn
@deffn {Type} YYSTYPE
+Deprecated in favor of the @code{%define} variable @code{api.value.type}.
Data type of semantic values; @code{int} by default.
@xref{Value Type, ,Data Types of Semantic Values}.
@end deffn
@item Accepting state
A state whose only action is the accept action.
The accepting state is thus a consistent state.
-@xref{Understanding,,}.
+@xref{Understanding, ,Understanding Your Parser}.
@item Backus-Naur Form (BNF; also called ``Backus Normal Form'')
Formal method of specifying context-free grammars originally proposed