* Value Type:: Specifying one data type for all semantic values.
* Multiple Types:: Specifying several alternative data types.
+* Type Generation:: Generating the semantic value type.
+* Union Decl:: Declaring the set of all semantic value types.
+* Structured Value Type:: Providing a structured semantic value type.
* Actions:: An action is the semantic definition of a grammar rule.
* Action Types:: Specifying data types for actions to operate on.
* Mid-Rule Actions:: Most actions go at the end of a rule.
* Require Decl:: Requiring a Bison version.
* Token Decl:: Declaring terminal symbols.
* Precedence Decl:: Declaring terminals with precedence and associativity.
-* Union Decl:: Declaring the set of all semantic value types.
* Type Decl:: Declaring the choice of type for a nonterminal symbol.
* Initial Action Decl:: Code run before parsing starts.
* Destructor Decl:: Declaring how symbols are freed.
@group
%@{
- #define YYSTYPE double
#include <stdio.h>
#include <math.h>
int yylex (void);
%@}
@end group
+%define api.value.type @{double@}
%token NUM
%% /* Grammar rules and actions follow. */
The declarations section (@pxref{Prologue, , The prologue}) contains two
preprocessor directives and two forward declarations.
-The @code{#define} directive defines the macro @code{YYSTYPE}, thus
-specifying the C data type for semantic values of both tokens and
-groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The
-Bison parser will use whatever type @code{YYSTYPE} is defined as; if you
-don't define it, @code{int} is the default. Because we specify
-@code{double}, each token and each expression has an associated value,
-which is a floating point number.
-
The @code{#include} directive is used to declare the exponentiation
function @code{pow}.
epilogue, but the parser calls them so they must be declared in the
prologue.
-The second section, Bison declarations, provides information to Bison
-about the token types (@pxref{Bison Declarations, ,The Bison
-Declarations Section}). Each terminal symbol that is not a
-single-character literal must be declared here. (Single-character
-literals normally don't need to be declared.) In this example, all the
-arithmetic operators are designated by single-character literals, so the
-only terminal symbol that needs to be declared is @code{NUM}, the token
-type for numeric constants.
+The second section, Bison declarations, provides information to Bison about
+the tokens and their types (@pxref{Bison Declarations, ,The Bison
+Declarations Section}).
+
+The @code{%define} directive defines the variable @code{api.value.type},
+thus specifying the C data type for semantic values of both tokens and
+groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The Bison
+parser will use whatever type @code{api.value.type} is defined as; if you
+don't define it, @code{int} is the default. Because we specify
+@samp{@{double@}}, each token and each expression has an associated value,
+which is a floating point number. C code can use @code{YYSTYPE} to refer to
+the value @code{api.value.type}.
+
+Each terminal symbol that is not a single-character literal must be
+declared. (Single-character literals normally don't need to be declared.)
+In this example, all the arithmetic operators are designated by
+single-character literals, so the only terminal symbol that needs to be
+declared is @code{NUM}, the token type for numeric constants.
@node Rpcalc Rules
@subsection Grammar Rules for @code{rpcalc}
The semantic value of the token (if it has one) is stored into the
global variable @code{yylval}, which is where the Bison parser will look
-for it. (The C data type of @code{yylval} is @code{YYSTYPE}, which was
-defined at the beginning of the grammar; @pxref{Rpcalc Declarations,
-,Declarations for @code{rpcalc}}.)
+for it. (The C data type of @code{yylval} is @code{YYSTYPE}, whose value
+was defined at the beginning of the grammar via @samp{%define api.value.type
+@{double@}}; @pxref{Rpcalc Declarations,,Declarations for @code{rpcalc}}.)
A token type code of zero is returned if the end-of-input is encountered.
(Bison recognizes any nonpositive value as indicating end-of-input.)
@group
%@{
- #define YYSTYPE double
#include <math.h>
#include <stdio.h>
int yylex (void);
@group
/* Bison declarations. */
+%define api.value.type @{double@}
%token NUM
%left '-' '+'
%left '*' '/'
/* Location tracking calculator. */
%@{
- #define YYSTYPE int
#include <math.h>
int yylex (void);
void yyerror (char const *);
%@}
/* Bison declarations. */
+%define api.value.type int
%token NUM
%left '-' '+'
%@}
@end group
-@group
-%union @{
- double val; /* For returning numbers. */
- symrec *tptr; /* For returning symbol-table pointers. */
-@}
-@end group
-%token <val> NUM /* Simple double precision number. */
-%token <tptr> VAR FNCT /* Variable and function. */
-%type <val> exp
+%define api.value.type union /* Generate YYSTYPE from these types: */
+%token <double> NUM /* Simple double precision number. */
+%token <symrec*> VAR FNCT /* Symbol table pointer: variable and function. */
+%type <double> exp
@group
%precedence '='
These features allow semantic values to have various data types
(@pxref{Multiple Types, ,More Than One Value Type}).
-The @code{%union} declaration specifies the entire list of possible types;
-this is instead of defining @code{YYSTYPE}. The allowable types are now
-double-floats (for @code{exp} and @code{NUM}) and pointers to entries in
-the symbol table. @xref{Union Decl, ,The Collection of Value Types}.
-
-Since values can now have various types, it is necessary to associate a
-type with each grammar symbol whose semantic value is used. These symbols
-are @code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their
-declarations are augmented with information about their data type (placed
-between angle brackets).
-
-The Bison construct @code{%type} is used for declaring nonterminal
-symbols, just as @code{%token} is used for declaring token types. We
-have not used @code{%type} before because nonterminal symbols are
-normally declared implicitly by the rules that define them. But
-@code{exp} must be declared explicitly so we can specify its value type.
-@xref{Type Decl, ,Nonterminal Symbols}.
+The special @code{union} value assigned to the @code{%define} variable
+@code{api.value.type} specifies that the symbols are defined with their data
+types. Bison will generate an appropriate definition of @code{YYSTYPE} to
+store these values.
+
+Since values can now have various types, it is necessary to associate a type
+with each grammar symbol whose semantic value is used. These symbols are
+@code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their declarations are
+augmented with their data type (placed between angle brackets). For
+instance, values of @code{NUM} are stored in @code{double}.
+
+The Bison construct @code{%type} is used for declaring nonterminal symbols,
+just as @code{%token} is used for declaring token types. Previously we did
+not use @code{%type} before because nonterminal symbols are normally
+declared implicitly by the rules that define them. But @code{exp} must be
+declared explicitly so we can specify its value type. @xref{Type Decl,
+,Nonterminal Symbols}.
@node Mfcalc Rules
@subsection Grammar Rules for @code{mfcalc}
if (c == '.' || isdigit (c))
@{
ungetc (c, stdin);
- scanf ("%lf", &yylval.val);
+ scanf ("%lf", &yylval.NUM);
return NUM;
@}
@end group
+@end example
+
+@noindent
+Bison generated a definition of @code{YYSTYPE} with a member named
+@code{NUM} to store value of @code{NUM} symbols.
+@comment file: mfcalc.y: 3
+@example
@group
/* Char starts an identifier => read the name. */
if (isalpha (c))
s = getsym (symbuf);
if (s == 0)
s = putsym (symbuf, VAR);
- yylval.tptr = s;
+ *((symrec**) &yylval) = s;
return s->type;
@}
@menu
* Value Type:: Specifying one data type for all semantic values.
* Multiple Types:: Specifying several alternative data types.
+* Type Generation:: Generating the semantic value type.
+* Union Decl:: Declaring the set of all semantic value types.
+* Structured Value Type:: Providing a structured semantic value type.
* Actions:: An action is the semantic definition of a grammar rule.
* Action Types:: Specifying data types for actions to operate on.
* Mid-Rule Actions:: Most actions go at the end of a rule.
Bison normally uses the type @code{int} for semantic values if your
program uses the same data type for all language constructs. To
-specify some other type, define @code{YYSTYPE} as a macro, like this:
+specify some other type, define the @code{%define} variable
+@code{api.value.type} like this:
+
+@example
+%define api.value.type @{double@}
+@end example
+
+@noindent
+or
+
+@example
+%define api.value.type @{struct semantic_type@}
+@end example
+
+The value of @code{api.value.type} should be a type name that does not
+contain parentheses or square brackets.
+
+Alternatively, instead of relying of Bison's @code{%define} support, you may
+rely on the C/C++ preprocessor and define @code{YYSTYPE} as a macro, like
+this:
@example
#define YYSTYPE double
@end example
@noindent
-@code{YYSTYPE}'s replacement list should be a type name
-that does not contain parentheses or square brackets.
This macro definition must go in the prologue of the grammar file
-(@pxref{Grammar Outline, ,Outline of a Bison Grammar}).
+(@pxref{Grammar Outline, ,Outline of a Bison Grammar}). If compatibility
+with POSIX Yacc matters to you, use this. Note however that Bison cannot
+know @code{YYSTYPE}'s value, not even whether it is defined, so there are
+services it cannot provide. Besides this works only for languages that have
+a preprocessor.
@node Multiple Types
@subsection More Than One Value Type
@itemize @bullet
@item
-Specify the entire collection of possible data types, either by using the
-@code{%union} Bison declaration (@pxref{Union Decl, ,The Collection of
-Value Types}), or by using a @code{typedef} or a @code{#define} to
-define @code{YYSTYPE} to be a union type whose member names are
-the type tags.
+Specify the entire collection of possible data types. There are several
+options:
+@itemize @bullet
+@item
+let Bison compute the union type from the tags you assign to symbols;
+
+@item
+use the @code{%union} Bison declaration (@pxref{Union Decl, ,The Union
+Declaration});
+
+@item
+define the @code{%define} variable @code{api.value.type} to be a union type
+whose members are the type tags (@pxref{Structured Value Type,, Providing a
+Structured Semantic Value Type});
+
+@item
+use a @code{typedef} or a @code{#define} to define @code{YYSTYPE} to be a
+union type whose member names are the type tags.
+@end itemize
@item
Choose one of those types for each symbol (terminal or nonterminal) for
Decl, ,Nonterminal Symbols}).
@end itemize
+@node Type Generation
+@subsection Generating the Semantic Value Type
+@cindex declaring value types
+@cindex value types, declaring
+@findex %define api.value.type union
+
+The special value @code{union} of the @code{%define} variable
+@code{api.value.type} instructs Bison that the tags used with the
+@code{%token} and @code{%type} directives are genuine types, not names of
+members of @code{YYSTYPE}.
+
+For example:
+
+@example
+%define api.value.type union
+%token <int> INT "integer"
+%token <int> 'n'
+%type <int> expr
+%token <char const *> ID "identifier"
+@end example
+
+@noindent
+generates an appropriate value of @code{YYSTYPE} to support each symbol
+type. The name of the member of @code{YYSTYPE} for tokens than have a
+declared identifier @var{id} (such as @code{INT} and @code{ID} above, but
+not @code{'n'}) is @code{@var{id}}. The other symbols have unspecified
+names on which you should not depend; instead, relying on C casts to access
+the semantic value with the appropriate type:
+
+@example
+/* For an "integer". */
+yylval.INT = 42;
+return INT;
+
+/* For an 'n', also declared as int. */
+*((int*)&yylval) = 42;
+return 'n';
+
+/* For an "identifier". */
+yylval.ID = "42";
+return ID;
+@end example
+
+If the @code{%define} variable @code{api.token.prefix} is defined
+(@pxref{%define Summary,,api.token.prefix}), then it is also used to prefix
+the union member names. For instance, with @samp{%define api.token.prefix
+TOK_}:
+
+@example
+/* For an "integer". */
+yylval.TOK_INT = 42;
+return TOK_INT;
+@end example
+
+This Bison extension cannot work if @code{%yacc} (or
+@option{-y}/@option{--yacc}) is enabled, as POSIX mandates that Yacc
+generate tokens as macros (e.g., @samp{#define INT 258}, or @samp{#define
+TOK_INT 258}).
+
+This feature is new, and user feedback would be most welcome.
+
+A similar feature is provided for C++ that in addition overcomes C++
+limitations (that forbid non-trivial objects to be part of a @code{union}):
+@samp{%define api.value.type variant}, see @ref{C++ Variants}.
+
+@node Union Decl
+@subsection The Union Declaration
+@cindex declaring value types
+@cindex value types, declaring
+@findex %union
+
+The @code{%union} declaration specifies the entire collection of possible
+data types for semantic values. The keyword @code{%union} is followed by
+braced code containing the same thing that goes inside a @code{union} in C@.
+
+For example:
+
+@example
+@group
+%union @{
+ double val;
+ symrec *tptr;
+@}
+@end group
+@end example
+
+@noindent
+This says that the two alternative types are @code{double} and @code{symrec
+*}. They are given names @code{val} and @code{tptr}; these names are used
+in the @code{%token} and @code{%type} declarations to pick one of the types
+for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
+
+As an extension to POSIX, a tag is allowed after the @code{%union}. For
+example:
+
+@example
+@group
+%union value @{
+ double val;
+ symrec *tptr;
+@}
+@end group
+@end example
+
+@noindent
+specifies the union tag @code{value}, so the corresponding C type is
+@code{union value}. If you do not specify a tag, it defaults to
+@code{YYSTYPE}.
+
+As another extension to POSIX, you may specify multiple @code{%union}
+declarations; their contents are concatenated. However, only the first
+@code{%union} declaration can specify a tag.
+
+Note that, unlike making a @code{union} declaration in C, you need not write
+a semicolon after the closing brace.
+
+@node Structured Value Type
+@subsection Providing a Structured Semantic Value Type
+@cindex declaring value types
+@cindex value types, declaring
+@findex %union
+
+Instead of @code{%union}, you can define and use your own union type
+@code{YYSTYPE} if your grammar contains at least one @samp{<@var{type}>}
+tag. For example, you can put the following into a header file
+@file{parser.h}:
+
+@example
+@group
+union YYSTYPE @{
+ double val;
+ symrec *tptr;
+@};
+@end group
+@end example
+
+@noindent
+and then your grammar can use the following instead of @code{%union}:
+
+@example
+@group
+%@{
+#include "parser.h"
+%@}
+%define api.value.type "union YYSTYPE"
+%type <val> expr
+%token <tptr> ID
+@end group
+@end example
+
+Actually, you may also provide a @code{struct} rather that a @code{union},
+which may be handy if you want to track information for every symbol (such
+as preceding comments).
+
+The type you provide may even be structured and include pointers, in which
+case the type tags you provide may be composite, with @samp{.} and @samp{->}
+operators.
+
@node Actions
@subsection Actions
@cindex action
else
@{
$$ = 1;
- fprintf (stderr,
- "Division by zero, l%d,c%d-l%d,c%d",
+ fprintf (stderr, "%d.%d-%d.%d: division by zero",
@@3.first_line, @@3.first_column,
@@3.last_line, @@3.last_column);
@}
else
@{
$$ = 1;
- fprintf (stderr,
- "Division by zero, l%d,c%d-l%d,c%d",
+ fprintf (stderr, "%d.%d-%d.%d: division by zero",
@@3.first_line, @@3.first_column,
@@3.last_line, @@3.last_column);
@}
* Require Decl:: Requiring a Bison version.
* Token Decl:: Declaring terminal symbols.
* Precedence Decl:: Declaring terminals with precedence and associativity.
-* Union Decl:: Declaring the set of all semantic value types.
* Type Decl:: Declaring the choice of type for a nonterminal symbol.
* Initial Action Decl:: Code run before parsing starts.
* Destructor Decl:: Declaring how symbols are freed.
%left OR 134 "<=" 135 // Declares 134 for OR and 135 for "<=".
@end example
-@node Union Decl
-@subsection The Collection of Value Types
-@cindex declaring value types
-@cindex value types, declaring
-@findex %union
-
-The @code{%union} declaration specifies the entire collection of
-possible data types for semantic values. The keyword @code{%union} is
-followed by braced code containing the same thing that goes inside a
-@code{union} in C@.
-
-For example:
-
-@example
-@group
-%union @{
- double val;
- symrec *tptr;
-@}
-@end group
-@end example
-
-@noindent
-This says that the two alternative types are @code{double} and @code{symrec
-*}. They are given names @code{val} and @code{tptr}; these names are used
-in the @code{%token} and @code{%type} declarations to pick one of the types
-for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
-
-As an extension to POSIX, a tag is allowed after the
-@code{union}. For example:
-
-@example
-@group
-%union value @{
- double val;
- symrec *tptr;
-@}
-@end group
-@end example
-
-@noindent
-specifies the union tag @code{value}, so the corresponding C type is
-@code{union value}. If you do not specify a tag, it defaults to
-@code{YYSTYPE}.
-
-As another extension to POSIX, you may specify multiple
-@code{%union} declarations; their contents are concatenated. However,
-only the first @code{%union} declaration can specify a tag.
-
-Note that, unlike making a @code{union} declaration in C, you need not write
-a semicolon after the closing brace.
-
-Instead of @code{%union}, you can define and use your own union type
-@code{YYSTYPE} if your grammar contains at least one
-@samp{<@var{type}>} tag. For example, you can put the following into
-a header file @file{parser.h}:
-
-@example
-@group
-union YYSTYPE @{
- double val;
- symrec *tptr;
-@};
-typedef union YYSTYPE YYSTYPE;
-@end group
-@end example
-
-@noindent
-and then your grammar can use the following
-instead of @code{%union}:
-
-@example
-@group
-%@{
-#include "parser.h"
-%@}
-%type <val> expr
-%token <tptr> ID
-@end group
-@end example
-
@node Type Decl
@subsection Nonterminal Symbols
@cindex declaring value types, nonterminals
@noindent
Here @var{nonterminal} is the name of a nonterminal symbol, and
@var{type} is the name given in the @code{%union} to the alternative
-that you want (@pxref{Union Decl, ,The Collection of Value Types}). You
+that you want (@pxref{Union Decl, ,The Union Declaration}). You
can give any number of nonterminal symbols in the same @code{%type}
declaration, if they have the same value type. Use spaces to separate
the symbol names.
@example
%union @{ char *string; @}
-%token <string> STRING1
-%token <string> STRING2
-%type <string> string1
-%type <string> string2
+%token <string> STRING1 STRING2
+%type <string> string1 string2
%union @{ char character; @}
%token <character> CHR
%type <character> chr
@example
%union @{ char *string; @}
-%token <string> STRING1
-%token <string> STRING2
-%type <string> string1
-%type <string> string2
+%token <string> STRING1 STRING2
+%type <string> string1 string2
%union @{ char character; @}
%token <character> CHR
%type <character> chr
@deffn {Directive} %union
Declare the collection of data types that semantic values may have
-(@pxref{Union Decl, ,The Collection of Value Types}).
+(@pxref{Union Decl, ,The Union Declaration}).
@end deffn
@deffn {Directive} %token
scanner must use these prefixed token names, while the grammar itself
may still use the short names (as in the sample rule given above). The
generated informational files (@file{*.output}, @file{*.xml},
-@file{*.dot}) are not modified by this prefix. See @ref{Calc++ Parser}
-and @ref{Calc++ Scanner}, for a complete example.
+@file{*.dot}) are not modified by this prefix.
+
+Bison also prefixes the generated member names of the semantic value union.
+@xref{Type Generation,, Generating the Semantic Value Type}, for more
+details.
+
+See @ref{Calc++ Parser} and @ref{Calc++ Scanner}, for a complete example.
@item Accepted Values:
Any string. Should be a valid identifier prefix in the target language,
@item @code{%union} (C, C++)
The type is defined thanks to the @code{%union} directive. You don't have
to define @code{api.value.type} in that case, using @code{%union} suffices.
-@xref{Union Decl, ,The Collection of Value Types}.
+@xref{Union Decl, ,The Union Declaration}.
For instance:
@example
%define api.value.type "%union"
@item Language(s): C, C++
@item Purpose: This is the best place to write dependency code required for
-@code{YYSTYPE} and @code{YYLTYPE}.
-In other words, it's the best place to define types referenced in @code{%union}
-directives, and it's the best place to override Bison's default @code{YYSTYPE}
-and @code{YYLTYPE} definitions.
+@code{YYSTYPE} and @code{YYLTYPE}. In other words, it's the best place to
+define types referenced in @code{%union} directives. If you use
+@code{#define} to override Bison's default @code{YYSTYPE} and @code{YYLTYPE}
+definitions, then it is also the best place. However you should rather
+@code{%define} @code{api.value.type} and @code{api.location.type}.
@item Location(s): The parser header file and the parser implementation file
before the Bison-generated @code{YYSTYPE} and @code{YYLTYPE}
When you are using multiple data types, @code{yylval}'s type is a union
made from the @code{%union} declaration (@pxref{Union Decl, ,The
-Collection of Value Types}). So when you store a token's value, you
+Union Declaration}). So when you store a token's value, you
must use the proper member of the union. If the @code{%union}
declaration looks like this:
/* Formatting semantic values. */
%printer @{ fprintf (yyoutput, "%s", $$->name); @} VAR;
%printer @{ fprintf (yyoutput, "%s()", $$->name); @} FNCT;
-%printer @{ fprintf (yyoutput, "%g", $$); @} <val>;
+%printer @{ fprintf (yyoutput, "%g", $$); @} <double>;
@end example
The @code{%define} directive instructs Bison to generate run-time trace
The set of @code{%printer} directives demonstrates how to format the
semantic value in the traces. Note that the specification can be done
either on the symbol type (e.g., @code{VAR} or @code{FNCT}), or on the type
-tag: since @code{<val>} is the type for both @code{NUM} and @code{exp}, this
-printer will be used for them.
+tag: since @code{<double>} is the type for both @code{NUM} and @code{exp},
+this printer will be used for them.
Here is a sample of the information provided by run-time traces. The traces
are sent onto standard error.
@noindent
The previous reduction demonstrates the @code{%printer} directive for
-@code{<val>}: both the token @code{NUM} and the resulting nonterminal
+@code{<double>}: both the token @code{NUM} and the resulting nonterminal
@code{exp} have @samp{1} as value.
@example
@subsubsection C++ Unions
The @code{%union} directive works as for C, see @ref{Union Decl, ,The
-Collection of Value Types}. In particular it produces a genuine
+Union Declaration}. In particular it produces a genuine
@code{union}, which have a few specific features in C++.
@itemize @minus
@item
@deffn {Directive} %union
Bison declaration to specify several possible data types for semantic
-values. @xref{Union Decl, ,The Collection of Value Types}.
+values. @xref{Union Decl, ,The Union Declaration}.
@end deffn
@deffn {Macro} YYABORT
@end deffn
@deffn {Type} YYSTYPE
+Deprecated in favor of the @code{%define} variable @code{api.value.type}.
Data type of semantic values; @code{int} by default.
@xref{Value Type, ,Data Types of Semantic Values}.
@end deffn