* Value Type:: Specifying one data type for all semantic values.
* Multiple Types:: Specifying several alternative data types.
+* Type Generation:: Generating the semantic value type.
+* Union Decl:: Declaring the set of all semantic value types.
+* Structured Value Type:: Providing a structured semantic value type.
* Actions:: An action is the semantic definition of a grammar rule.
* Action Types:: Specifying data types for actions to operate on.
* Mid-Rule Actions:: Most actions go at the end of a rule.
* Require Decl:: Requiring a Bison version.
* Token Decl:: Declaring terminal symbols.
* Precedence Decl:: Declaring terminals with precedence and associativity.
-* Union Decl:: Declaring the set of all semantic value types.
* Type Decl:: Declaring the choice of type for a nonterminal symbol.
* Initial Action Decl:: Code run before parsing starts.
* Destructor Decl:: Declaring how symbols are freed.
%@}
@end group
-%define api.value.type double
+%define api.value.type @{double@}
%token NUM
%% /* Grammar rules and actions follow. */
groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The Bison
parser will use whatever type @code{api.value.type} is defined as; if you
don't define it, @code{int} is the default. Because we specify
-@code{double}, each token and each expression has an associated value, which
-is a floating point number. C code can use @code{YYSTYPE} to refer to the
-value @code{api.value.type}.
+@samp{@{double@}}, each token and each expression has an associated value,
+which is a floating point number. C code can use @code{YYSTYPE} to refer to
+the value @code{api.value.type}.
Each terminal symbol that is not a single-character literal must be
declared. (Single-character literals normally don't need to be declared.)
global variable @code{yylval}, which is where the Bison parser will look
for it. (The C data type of @code{yylval} is @code{YYSTYPE}, whose value
was defined at the beginning of the grammar via @samp{%define api.value.type
-double}; @pxref{Rpcalc Declarations,,Declarations for @code{rpcalc}}.)
+@{double@}}; @pxref{Rpcalc Declarations,,Declarations for @code{rpcalc}}.)
A token type code of zero is returned if the end-of-input is encountered.
(Bison recognizes any nonpositive value as indicating end-of-input.)
@group
/* Bison declarations. */
-%define api.value.type double
+%define api.value.type @{double@}
%token NUM
%left '-' '+'
%left '*' '/'
%@}
@end group
-@group
-%union @{
- double val; /* For returning numbers. */
- symrec *tptr; /* For returning symbol-table pointers. */
-@}
-@end group
-%token <val> NUM /* Simple double precision number. */
-%token <tptr> VAR FNCT /* Variable and function. */
-%type <val> exp
+%define api.value.type union /* Generate YYSTYPE from these types: */
+%token <double> NUM /* Simple double precision number. */
+%token <symrec*> VAR FNCT /* Symbol table pointer: variable and function. */
+%type <double> exp
@group
%precedence '='
These features allow semantic values to have various data types
(@pxref{Multiple Types, ,More Than One Value Type}).
-The @code{%union} declaration specifies the entire list of possible types;
-this is instead of defining @code{api.value.type}. The allowable types are now
-double-floats (for @code{exp} and @code{NUM}) and pointers to entries in
-the symbol table. @xref{Union Decl, ,The Collection of Value Types}.
-
-Since values can now have various types, it is necessary to associate a
-type with each grammar symbol whose semantic value is used. These symbols
-are @code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their
-declarations are augmented with information about their data type (placed
-between angle brackets).
-
-The Bison construct @code{%type} is used for declaring nonterminal
-symbols, just as @code{%token} is used for declaring token types. We
-have not used @code{%type} before because nonterminal symbols are
-normally declared implicitly by the rules that define them. But
-@code{exp} must be declared explicitly so we can specify its value type.
-@xref{Type Decl, ,Nonterminal Symbols}.
+The special @code{union} value assigned to the @code{%define} variable
+@code{api.value.type} specifies that the symbols are defined with their data
+types. Bison will generate an appropriate definition of @code{YYSTYPE} to
+store these values.
+
+Since values can now have various types, it is necessary to associate a type
+with each grammar symbol whose semantic value is used. These symbols are
+@code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their declarations are
+augmented with their data type (placed between angle brackets). For
+instance, values of @code{NUM} are stored in @code{double}.
+
+The Bison construct @code{%type} is used for declaring nonterminal symbols,
+just as @code{%token} is used for declaring token types. Previously we did
+not use @code{%type} before because nonterminal symbols are normally
+declared implicitly by the rules that define them. But @code{exp} must be
+declared explicitly so we can specify its value type. @xref{Type Decl,
+,Nonterminal Symbols}.
@node Mfcalc Rules
@subsection Grammar Rules for @code{mfcalc}
if (c == '.' || isdigit (c))
@{
ungetc (c, stdin);
- scanf ("%lf", &yylval.val);
+ scanf ("%lf", &yylval.NUM);
return NUM;
@}
@end group
+@end example
+@noindent
+Bison generated a definition of @code{YYSTYPE} with a member named
+@code{NUM} to store value of @code{NUM} symbols.
+
+@comment file: mfcalc.y: 3
+@example
@group
/* Char starts an identifier => read the name. */
if (isalpha (c))
s = getsym (symbuf);
if (s == 0)
s = putsym (symbuf, VAR);
- yylval.tptr = s;
+ *((symrec**) &yylval) = s;
return s->type;
@}
@menu
* Value Type:: Specifying one data type for all semantic values.
* Multiple Types:: Specifying several alternative data types.
+* Type Generation:: Generating the semantic value type.
+* Union Decl:: Declaring the set of all semantic value types.
+* Structured Value Type:: Providing a structured semantic value type.
* Actions:: An action is the semantic definition of a grammar rule.
* Action Types:: Specifying data types for actions to operate on.
* Mid-Rule Actions:: Most actions go at the end of a rule.
@code{api.value.type} like this:
@example
-%define api.value.type double
+%define api.value.type @{double@}
@end example
@noindent
or
@example
-%define api.value.type "struct semantic_type"
+%define api.value.type @{struct semantic_type@}
@end example
The value of @code{api.value.type} should be a type name that does not
@itemize @bullet
@item
-Specify the entire collection of possible data types, either by using the
-@code{%union} Bison declaration (@pxref{Union Decl, ,The Collection of
-Value Types}), or by using a @code{typedef} or a @code{#define} to
-define @code{YYSTYPE} to be a union type whose member names are
-the type tags.
+Specify the entire collection of possible data types. There are several
+options:
+@itemize @bullet
+@item
+let Bison compute the union type from the tags you assign to symbols;
+
+@item
+use the @code{%union} Bison declaration (@pxref{Union Decl, ,The Union
+Declaration});
+
+@item
+define the @code{%define} variable @code{api.value.type} to be a union type
+whose members are the type tags (@pxref{Structured Value Type,, Providing a
+Structured Semantic Value Type});
+
+@item
+use a @code{typedef} or a @code{#define} to define @code{YYSTYPE} to be a
+union type whose member names are the type tags.
+@end itemize
@item
Choose one of those types for each symbol (terminal or nonterminal) for
Decl, ,Nonterminal Symbols}).
@end itemize
+@node Type Generation
+@subsection Generating the Semantic Value Type
+@cindex declaring value types
+@cindex value types, declaring
+@findex %define api.value.type union
+
+The special value @code{union} of the @code{%define} variable
+@code{api.value.type} instructs Bison that the tags used with the
+@code{%token} and @code{%type} directives are genuine types, not names of
+members of @code{YYSTYPE}.
+
+For example:
+
+@example
+%define api.value.type union
+%token <int> INT "integer"
+%token <int> 'n'
+%type <int> expr
+%token <char const *> ID "identifier"
+@end example
+
+@noindent
+generates an appropriate value of @code{YYSTYPE} to support each symbol
+type. The name of the member of @code{YYSTYPE} for tokens than have a
+declared identifier @var{id} (such as @code{INT} and @code{ID} above, but
+not @code{'n'}) is @code{@var{id}}. The other symbols have unspecified
+names on which you should not depend; instead, relying on C casts to access
+the semantic value with the appropriate type:
+
+@example
+/* For an "integer". */
+yylval.INT = 42;
+return INT;
+
+/* For an 'n', also declared as int. */
+*((int*)&yylval) = 42;
+return 'n';
+
+/* For an "identifier". */
+yylval.ID = "42";
+return ID;
+@end example
+
+If the @code{%define} variable @code{api.token.prefix} is defined
+(@pxref{%define Summary,,api.token.prefix}), then it is also used to prefix
+the union member names. For instance, with @samp{%define api.token.prefix
+@{TOK_@}}:
+
+@example
+/* For an "integer". */
+yylval.TOK_INT = 42;
+return TOK_INT;
+@end example
+
+This Bison extension cannot work if @code{%yacc} (or
+@option{-y}/@option{--yacc}) is enabled, as POSIX mandates that Yacc
+generate tokens as macros (e.g., @samp{#define INT 258}, or @samp{#define
+TOK_INT 258}).
+
+This feature is new, and user feedback would be most welcome.
+
+A similar feature is provided for C++ that in addition overcomes C++
+limitations (that forbid non-trivial objects to be part of a @code{union}):
+@samp{%define api.value.type variant}, see @ref{C++ Variants}.
+
+@node Union Decl
+@subsection The Union Declaration
+@cindex declaring value types
+@cindex value types, declaring
+@findex %union
+
+The @code{%union} declaration specifies the entire collection of possible
+data types for semantic values. The keyword @code{%union} is followed by
+braced code containing the same thing that goes inside a @code{union} in C@.
+
+For example:
+
+@example
+@group
+%union @{
+ double val;
+ symrec *tptr;
+@}
+@end group
+@end example
+
+@noindent
+This says that the two alternative types are @code{double} and @code{symrec
+*}. They are given names @code{val} and @code{tptr}; these names are used
+in the @code{%token} and @code{%type} declarations to pick one of the types
+for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
+
+As an extension to POSIX, a tag is allowed after the @code{%union}. For
+example:
+
+@example
+@group
+%union value @{
+ double val;
+ symrec *tptr;
+@}
+@end group
+@end example
+
+@noindent
+specifies the union tag @code{value}, so the corresponding C type is
+@code{union value}. If you do not specify a tag, it defaults to
+@code{YYSTYPE}.
+
+As another extension to POSIX, you may specify multiple @code{%union}
+declarations; their contents are concatenated. However, only the first
+@code{%union} declaration can specify a tag.
+
+Note that, unlike making a @code{union} declaration in C, you need not write
+a semicolon after the closing brace.
+
+@node Structured Value Type
+@subsection Providing a Structured Semantic Value Type
+@cindex declaring value types
+@cindex value types, declaring
+@findex %union
+
+Instead of @code{%union}, you can define and use your own union type
+@code{YYSTYPE} if your grammar contains at least one @samp{<@var{type}>}
+tag. For example, you can put the following into a header file
+@file{parser.h}:
+
+@example
+@group
+union YYSTYPE @{
+ double val;
+ symrec *tptr;
+@};
+@end group
+@end example
+
+@noindent
+and then your grammar can use the following instead of @code{%union}:
+
+@example
+@group
+%@{
+#include "parser.h"
+%@}
+%define api.value.type "union YYSTYPE"
+%type <val> expr
+%token <tptr> ID
+@end group
+@end example
+
+Actually, you may also provide a @code{struct} rather that a @code{union},
+which may be handy if you want to track information for every symbol (such
+as preceding comments).
+
+The type you provide may even be structured and include pointers, in which
+case the type tags you provide may be composite, with @samp{.} and @samp{->}
+operators.
+
@node Actions
@subsection Actions
@cindex action
else
@{
$$ = 1;
- fprintf (stderr,
- "Division by zero, l%d,c%d-l%d,c%d",
+ fprintf (stderr, "%d.%d-%d.%d: division by zero",
@@3.first_line, @@3.first_column,
@@3.last_line, @@3.last_column);
@}
else
@{
$$ = 1;
- fprintf (stderr,
- "Division by zero, l%d,c%d-l%d,c%d",
+ fprintf (stderr, "%d.%d-%d.%d: division by zero",
@@3.first_line, @@3.first_column,
@@3.last_line, @@3.last_column);
@}
* Require Decl:: Requiring a Bison version.
* Token Decl:: Declaring terminal symbols.
* Precedence Decl:: Declaring terminals with precedence and associativity.
-* Union Decl:: Declaring the set of all semantic value types.
* Type Decl:: Declaring the choice of type for a nonterminal symbol.
* Initial Action Decl:: Code run before parsing starts.
* Destructor Decl:: Declaring how symbols are freed.
%left OR 134 "<=" 135 // Declares 134 for OR and 135 for "<=".
@end example
-@node Union Decl
-@subsection The Collection of Value Types
-@cindex declaring value types
-@cindex value types, declaring
-@findex %union
-
-The @code{%union} declaration specifies the entire collection of
-possible data types for semantic values. The keyword @code{%union} is
-followed by braced code containing the same thing that goes inside a
-@code{union} in C@.
-
-For example:
-
-@example
-@group
-%union @{
- double val;
- symrec *tptr;
-@}
-@end group
-@end example
-
-@noindent
-This says that the two alternative types are @code{double} and @code{symrec
-*}. They are given names @code{val} and @code{tptr}; these names are used
-in the @code{%token} and @code{%type} declarations to pick one of the types
-for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}).
-
-As an extension to POSIX, a tag is allowed after the
-@code{union}. For example:
-
-@example
-@group
-%union value @{
- double val;
- symrec *tptr;
-@}
-@end group
-@end example
-
-@noindent
-specifies the union tag @code{value}, so the corresponding C type is
-@code{union value}. If you do not specify a tag, it defaults to
-@code{YYSTYPE}.
-
-As another extension to POSIX, you may specify multiple
-@code{%union} declarations; their contents are concatenated. However,
-only the first @code{%union} declaration can specify a tag.
-
-Note that, unlike making a @code{union} declaration in C, you need not write
-a semicolon after the closing brace.
-
-Instead of @code{%union}, you can define and use your own union type
-@code{YYSTYPE} if your grammar contains at least one
-@samp{<@var{type}>} tag. For example, you can put the following into
-a header file @file{parser.h}:
-
-@example
-@group
-union YYSTYPE @{
- double val;
- symrec *tptr;
-@};
-@end group
-@end example
-
-@noindent
-and then your grammar can use the following instead of @code{%union}:
-
-@example
-@group
-%@{
-#include "parser.h"
-%@}
-%define api.value.type "union YYSTYPE"
-%type <val> expr
-%token <tptr> ID
-@end group
-@end example
-
@node Type Decl
@subsection Nonterminal Symbols
@cindex declaring value types, nonterminals
@noindent
Here @var{nonterminal} is the name of a nonterminal symbol, and
@var{type} is the name given in the @code{%union} to the alternative
-that you want (@pxref{Union Decl, ,The Collection of Value Types}). You
+that you want (@pxref{Union Decl, ,The Union Declaration}). You
can give any number of nonterminal symbols in the same @code{%type}
declaration, if they have the same value type. Use spaces to separate
the symbol names.
@example
%union @{ char *string; @}
-%token <string> STRING1
-%token <string> STRING2
-%type <string> string1
-%type <string> string2
+%token <string> STRING1 STRING2
+%type <string> string1 string2
%union @{ char character; @}
%token <character> CHR
%type <character> chr
@example
%union @{ char *string; @}
-%token <string> STRING1
-%token <string> STRING2
-%type <string> string1
-%type <string> string2
+%token <string> STRING1 STRING2
+%type <string> string1 string2
%union @{ char character; @}
%token <character> CHR
%type <character> chr
@deffn {Directive} %union
Declare the collection of data types that semantic values may have
-(@pxref{Union Decl, ,The Collection of Value Types}).
+(@pxref{Union Decl, ,The Union Declaration}).
@end deffn
@deffn {Directive} %token
@c ================================================== api.token.prefix
-@deffn Directive {%define api.token.prefix} @var{prefix}
+@deffn Directive {%define api.token.prefix} @{@var{prefix}@}
@itemize
@item Languages(s): all
@example
%token FILE for ERROR
-%define api.token.prefix "TOK_"
+%define api.token.prefix @{TOK_@}
%%
start: FILE for ERROR;
@end example
scanner must use these prefixed token names, while the grammar itself
may still use the short names (as in the sample rule given above). The
generated informational files (@file{*.output}, @file{*.xml},
-@file{*.dot}) are not modified by this prefix. See @ref{Calc++ Parser}
-and @ref{Calc++ Scanner}, for a complete example.
+@file{*.dot}) are not modified by this prefix.
+
+Bison also prefixes the generated member names of the semantic value union.
+@xref{Type Generation,, Generating the Semantic Value Type}, for more
+details.
+
+See @ref{Calc++ Parser} and @ref{Calc++ Scanner}, for a complete example.
@item Accepted Values:
Any string. Should be a valid identifier prefix in the target language,
@item Default Value:
empty
@item History:
-introduced in Bison 2.8
+introduced in Bison 3.0
@end itemize
@end deffn
@c api.token.prefix
@item @code{%union} (C, C++)
The type is defined thanks to the @code{%union} directive. You don't have
to define @code{api.value.type} in that case, using @code{%union} suffices.
-@xref{Union Decl, ,The Collection of Value Types}.
+@xref{Union Decl, ,The Union Declaration}.
For instance:
@example
%define api.value.type "%union"
When you are using multiple data types, @code{yylval}'s type is a union
made from the @code{%union} declaration (@pxref{Union Decl, ,The
-Collection of Value Types}). So when you store a token's value, you
+Union Declaration}). So when you store a token's value, you
must use the proper member of the union. If the @code{%union}
declaration looks like this:
/* Formatting semantic values. */
%printer @{ fprintf (yyoutput, "%s", $$->name); @} VAR;
%printer @{ fprintf (yyoutput, "%s()", $$->name); @} FNCT;
-%printer @{ fprintf (yyoutput, "%g", $$); @} <val>;
+%printer @{ fprintf (yyoutput, "%g", $$); @} <double>;
@end example
The @code{%define} directive instructs Bison to generate run-time trace
The set of @code{%printer} directives demonstrates how to format the
semantic value in the traces. Note that the specification can be done
either on the symbol type (e.g., @code{VAR} or @code{FNCT}), or on the type
-tag: since @code{<val>} is the type for both @code{NUM} and @code{exp}, this
-printer will be used for them.
+tag: since @code{<double>} is the type for both @code{NUM} and @code{exp},
+this printer will be used for them.
Here is a sample of the information provided by run-time traces. The traces
are sent onto standard error.
@noindent
The previous reduction demonstrates the @code{%printer} directive for
-@code{<val>}: both the token @code{NUM} and the resulting nonterminal
+@code{<double>}: both the token @code{NUM} and the resulting nonterminal
@code{exp} have @samp{1} as value.
@example
@subsubsection C++ Unions
The @code{%union} directive works as for C, see @ref{Union Decl, ,The
-Collection of Value Types}. In particular it produces a genuine
+Union Declaration}. In particular it produces a genuine
@code{union}, which have a few specific features in C++.
@itemize @minus
@item
The line, starting at 1.
@end deftypeivar
-@deftypemethod {position} {uint} lines (int @var{height} = 1)
-Advance by @var{height} lines, resetting the column number.
+@deftypemethod {position} {void} lines (int @var{height} = 1)
+If @var{height} is not null, advance by @var{height} lines, resetting the
+column number. The resulting line number cannot be less than 1.
@end deftypemethod
@deftypeivar {position} {uint} column
The column, starting at 1.
@end deftypeivar
-@deftypemethod {position} {uint} columns (int @var{width} = 1)
-Advance by @var{width} columns, without changing the line number.
+@deftypemethod {position} {void} columns (int @var{width} = 1)
+Advance by @var{width} columns, without changing the line number. The
+resulting column number cannot be less than 1.
@end deftypemethod
@deftypemethod {position} {position&} operator+= (int @var{width})
The first, inclusive, position of the range, and the first beyond.
@end deftypeivar
-@deftypemethod {location} {uint} columns (int @var{width} = 1)
-@deftypemethodx {location} {uint} lines (int @var{height} = 1)
-Advance the @code{end} position.
+@deftypemethod {location} {void} columns (int @var{width} = 1)
+@deftypemethodx {location} {void} lines (int @var{height} = 1)
+Forwarded to the @code{end} position.
@end deftypemethod
@deftypemethod {location} {location} operator+ (const location& @var{end})
@deftypemethodx {location} {location} operator+ (int @var{width})
@deftypemethodx {location} {location} operator+= (int @var{width})
+@deftypemethodx {location} {location} operator- (int @var{width})
+@deftypemethodx {location} {location} operator-= (int @var{width})
Various forms of syntactic sugar.
@end deftypemethod
For instance, given the following declarations:
@example
-%define api.token.prefix "TOK_"
+%define api.token.prefix @{TOK_@}
%token <std::string> IDENTIFIER;
%token <int> INTEGER;
%token COLON;
@comment file: calc++-parser.yy
@example
-%define api.token.prefix "TOK_"
+%define api.token.prefix @{TOK_@}
%token
END 0 "end of file"
ASSIGN ":="
@deffn {Directive} %union
Bison declaration to specify several possible data types for semantic
-values. @xref{Union Decl, ,The Collection of Value Types}.
+values. @xref{Union Decl, ,The Union Declaration}.
@end deffn
@deffn {Macro} YYABORT