Here is a simple C function subdivided into tokens:
+@ifinfo
@example
int /* @r{keyword `int'} */
-square (x) /* @r{identifier, open-paren,} */
- /* @r{identifier, close-paren} */
- int x; /* @r{keyword `int', identifier, semicolon} */
+square (int x) /* @r{identifier, open-paren, keyword `int',}
+                   @r{identifier, close-paren} */
@{ /* @r{open-brace} */
- return x * x; /* @r{keyword `return', identifier,} */
- /* @r{asterisk, identifier, semicolon} */
+ return x * x; /* @r{keyword `return', identifier, asterisk,
+ identifier, semicolon} */
@} /* @r{close-brace} */
@end example
+@end ifinfo
+@ifnotinfo
+@example
+int /* @r{keyword `int'} */
+square (int x) /* @r{identifier, open-paren, keyword `int', identifier, close-paren} */
+@{ /* @r{open-brace} */
+ return x * x; /* @r{keyword `return', identifier, asterisk, identifier, semicolon} */
+@} /* @r{close-brace} */
+@end example
+@end ifnotinfo
The syntactic groupings of C include the expression, the statement, the
declaration, and the function definition. These are represented in the
grammar of C by nonterminal symbols `expression', `statement',
`declaration' and `function definition'.
Aside from the token type names and the symbols in the actions you
-write, all variable and function names used in the Bison parser file
+write, all symbols defined in the Bison parser file itself
begin with @samp{yy} or @samp{YY}. This includes interface functions
such as the lexical analyzer function @code{yylex}, the error reporting
function @code{yyerror} and the parser function @code{yyparse} itself.
Therefore, you should avoid using C identifiers starting with @samp{yy}
or @samp{YY} in the Bison grammar file except for the ones defined in
this manual.
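+For instance, in a plain C parser these interface functions have
+declarations along the following lines (a sketch only; the exact
+signatures depend on the declarations you use, e.g.@: a pure parser
+changes them):
+
+@example
+int yyparse (void);            /* @r{the parser itself} */
+int yylex (void);              /* @r{supplied by you} */
+void yyerror (const char *s);  /* @r{supplied by you} */
+@end example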
+In some cases the Bison parser file includes system headers, and in
+those cases your code should respect the identifiers reserved by those
+headers. On some non-@sc{gnu} hosts, @code{<alloca.h>},
+@code{<stddef.h>}, and @code{<stdlib.h>} are included as needed to
+declare memory allocators and related types. In the same situation,
+C++ parsers may include @code{<cstddef>} and @code{<cstdlib>} instead.
+Other system headers may be included if you define @code{YYDEBUG}
+(@pxref{Debugging, ,Debugging Your Parser}).
+
@node Stages
@section Stages in Using Bison
@cindex stages in using Bison
preprocessor directives.
The @code{#define} directive defines the macro @code{YYSTYPE}, thus
-specifying the C data type for semantic values of both tokens and groupings
-(@pxref{Value Type, ,Data Types of Semantic Values}). The Bison parser will use whatever type
-@code{YYSTYPE} is defined as; if you don't define it, @code{int} is the
-default. Because we specify @code{double}, each token and each expression
-has an associated value, which is a floating point number.
+specifying the C data type for semantic values of both tokens and
+groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The
+Bison parser will use whatever type @code{YYSTYPE} is defined as; if you
+don't define it, @code{int} is the default. Because we specify
+@code{double}, each token and each expression has an associated value,
+which is a floating point number.
The @code{#include} directive is used to declare the exponentiation
function @code{pow}.
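+Concretely, the two directives are just these lines (a sketch of how
+they appear near the top of the grammar file):
+
+@example
+#define YYSTYPE double   /* @r{semantic values are @code{double}} */
+#include <math.h>        /* @r{declares @code{pow}} */
+@end example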
macro whose definition is the appropriate number. In this example,
therefore, @code{NUM} becomes a macro for @code{yylex} to use.
-The semantic value of the token (if it has one) is stored into the global
-variable @code{yylval}, which is where the Bison parser will look for it.
-(The C data type of @code{yylval} is @code{YYSTYPE}, which was defined
-at the beginning of the grammar; @pxref{Rpcalc Decls, ,Declarations for @code{rpcalc}}.)
+The semantic value of the token (if it has one) is stored into the
+global variable @code{yylval}, which is where the Bison parser will look
+for it. (The C data type of @code{yylval} is @code{YYSTYPE}, which was
+defined at the beginning of the grammar; @pxref{Rpcalc Decls,
+,Declarations for @code{rpcalc}}.)
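+For instance, the part of @code{yylex} that handles numbers can look
+like this (a simplified fragment of the scanner used in this chapter;
+@code{c} holds the character just read):
+
+@example
+@group
+if (c == '.' || isdigit (c))
+  @{
+    ungetc (c, stdin);
+    scanf ("%lf", &yylval);   /* @r{store the semantic value} */
+    return NUM;               /* @r{report the token type} */
+  @}
+if (c == EOF)
+  return 0;                   /* @r{end-of-input} */
+@end group
+@end example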
A token type code of zero is returned if the end-of-file is encountered.
(Bison recognizes any nonpositive value as indicating the end of the
@example
@group
# @r{List files in current directory.}
-% ls
+$ @kbd{ls}
rpcalc.tab.c rpcalc.y
@end group
@group
# @r{Compile the Bison parser.}
# @r{@samp{-lm} tells compiler to search math library for @code{pow}.}
-% cc rpcalc.tab.c -lm -o rpcalc
+$ @kbd{cc rpcalc.tab.c -lm -o rpcalc}
@end group
@group
# @r{List files again.}
-% ls
+$ @kbd{ls}
rpcalc rpcalc.tab.c rpcalc.y
@end group
@end example
example session using @code{rpcalc}.
@example
-% rpcalc
-4 9 +
+$ @kbd{rpcalc}
+@kbd{4 9 +}
13
-3 7 + 3 4 5 *+-
+@kbd{3 7 + 3 4 5 *+-}
-13
-3 7 + 3 4 5 * + - n @r{Note the unary minus, @samp{n}}
+@kbd{3 7 + 3 4 5 * + - n} @r{Note the unary minus, @samp{n}}
13
-5 6 / 4 n +
+@kbd{5 6 / 4 n +}
-3.166666667
-3 4 ^ @r{Exponentiation}
+@kbd{3 4 ^} @r{Exponentiation}
81
-^D @r{End-of-file indicator}
-%
+@kbd{^D} @r{End-of-file indicator}
+$
@end example
@node Infix Calc
@need 500
@example
-% calc
-4 + 4.5 - (34/(8*3+-3))
+$ @kbd{calc}
+@kbd{4 + 4.5 - (34/(8*3+-3))}
6.880952381
--56 + 2
+@kbd{-56 + 2}
-54
-3 ^ 2
+@kbd{3 ^ 2}
9
@end example
@cindex @code{ltcalc}
@cindex calculator, location tracking
-This example extends the infix notation calculator with location tracking.
-This feature will be used to improve error reporting, and provide better
-error messages.
-
-For the sake of clarity, we will switch for this example to an integer
-calculator, since most of the work needed to use locations will be done
-in the lexical analyser.
+This example extends the infix notation calculator with location
+tracking. This feature will be used to improve the error messages. For
+the sake of clarity, this example is a simple integer calculator, since
+most of the work needed to use locations will be done in the lexical
+analyzer.
@menu
* Decls: Ltcalc Decls. Bison and C declarations for ltcalc.
@node Ltcalc Decls
@subsection Declarations for @code{ltcalc}
-The C and Bison declarations for the location tracking calculator are the same
-as the declarations for the infix notation calculator.
+The C and Bison declarations for the location tracking calculator are
+the same as the declarations for the infix notation calculator.
@example
/* Location tracking calculator. */
%% /* Grammar follows */
@end example
-In the code above, there are no declarations specific to locations. Defining
-a data type for storing locations is not needed: we will use the type provided
-by default (@pxref{Location Type, ,Data Types of Locations}), which is a four
-member structure with the following integer fields: @code{first_line},
-@code{first_column}, @code{last_line} and @code{last_column}.
+@noindent
+Note there are no declarations specific to locations. Defining a data
+type for storing locations is not needed: we will use the type provided
+by default (@pxref{Location Type, ,Data Types of Locations}), which is a
+four member structure with the following integer fields:
+@code{first_line}, @code{first_column}, @code{last_line} and
+@code{last_column}.
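+When you do not define @code{YYLTYPE} yourself, the type Bison provides
+is equivalent to a structure like this:
+
+@example
+@group
+typedef struct YYLTYPE
+@{
+  int first_line;
+  int first_column;
+  int last_line;
+  int last_column;
+@} YYLTYPE;
+@end group
+@end example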
@node Ltcalc Rules
@subsection Grammar Rules for @code{ltcalc}
-Whether you choose to handle locations or not has no effect on the syntax of
-your language. Therefore, grammar rules for this example will be very close to
-those of the previous example: we will only modify them to benefit from the new
-informations we will have.
+Whether handling locations or not has no effect on the syntax of your
+language. Therefore, grammar rules for this example will be very close
+to those of the previous example: we will only modify them to benefit
+from the new information.
-Here, we will use locations to report divisions by zero, and locate the wrong
-expressions or subexpressions.
+Here, we will use locations to report divisions by zero, and locate the
+wrong expressions or subexpressions.
@example
@group
| exp '-' exp @{ $$ = $1 - $3; @}
| exp '*' exp @{ $$ = $1 * $3; @}
@end group
- | exp '/' exp
@group
+ | exp '/' exp
@{
if ($3)
$$ = $1 / $3;
else
@{
$$ = 1;
- printf("Division by zero, l%d,c%d-l%d,c%d",
- @@3.first_line, @@3.first_column,
- @@3.last_line, @@3.last_column);
+ fprintf (stderr, "%d.%d-%d.%d: division by zero",
+ @@3.first_line, @@3.first_column,
+ @@3.last_line, @@3.last_column);
@}
@}
@end group
using the pseudo-variables @code{@@@var{n}} for rule components, and the
pseudo-variable @code{@@$} for groupings.
-In this example, we never assign a value to @code{@@$}, because the
-output parser can do this automatically. By default, before executing
-the C code of each action, @code{@@$} is set to range from the beginning
-of @code{@@1} to the end of @code{@@@var{n}}, for a rule with @var{n}
-components.
-
-Of course, this behavior can be redefined (@pxref{Location Default
-Action, , Default Action for Locations}), and for very specific rules,
-@code{@@$} can be computed by hand.
+We don't need to assign a value to @code{@@$}: the output parser does it
+automatically. By default, before executing the C code of each action,
+@code{@@$} is set to range from the beginning of @code{@@1} to the end
+of @code{@@@var{n}}, for a rule with @var{n} components. This behavior
+can be redefined (@pxref{Location Default Action, , Default Action for
+Locations}), and for very specific rules, @code{@@$} can be computed by
+hand.
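+In other words, for a rule with @var{n} components the default
+computation behaves as if each action were preceded by code along these
+lines (a sketch, writing the location pseudo-variables as if they were
+ordinary variables):
+
+@example
+@group
+@@$.first_line   = @@1.first_line;
+@@$.first_column = @@1.first_column;
+@@$.last_line    = @@@var{n}.last_line;
+@@$.last_column  = @@@var{n}.last_column;
+@end group
+@end example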
@node Ltcalc Lexer
@subsection The @code{ltcalc} Lexical Analyzer
-Until now, we relied on Bison's defaults to enable location tracking. The next
-step is to rewrite the lexical analyser, and make it able to feed the parser
-with locations of tokens, as he already does for semantic values.
+Until now, we relied on Bison's defaults to enable location
+tracking. The next step is to rewrite the lexical analyzer and make it
+able to feed the parser with the token locations, as it already does for
+semantic values.
-To do so, we must take into account every single character of the input text,
-to avoid the computed locations of being fuzzy or wrong:
+To this end, we must take into account every single character of the
+input text, to keep the computed locations from being fuzzy or wrong:
@example
@group
@}
@end example
-Basically, the lexical analyzer does the same processing as before: it skips
-blanks and tabs, and reads numbers or single-character tokens. In addition
-to this, it updates the @code{yylloc} global variable (of type @code{YYLTYPE}),
-where the location of tokens is stored.
+Basically, the lexical analyzer performs the same processing as before:
+it skips blanks and tabs, and reads numbers or single-character tokens.
+In addition, it updates @code{yylloc}, the global variable (of type
+@code{YYLTYPE}) containing the token's location.
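+For instance, the beginning of such a scanner, which skips blanks while
+keeping the location up to date, can look like this (a fragment;
+@code{c} is an @code{int} holding the current character):
+
+@example
+@group
+/* @r{Skip white space, advancing the column count.} */
+while ((c = getchar ()) == ' ' || c == '\t')
+  ++yylloc.last_column;
+
+/* @r{Start a new token: it begins where the previous one ended.} */
+yylloc.first_line = yylloc.last_line;
+yylloc.first_column = yylloc.last_column;
+@end group
+@end example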
-Now, each time this function returns a token, the parser has it's number as
-well as it's semantic value, and it's position in the text. The last needed
-change is to initialize @code{yylloc}, for example in the controlling
-function:
+Now, each time this function returns a token, the parser has its number
+as well as its semantic value, and its location in the text. The last
+needed change is to initialize @code{yylloc}, for example in the
+controlling function:
@example
+@group
int
main (void)
@{
+  yylloc.first_line = yylloc.last_line = 1;
  yylloc.first_column = yylloc.last_column = 0;
return yyparse ();
@}
+@end group
@end example
-Remember that computing locations is not a matter of syntax. Every character
-must be associated to a location update, whether it is in valid input, in
-comments, in literal strings, and so on...
+Remember that computing locations is not a matter of syntax. Every
+character must be associated with a location update, whether it is in
+valid input, in comments, in literal strings, and so on.
@node Multi-function Calc
@section Multi-Function Calculator: @code{mfcalc}
Here is a sample session with the multi-function calculator:
@example
-% mfcalc
-pi = 3.141592653589
+$ @kbd{mfcalc}
+@kbd{pi = 3.141592653589}
3.1415926536
-sin(pi)
+@kbd{sin(pi)}
0.0000000000
-alpha = beta1 = 2.3
+@kbd{alpha = beta1 = 2.3}
2.3000000000
-alpha
+@kbd{alpha}
2.3000000000
-ln(alpha)
+@kbd{ln(alpha)}
0.8329091229
-exp(ln(beta1))
+@kbd{exp(ln(beta1))}
2.3000000000
-%
+$
@end example
Note that multiple assignment and nested function calls are permitted.
In a simple program it may be sufficient to use the same data type for
the semantic values of all language constructs. This was true in the
-RPN and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish Notation Calculator}).
+RPN and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish
+Notation Calculator}).
Bison's default is to use type @code{int} for all semantic values. To
specify some other type, define @code{YYSTYPE} as a macro, like this:
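+@example
+#define YYSTYPE double
+@end example
+
+@noindent
+(@code{double} is only an illustration here, borrowed from the
+calculator examples above; use whatever type your semantic values
+need.)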
@findex %expect
Bison normally warns if there are any conflicts in the grammar
-(@pxref{Shift/Reduce, ,Shift/Reduce Conflicts}), but most real grammars have harmless shift/reduce
-conflicts which are resolved in a predictable way and would be difficult to
-eliminate. It is desirable to suppress the warning about these conflicts
-unless the number of conflicts changes. You can do this with the
-@code{%expect} declaration.
+(@pxref{Shift/Reduce, ,Shift/Reduce Conflicts}), but most real grammars
+have harmless shift/reduce conflicts which are resolved in a predictable
+way and would be difficult to eliminate. It is desirable to suppress
+the warning about these conflicts unless the number of conflicts
+changes. You can do this with the @code{%expect} declaration.
The declaration looks like this:
%expect @var{n}
@end example
-Here @var{n} is a decimal integer. The declaration says there should be no
-warning if there are @var{n} shift/reduce conflicts and no reduce/reduce
-conflicts. The usual warning is given if there are either more or fewer
-conflicts, or if there are any reduce/reduce conflicts.
+Here @var{n} is a decimal integer. The declaration says there should be
+no warning if there are @var{n} shift/reduce conflicts and no
+reduce/reduce conflicts. An error, instead of the usual warning, is
+given if there are either more or fewer conflicts, or if there are any
+reduce/reduce conflicts.
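+For example, for a grammar that you have checked and found to contain
+exactly two harmless shift/reduce conflicts (the count here is only an
+illustration), you would write:
+
+@example
+%expect 2
+@end example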
In general, using @code{%expect} involves these steps:
@samp{-t} option when you run Bison (@pxref{Invocation, ,Invoking Bison}).
We always define @code{YYDEBUG} so that debugging is always possible.
-The trace facility uses @code{stderr}, so you must add
-@w{@code{#include <stdio.h>}} to the prologue unless it is already there.
+The trace facility outputs messages with macro calls of the form
+@code{YYFPRINTF (YYSTDERR, @var{format}, @var{args})} where
+@var{format} and @var{args} are the usual @code{printf} format and
+arguments. If you define @code{YYDEBUG} but do not define
+@code{YYFPRINTF}, @code{<stdio.h>} is automatically included and the
+macros are defined to @code{fprintf} and @code{stderr}. In the same
+situation, C++ parsers include @code{<cstdio>} instead, and use
+@code{std::fprintf} and @code{std::stderr}.
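+If you are not using the @samp{-t} option, a simple way to arrange for
+@code{YYDEBUG} to be defined is to do it yourself in the prologue of
+the grammar file, for instance:
+
+@example
+%@{
+#define YYDEBUG 1
+%@}
+@end example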
Once you have compiled the program with trace facilities, the way to
request a trace is to store a nonzero value in the variable @code{yydebug}.
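+For example, you can turn the traces on from @code{main} before calling
+the parser (you can equally well set @code{yydebug} from a debugger):
+
+@example
+@group
+extern int yydebug;
+
+int
+main (void)
+@{
+  yydebug = 1;          /* @r{request parser traces} */
+  return yyparse ();
+@}
+@end group
+@end example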
@samp{.y}. The parser file's name is made by replacing the @samp{.y}
with @samp{.tab.c}. Thus, the @samp{bison foo.y} filename yields
@file{foo.tab.c}, and the @samp{bison hack/foo.y} filename yields
-@file{hack/foo.tab.c}. It's is also possible, in case you are writting
+@file{hack/foo.tab.c}. It is also possible, in case you are writing
C++ code instead of C in your grammar file, to name it @file{foo.ypp}
or @file{foo.y++}. Then the output files will take an extension
matching that of the input file (respectively @file{foo.tab.cpp} and
@file{foo.tab.c++}).