X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/d8988b2fffa93194e297001d7488593c9ac47afa..927c15577418c9c18feea508854a3f49bb03aa9f:/doc/bison.texinfo?ds=sidebyside diff --git a/doc/bison.texinfo b/doc/bison.texinfo index 5b930a45..609f870b 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -447,16 +447,26 @@ lexicography, not grammar.) Here is a simple C function subdivided into tokens: +@ifinfo @example int /* @r{keyword `int'} */ -square (x) /* @r{identifier, open-paren,} */ - /* @r{identifier, close-paren} */ - int x; /* @r{keyword `int', identifier, semicolon} */ +square (int x) /* @r{identifier, open-paren, identifier,} + @r{identifier, close-paren} */ @{ /* @r{open-brace} */ - return x * x; /* @r{keyword `return', identifier,} */ - /* @r{asterisk, identifier, semicolon} */ + return x * x; /* @r{keyword `return', identifier, asterisk, + identifier, semicolon} */ @} /* @r{close-brace} */ @end example +@end ifinfo +@ifnotinfo +@example +int /* @r{keyword `int'} */ +square (int x) /* @r{identifier, open-paren, identifier, identifier, close-paren} */ +@{ /* @r{open-brace} */ + return x * x; /* @r{keyword `return', identifier, asterisk, identifier, semicolon} */ +@} /* @r{close-brace} */ +@end example +@end ifnotinfo The syntactic groupings of C include the expression, the statement, the declaration, and the function definition. These are represented in the @@ -683,7 +693,7 @@ arrange for it to call @code{yyparse} or the parser will never run. @xref{Interface, ,Parser C-Language Interface}. Aside from the token type names and the symbols in the actions you -write, all variable and function names used in the Bison parser file +write, all symbols defined in the Bison parser file itself begin with @samp{yy} or @samp{YY}. This includes interface functions such as the lexical analyzer function @code{yylex}, the error reporting function @code{yyerror} and the parser function @code{yyparse} itself. 
@@ -692,6 +702,15 @@ Therefore, you should avoid using C identifiers starting with @samp{yy} or @samp{YY} in the Bison grammar file except for the ones defined in this manual. +In some cases the Bison parser file includes system headers, and in +those cases your code should respect the identifiers reserved by those +headers. On some non-@sc{gnu} hosts, @code{}, +@code{}, and @code{} are included as needed to +declare memory allocators and related types. In the same situation, +C++ parsers may include @code{} and @code{} instead. +Other system headers may be included if you define @code{YYDEBUG} +(@pxref{Debugging, ,Debugging Your Parser}). + @node Stages @section Stages in Using Bison @cindex stages in using Bison @@ -855,11 +874,12 @@ The declarations section (@pxref{Prologue, , The prologue}) contains two preprocessor directives. The @code{#define} directive defines the macro @code{YYSTYPE}, thus -specifying the C data type for semantic values of both tokens and groupings -(@pxref{Value Type, ,Data Types of Semantic Values}). The Bison parser will use whatever type -@code{YYSTYPE} is defined as; if you don't define it, @code{int} is the -default. Because we specify @code{double}, each token and each expression -has an associated value, which is a floating point number. +specifying the C data type for semantic values of both tokens and +groupings (@pxref{Value Type, ,Data Types of Semantic Values}). The +Bison parser will use whatever type @code{YYSTYPE} is defined as; if you +don't define it, @code{int} is the default. Because we specify +@code{double}, each token and each expression has an associated value, +which is a floating point number. The @code{#include} directive is used to declare the exponentiation function @code{pow}. @@ -1066,10 +1086,11 @@ token type is an identifier, that identifier is defined by Bison as a C macro whose definition is the appropriate number. In this example, therefore, @code{NUM} becomes a macro for @code{yylex} to use. 
-The semantic value of the token (if it has one) is stored into the global -variable @code{yylval}, which is where the Bison parser will look for it. -(The C data type of @code{yylval} is @code{YYSTYPE}, which was defined -at the beginning of the grammar; @pxref{Rpcalc Decls, ,Declarations for @code{rpcalc}}.) +The semantic value of the token (if it has one) is stored into the +global variable @code{yylval}, which is where the Bison parser will look +for it. (The C data type of @code{yylval} is @code{YYSTYPE}, which was +defined at the beginning of the grammar; @pxref{Rpcalc Decls, +,Declarations for @code{rpcalc}}.) A token type code of zero is returned if the end-of-file is encountered. (Bison recognizes any nonpositive value as indicating the end of the @@ -1202,19 +1223,19 @@ Here is how to compile and run the parser file: @example @group # @r{List files in current directory.} -% ls +$ @kbd{ls} rpcalc.tab.c rpcalc.y @end group @group # @r{Compile the Bison parser.} # @r{@samp{-lm} tells compiler to search math library for @code{pow}.} -% cc rpcalc.tab.c -lm -o rpcalc +$ @kbd{cc rpcalc.tab.c -lm -o rpcalc} @end group @group # @r{List files again.} -% ls +$ @kbd{ls} rpcalc rpcalc.tab.c rpcalc.y @end group @end example @@ -1223,19 +1244,19 @@ The file @file{rpcalc} now contains the executable code. Here is an example session using @code{rpcalc}. 
@example -% rpcalc -4 9 + +$ @kbd{rpcalc} +@kbd{4 9 +} 13 -3 7 + 3 4 5 *+- +@kbd{3 7 + 3 4 5 *+-} -13 -3 7 + 3 4 5 * + - n @r{Note the unary minus, @samp{n}} +@kbd{3 7 + 3 4 5 * + - n} @r{Note the unary minus, @samp{n}} 13 -5 6 / 4 n + +@kbd{5 6 / 4 n +} -3.166666667 -3 4 ^ @r{Exponentiation} +@kbd{3 4 ^} @r{Exponentiation} 81 -^D @r{End-of-file indicator} -% +@kbd{^D} @r{End-of-file indicator} +$ @end example @node Infix Calc @@ -1315,12 +1336,12 @@ Here is a sample run of @file{calc.y}: @need 500 @example -% calc -4 + 4.5 - (34/(8*3+-3)) +$ @kbd{calc} +@kbd{4 + 4.5 - (34/(8*3+-3))} 6.880952381 --56 + 2 +@kbd{-56 + 2} -54 -3 ^ 2 +@kbd{3 ^ 2} 9 @end example @@ -1372,13 +1393,11 @@ Bison programs. @cindex @code{ltcalc} @cindex calculator, location tracking -This example extends the infix notation calculator with location tracking. -This feature will be used to improve error reporting, and provide better -error messages. - -For the sake of clarity, we will switch for this example to an integer -calculator, since most of the work needed to use locations will be done -in the lexical analyser. +This example extends the infix notation calculator with location +tracking. This feature will be used to improve the error messages. For +the sake of clarity, this example is a simple integer calculator, since +most of the work needed to use locations will be done in the lexical +analyser. @menu * Decls: Ltcalc Decls. Bison and C declarations for ltcalc. @@ -1389,8 +1408,8 @@ in the lexical analyser. @node Ltcalc Decls @subsection Declarations for @code{ltcalc} -The C and Bison declarations for the location tracking calculator are the same -as the declarations for the infix notation calculator. +The C and Bison declarations for the location tracking calculator are +the same as the declarations for the infix notation calculator. @example /* Location tracking calculator. */ @@ -1411,22 +1430,24 @@ as the declarations for the infix notation calculator. 
%% /* Grammar follows */ @end example -In the code above, there are no declarations specific to locations. Defining -a data type for storing locations is not needed: we will use the type provided -by default (@pxref{Location Type, ,Data Types of Locations}), which is a four -member structure with the following integer fields: @code{first_line}, -@code{first_column}, @code{last_line} and @code{last_column}. +@noindent +Note there are no declarations specific to locations. Defining a data +type for storing locations is not needed: we will use the type provided +by default (@pxref{Location Type, ,Data Types of Locations}), which is a +four member structure with the following integer fields: +@code{first_line}, @code{first_column}, @code{last_line} and +@code{last_column}. @node Ltcalc Rules @subsection Grammar Rules for @code{ltcalc} -Whether you choose to handle locations or not has no effect on the syntax of -your language. Therefore, grammar rules for this example will be very close to -those of the previous example: we will only modify them to benefit from the new -informations we will have. +Whether handling locations or not has no effect on the syntax of your +language. Therefore, grammar rules for this example will be very close +to those of the previous example: we will only modify them to benefit +from the new information. -Here, we will use locations to report divisions by zero, and locate the wrong -expressions or subexpressions. +Here, we will use locations to report divisions by zero, and locate the +wrong expressions or subexpressions. 
@example @group @@ -1447,17 +1468,17 @@ exp : NUM @{ $$ = $1; @} | exp '-' exp @{ $$ = $1 - $3; @} | exp '*' exp @{ $$ = $1 * $3; @} @end group - | exp '/' exp @group + | exp '/' exp @{ if ($3) $$ = $1 / $3; else @{ $$ = 1; - printf("Division by zero, l%d,c%d-l%d,c%d", - @@3.first_line, @@3.first_column, - @@3.last_line, @@3.last_column); + fprintf (stderr, "%d.%d-%d.%d: division by zero", + @@3.first_line, @@3.first_column, + @@3.last_line, @@3.last_column); @} @} @end group @@ -1472,25 +1493,24 @@ This code shows how to reach locations inside of semantic actions, by using the pseudo-variables @code{@@@var{n}} for rule components, and the pseudo-variable @code{@@$} for groupings. -In this example, we never assign a value to @code{@@$}, because the -output parser can do this automatically. By default, before executing -the C code of each action, @code{@@$} is set to range from the beginning -of @code{@@1} to the end of @code{@@@var{n}}, for a rule with @var{n} -components. - -Of course, this behavior can be redefined (@pxref{Location Default -Action, , Default Action for Locations}), and for very specific rules, -@code{@@$} can be computed by hand. +We don't need to assign a value to @code{@@$}: the output parser does it +automatically. By default, before executing the C code of each action, +@code{@@$} is set to range from the beginning of @code{@@1} to the end +of @code{@@@var{n}}, for a rule with @var{n} components. This behavior +can be redefined (@pxref{Location Default Action, , Default Action for +Locations}), and for very specific rules, @code{@@$} can be computed by +hand. @node Ltcalc Lexer @subsection The @code{ltcalc} Lexical Analyzer. -Until now, we relied on Bison's defaults to enable location tracking. The next -step is to rewrite the lexical analyser, and make it able to feed the parser -with locations of tokens, as he already does for semantic values. +Until now, we relied on Bison's defaults to enable location +tracking. 
The next step is to rewrite the lexical analyser, and make it +able to feed the parser with the token locations, as it already does for +semantic values. -To do so, we must take into account every single character of the input text, -to avoid the computed locations of being fuzzy or wrong: +To this end, we must take into account every single character of the +input text, to avoid the computed locations of being fuzzy or wrong: @example @group @@ -1540,17 +1560,18 @@ yylex (void) @} @end example -Basically, the lexical analyzer does the same processing as before: it skips -blanks and tabs, and reads numbers or single-character tokens. In addition -to this, it updates the @code{yylloc} global variable (of type @code{YYLTYPE}), -where the location of tokens is stored. +Basically, the lexical analyzer performs the same processing as before: +it skips blanks and tabs, and reads numbers or single-character tokens. +In addition, it updates @code{yylloc}, the global variable (of type +@code{YYLTYPE}) containing the token's location. -Now, each time this function returns a token, the parser has it's number as -well as it's semantic value, and it's position in the text. The last needed -change is to initialize @code{yylloc}, for example in the controlling -function: +Now, each time this function returns a token, the parser has its number +as well as its semantic value, and its location in the text. The last +needed change is to initialize @code{yylloc}, for example in the +controlling function: @example +@group int main (void) @{ @@ -1558,11 +1579,12 @@ main (void) yylloc.first_column = yylloc.last_column = 0; return yyparse (); @} +@end group @end example -Remember that computing locations is not a matter of syntax. Every character -must be associated to a location update, whether it is in valid input, in -comments, in literal strings, and so on... +Remember that computing locations is not a matter of syntax. 
Every +character must be associated to a location update, whether it is in +valid input, in comments, in literal strings, and so on. @node Multi-function Calc @section Multi-Function Calculator: @code{mfcalc} @@ -1592,20 +1614,20 @@ to create named variables, store values in them, and use them later. Here is a sample session with the multi-function calculator: @example -% mfcalc -pi = 3.141592653589 +$ @kbd{mfcalc} +@kbd{pi = 3.141592653589} 3.1415926536 -sin(pi) +@kbd{sin(pi)} 0.0000000000 -alpha = beta1 = 2.3 +@kbd{alpha = beta1 = 2.3} 2.3000000000 -alpha +@kbd{alpha} 2.3000000000 -ln(alpha) +@kbd{ln(alpha)} 0.8329091229 -exp(ln(beta1)) +@kbd{exp(ln(beta1))} 2.3000000000 -% +$ @end example Note that multiple assignment and nested function calls are permitted. @@ -2375,7 +2397,8 @@ the numbers associated with @var{x} and @var{y}. In a simple program it may be sufficient to use the same data type for the semantic values of all language constructs. This was true in the -RPN and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish Notation Calculator}). +RPN and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish +Notation Calculator}). Bison's default is to use type @code{int} for all semantic values. To specify some other type, define @code{YYSTYPE} as a macro, like this: @@ -3068,11 +3091,11 @@ terminal symbol. All kinds of token declarations allow @findex %expect Bison normally warns if there are any conflicts in the grammar -(@pxref{Shift/Reduce, ,Shift/Reduce Conflicts}), but most real grammars have harmless shift/reduce -conflicts which are resolved in a predictable way and would be difficult to -eliminate. It is desirable to suppress the warning about these conflicts -unless the number of conflicts changes. You can do this with the -@code{%expect} declaration. 
+(@pxref{Shift/Reduce, ,Shift/Reduce Conflicts}), but most real grammars +have harmless shift/reduce conflicts which are resolved in a predictable +way and would be difficult to eliminate. It is desirable to suppress +the warning about these conflicts unless the number of conflicts +changes. You can do this with the @code{%expect} declaration. The declaration looks like this: @@ -3080,10 +3103,11 @@ The declaration looks like this: %expect @var{n} @end example -Here @var{n} is a decimal integer. The declaration says there should be no -warning if there are @var{n} shift/reduce conflicts and no reduce/reduce -conflicts. The usual warning is given if there are either more or fewer -conflicts, or if there are any reduce/reduce conflicts. +Here @var{n} is a decimal integer. The declaration says there should be +no warning if there are @var{n} shift/reduce conflicts and no +reduce/reduce conflicts. An error, instead of the usual warning, is +given if there are either more or fewer conflicts, or if there are any +reduce/reduce conflicts. In general, using @code{%expect} involves these steps: @@ -4907,8 +4931,14 @@ of the grammar file (@pxref{Prologue, , The Prologue}). Alternatively, use the @samp{-t} option when you run Bison (@pxref{Invocation, ,Invoking Bison}). We always define @code{YYDEBUG} so that debugging is always possible. -The trace facility uses @code{stderr}, so you must add -@w{@code{#include }} to the prologue unless it is already there. +The trace facility outputs messages with macro calls of the form +@code{YYFPRINTF (YYSTDERR, @var{format}, @var{args})} where +@var{format} and @var{args} are the usual @code{printf} format and +arguments. If you define @code{YYDEBUG} but do not define +@code{YYFPRINTF}, @code{} is automatically included and the +macros are defined to @code{fprintf} and @code{stderr}. In the same +situation, C++ parsers include @code{} instead, and use +@code{std::fprintf} and @code{std::stderr}. 
Once you have compiled the program with trace facilities, the way to request a trace is to store a nonzero value in the variable @code{yydebug}. @@ -4987,7 +5017,7 @@ Here @var{infile} is the grammar file name, which usually ends in @samp{.y}. The parser file's name is made by replacing the @samp{.y} with @samp{.tab.c}. Thus, the @samp{bison foo.y} filename yields @file{foo.tab.c}, and the @samp{bison hack/foo.y} filename yields -@file{hack/foo.tab.c}. It's is also possible, in case you are writting +@file{hack/foo.tab.c}. It's is also possible, in case you are writing C++ code instead of C in your grammar file, to name it @file{foo.ypp} or @file{foo.y++}. Then, the output files will take an extention like the given one as input (repectively @file{foo.tab.cpp} and @file{foo.tab.c++}).
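
The @code{ltcalc} lexer hunk above describes a scanner that counts every character it consumes and records each token's location. Below is a minimal sketch of that idea in C. It does not reproduce the manual's @code{yylex}: it reads from a string rather than standard input, and the names @code{loc_t}, @code{scanner_t}, @code{scanner_init}, @code{scanner_next} and @code{TOK_NUM} are hypothetical stand-ins for @code{YYLTYPE}, the @code{yylval}/@code{yylloc} globals and @code{NUM}. Starting at line 1, column 0 is also an assumption, modeled on the @code{main} shown in the hunk.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the default location type:
   a structure with four integer fields.  */
typedef struct
{
  int first_line, first_column, last_line, last_column;
} loc_t;

/* Hypothetical token code for numbers; single-character tokens
   are returned as their own character code.  */
enum { TOK_NUM = 258 };

/* Hypothetical scanner state, used here in place of the
   yylval and yylloc globals.  */
typedef struct
{
  const char *input;  /* NUL-terminated text being scanned.  */
  size_t pos;         /* Index of the next unread character.  */
  int value;          /* Semantic value of the last TOK_NUM.  */
  loc_t loc;          /* Location of the last token returned.  */
} scanner_t;

/* Assumed starting point: line 1, column 0.  */
void
scanner_init (scanner_t *s, const char *input)
{
  assert (s != NULL && input != NULL);
  s->input = input;
  s->pos = 0;
  s->value = 0;
  s->loc.first_line = s->loc.last_line = 1;
  s->loc.first_column = s->loc.last_column = 0;
}

/* Return the next token code, or 0 at end of input.  */
int
scanner_next (scanner_t *s)
{
  unsigned char c;

  assert (s != NULL);

  /* Skip blanks and tabs, but still count their columns.  */
  while ((c = (unsigned char) s->input[s->pos]) == ' ' || c == '\t')
    {
      ++s->pos;
      ++s->loc.last_column;
    }

  /* The new token starts where the previous text ended.  */
  s->loc.first_line = s->loc.last_line;
  s->loc.first_column = s->loc.last_column;

  /* End of input.  */
  if (c == '\0')
    return 0;

  /* Numbers: accumulate the value, one column per digit.  */
  if (isdigit (c))
    {
      s->value = 0;
      while (isdigit (c = (unsigned char) s->input[s->pos]))
        {
          s->value = s->value * 10 + (c - '0');
          ++s->pos;
          ++s->loc.last_column;
        }
      return TOK_NUM;
    }

  /* Single-character token: a newline moves to the next line,
     anything else advances one column.  */
  ++s->pos;
  if (c == '\n')
    {
      ++s->loc.last_line;
      s->loc.last_column = 0;
    }
  else
    ++s->loc.last_column;
  return c;
}
```

Keeping the state in a structure rather than in globals is only a convenience for trying the sketch in isolation. A lexer for a Bison parser would store the same information in @code{yylval} and @code{yylloc}, as the hunk describes.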