X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/676385e29c4aedfc05d20daf1ef20cd4ccc84856..b63473552d7014a4868eb32bf90e4f4814b9be39:/doc/bison.texinfo diff --git a/doc/bison.texinfo b/doc/bison.texinfo index e4eff35f..16b8685a 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -34,46 +34,33 @@ @end ifinfo @comment %**end of header -@ifinfo -@format -START-INFO-DIR-ENTRY -* bison: (bison). GNU Project parser generator (yacc replacement). -END-INFO-DIR-ENTRY -@end format -@end ifinfo +@copying -@ifinfo -This file documents the Bison parser generator. - -Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999, -2000, 2001, 2002 -Free Software Foundation, Inc. - -Permission is granted to make and distribute verbatim copies of -this manual provided the copyright notice and this permission notice -are preserved on all copies. - -@ignore -Permission is granted to process this file through Tex and print the -results, provided the printed document carries copying permission -notice identical to this one except for the removal of this paragraph -(this paragraph not being relevant to the printed manual). - -@end ignore -Permission is granted to copy and distribute modified versions of this -manual under the conditions for verbatim copying, provided also that the -sections entitled ``GNU General Public License'' and ``Conditions for -Using Bison'' are included exactly as in the original, and provided that -the entire resulting derived work is distributed under the terms of a -permission notice identical to this one. - -Permission is granted to copy and distribute translations of this manual -into another language, under the above conditions for modified versions, -except that the sections entitled ``GNU General Public License'', -``Conditions for Using Bison'' and this permission notice may be -included in translations approved by the Free Software Foundation -instead of in the original English. -@end ifinfo +This manual is for GNU Bison (version @value{VERSION}, @value{UPDATED}), +the GNU parser generator. + +Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, +1999, 2000, 2001, 2002 Free Software Foundation, Inc. + +@quotation +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.1 or +any later version published by the Free Software Foundation; with no +Invariant Sections, with the Front-Cover texts being ``A GNU Manual,'' +and with the Back-Cover Texts as in (a) below. A copy of the +license is included in the section entitled ``GNU Free Documentation +License.'' + +(a) The FSF's Back-Cover Text is: ``You have freedom to copy and modify +this GNU Manual, like GNU software. Copies published by the Free +Software Foundation raise funds for GNU development.'' +@end quotation +@end copying + +@dircategory GNU programming tools +@direntry +* bison: (bison). GNU parser generator (yacc replacement). +@end direntry @ifset shorttitlepage-enabled @shorttitlepage Bison @@ -87,41 +74,13 @@ instead of in the original English. @page @vskip 0pt plus 1filll -Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, -1999, 2000, 2001, 2002 -Free Software Foundation, Inc. - +@insertcopying @sp 2 Published by the Free Software Foundation @* 59 Temple Place, Suite 330 @* Boston, MA 02111-1307 USA @* Printed copies are available from the Free Software Foundation.@* ISBN 1-882114-44-2 - -Permission is granted to make and distribute verbatim copies of -this manual provided the copyright notice and this permission notice -are preserved on all copies. - -@ignore -Permission is granted to process this file through TeX and print the -results, provided the printed document carries copying permission -notice identical to this one except for the removal of this paragraph -(this paragraph not being relevant to the printed manual). - -@end ignore -Permission is granted to copy and distribute modified versions of this -manual under the conditions for verbatim copying, provided also that the -sections entitled ``GNU General Public License'' and ``Conditions for -Using Bison'' are included exactly as in the original, and provided that -the entire resulting derived work is distributed under the terms of a -permission notice identical to this one. - -Permission is granted to copy and distribute translations of this manual -into another language, under the above conditions for modified versions, -except that the sections entitled ``GNU General Public License'', -``Conditions for Using Bison'' and this permission notice may be -included in translations approved by the Free Software Foundation -instead of in the original English. @sp 2 Cover art by Etienne Suvasa. @end titlepage @@ -131,9 +90,7 @@ Cover art by Etienne Suvasa. @ifnottex @node Top @top Bison - -This manual documents version @value{VERSION} of Bison, updated -@value{UPDATED}. +@insertcopying @end ifnottex @menu @@ -348,9 +305,9 @@ This edition corresponds to version @value{VERSION} of Bison. @unnumbered Conditions for Using Bison As of Bison version 1.24, we have changed the distribution terms for -@code{yyparse} to permit using Bison's output in nonfree programs. -Formerly, Bison parsers could be used only in programs that were free -software. +@code{yyparse} to permit using Bison's output in nonfree programs when +Bison is generating C code for LALR(1) parsers. Formerly, these +parsers could be used only in programs that were free software. The other GNU programming tools, such as the GNU C compiler, have never had such a requirement. They could always be used for nonfree @@ -372,6 +329,13 @@ encourage people to make other software free. So we decided to make the practical conditions for using Bison match the practical conditions for using the other GNU tools. +This exception applies only when Bison is generating C code for a +LALR(1) parser; otherwise, the GPL terms operate as usual. You can +tell whether the exception applies to your @samp{.c} output file by +inspecting it to see whether it says ``As a special exception, when +this file is copied by Bison into a Bison output file, you may use +that output file without restriction.'' + @include gpl.texi @node Concepts @@ -438,18 +402,18 @@ Mysterious Reduce/Reduce Conflicts}, for more information on this. @cindex ambiguous grammars @cindex non-deterministic parsing Parsers for LALR(1) grammars are @dfn{deterministic}, meaning roughly that -the next grammar rule to apply at any point in the input is uniquely +the next grammar rule to apply at any point in the input is uniquely determined by the preceding input and a fixed, finite portion (called a @dfn{look-ahead}) of the remaining input. -A context-free grammar can be @dfn{ambiguous}, meaning that +A context-free grammar can be @dfn{ambiguous}, meaning that there are multiple ways to apply the grammar rules to get the some inputs. Even unambiguous grammars can be @dfn{non-deterministic}, meaning that no fixed look-ahead always suffices to determine the next grammar rule to apply. -With the proper declarations, Bison is also able to parse these more general -context-free grammars, using a technique known as GLR parsing (for -Generalized LR). Bison's GLR parsers are able to handle any context-free -grammar for which the number of possible parses of any given string -is finite. +With the proper declarations, Bison is also able to parse these more general +context-free grammars, using a technique known as GLR parsing (for +Generalized LR). Bison's GLR parsers are able to handle any context-free +grammar for which the number of possible parses of any given string +is finite. @cindex symbols (abstract) @cindex token @@ -702,7 +666,7 @@ involved, or by performing both actions, and then calling a designated user-defined function on the resulting values to produce an arbitrary merged result. -Let's consider an example, vastly simplified from C++. +Let's consider an example, vastly simplified from C++. @example %@{ @@ -718,7 +682,7 @@ Let's consider an example, vastly simplified from C++. %% -prog : +prog : | prog stmt @{ printf ("\n"); @} ; @@ -727,13 +691,13 @@ stmt : expr ';' %dprec 1 ; expr : ID @{ printf ("%s ", $$); @} - | TYPENAME '(' expr ')' + | TYPENAME '(' expr ')' @{ printf ("%s ", $1); @} | expr '+' expr @{ printf ("+ "); @} | expr '=' expr @{ printf ("= "); @} ; -decl : TYPENAME declarator ';' +decl : TYPENAME declarator ';' @{ printf ("%s ", $1); @} | TYPENAME declarator '=' expr ';' @{ printf ("%s ", $1); @} @@ -756,14 +720,14 @@ T (x) = y+z; parses as either an @code{expr} or a @code{stmt} (assuming that @samp{T} is recognized as a TYPENAME and @samp{x} as an ID). Bison detects this as a reduce/reduce conflict between the rules -@code{expr : ID} and @code{declarator : ID}, which it cannot resolve at the -time it encounters @code{x} in the example above. The two @code{%dprec} -declarations, however, give precedence to interpreting the example as a +@code{expr : ID} and @code{declarator : ID}, which it cannot resolve at the +time it encounters @code{x} in the example above. The two @code{%dprec} +declarations, however, give precedence to interpreting the example as a @code{decl}, which implies that @code{x} is a declarator. The parser therefore prints @example -"x" y z + T +"x" y z + T @end example Consider a different input string for this parser: @@ -783,7 +747,7 @@ assuming @code{x} is a @code{declarator}. The second of these parsers then vanishes when it sees @code{+}, and the parser prints @example -x T y + +x T y + @end example Suppose that instead of resolving the ambiguity, you wanted to see all @@ -826,7 +790,7 @@ With these declarations, the resulting parser will parse the first example as both an @code{expr} and a @code{decl}, and print @example -"x" y z + T x T y z + = +"x" y z + T x T y z + = @end example @@ -3595,10 +3559,10 @@ Request a pure (reentrant) parser program (@pxref{Pure Decl, ,A Pure @item %token-table Generate an array of token names in the parser file. The name of the array is @code{yytname}; @code{yytname[@var{i}]} is the name of the -token whose internal Bison token code number is @var{i}. The first three -elements of @code{yytname} are always @code{"$"}, @code{"error"}, and -@code{"$illegal"}; after these come the symbols defined in the grammar -file. +token whose internal Bison token code number is @var{i}. The first +three elements of @code{yytname} are always @code{"$end"}, +@code{"error"}, and @code{"$undefined"}; after these come the symbols +defined in the grammar file. For single-character literal tokens and literal string tokens, the name in the table includes the single-quote or double-quote characters: for @@ -4837,24 +4801,24 @@ return_spec: ; @end example -@node Generalized LR Parsing +@node Generalized LR Parsing @section Generalized LR (GLR) Parsing @cindex GLR parsing @cindex generalized LR (GLR) parsing @cindex ambiguous grammars @cindex non-deterministic parsing -Bison produces @emph{deterministic} parsers that choose uniquely -when to reduce and which reduction to apply +Bison produces @emph{deterministic} parsers that choose uniquely +when to reduce and which reduction to apply based on a summary of the preceding input and on one extra token of lookahead. As a result, normal Bison handles a proper subset of the family of context-free languages. -Ambiguous grammars, since they have strings with more than one possible +Ambiguous grammars, since they have strings with more than one possible sequence of reductions cannot have deterministic parsers in this sense. The same is true of languages that require more than one symbol of lookahead, since the parser lacks the information necessary to make a decision at the point it must be made in a shift-reduce parser. -Finally, as previously mentioned (@pxref{Mystery Conflicts}), +Finally, as previously mentioned (@pxref{Mystery Conflicts}), there are languages where Bison's particular choice of how to summarize the input seen so far loses necessary information. @@ -4863,23 +4827,23 @@ Bison generates a parser that uses a different algorithm, called Generalized LR (or GLR). A Bison GLR parser uses the same basic algorithm for parsing as an ordinary Bison parser, but behaves differently in cases where there is a shift-reduce conflict that has not -been resolved by precedence rules (@pxref{Precedence}) or a +been resolved by precedence rules (@pxref{Precedence}) or a reduce-reduce conflict. When a GLR parser encounters such a situation, it -effectively @emph{splits} into a several parsers, one for each possible +effectively @emph{splits} into a several parsers, one for each possible shift or reduction. These parsers then proceed as usual, consuming tokens in lock-step. Some of the stacks may encounter other conflicts -and split further, with the result that instead of a sequence of states, -a Bison GLR parsing stack is what is in effect a tree of states. +and split further, with the result that instead of a sequence of states, +a Bison GLR parsing stack is what is in effect a tree of states. In effect, each stack represents a guess as to what the proper parse is. Additional input may indicate that a guess was wrong, in which case the appropriate stack silently disappears. Otherwise, the semantics -actions generated in each stack are saved, rather than being executed +actions generated in each stack are saved, rather than being executed immediately. When a stack disappears, its saved semantic actions never -get executed. When a reduction causes two stacks to become equivalent, +get executed. When a reduction causes two stacks to become equivalent, their sets of semantic actions are both saved with the state that results from the reduction. We say that two stacks are equivalent -when they both represent the same sequence of states, +when they both represent the same sequence of states, and each pair of corresponding states represents a grammar symbol that produces the same segment of the input token stream. @@ -4891,16 +4855,16 @@ At this transition, some of the states on the stack will have semantic values that are sets (actually multisets) of possible actions. The parser tries to pick one of the actions by first finding one whose rule has the highest dynamic precedence, as set by the @samp{%dprec} -declaration. Otherwise, if the alternative actions are not ordered by +declaration. Otherwise, if the alternative actions are not ordered by precedence, but there the same merging function is declared for both -rules by the @samp{%merge} declaration, +rules by the @samp{%merge} declaration, Bison resolves and evaluates both and then calls the merge function on the result. Otherwise, it reports an ambiguity. It is possible to use a data structure for the GLR parsing tree that permits the processing of any LALR(1) grammar in linear time (in the size of the input), any unambiguous (not necessarily LALR(1)) grammar in -quadratic worst-case time, and any general (possibly ambiguous) +quadratic worst-case time, and any general (possibly ambiguous) context-free grammar in cubic worst-case time. However, Bison currently uses a simpler data structure that requires time proportional to the length of the input times the maximum number of stacks required for any @@ -5319,12 +5283,19 @@ useless: STR; %% @end example -@command{bison} reports that @samp{calc.y contains 1 useless nonterminal -and 1 useless rule} and that @samp{calc.y contains 7 shift/reduce -conflicts}. When given @option{--report=state}, in addition to -@file{calc.tab.c}, it creates a file @file{calc.output} with contents -detailed below. The order of the output and the exact presentation -might vary, but the interpretation is the same. +@command{bison} reports: + +@example +calc.y: warning: 1 useless nonterminal and 1 useless rule +calc.y:11.1-7: warning: useless nonterminal: useless +calc.y:11.8-12: warning: useless rule: useless: STR +calc.y contains 7 shift/reduce conflicts. +@end example + +When given @option{--report=state}, in addition to @file{calc.tab.c}, it +creates a file @file{calc.output} with contents detailed below. The +order of the output and the exact presentation might vary, but the +interpretation is the same. The first section includes details on conflicts that were solved thanks to precedence and/or associativity: @@ -5377,7 +5348,7 @@ The next section reproduces the exact grammar that Bison used: Grammar Number, Line, Rule - 0 5 $axiom -> exp $ + 0 5 $accept -> exp $end 1 5 exp -> exp '+' exp 2 6 exp -> exp '-' exp 3 7 exp -> exp '*' exp @@ -5391,7 +5362,7 @@ and reports the uses of the symbols: @example Terminals, with rules where they appear -$ (0) 0 +$end (0) 0 '*' (42) 3 '+' (43) 1 '-' (45) 2 @@ -5401,7 +5372,7 @@ NUM (258) 5 Nonterminals, with rules where they appear -$axiom (8) +$accept (8) on left: 0 exp (9) on left: 1 2 3 4 5, on right: 0 1 2 3 4 @@ -5419,7 +5390,7 @@ that the input cursor. @example state 0 - $axiom -> . exp $ (rule 0) + $accept -> . exp $ (rule 0) NUM shift, and go to state 1 @@ -5450,7 +5421,7 @@ be derived: @example state 0 - $axiom -> . exp $ (rule 0) + $accept -> . exp $ (rule 0) exp -> . exp '+' exp (rule 1) exp -> . exp '-' exp (rule 2) exp -> . exp '*' exp (rule 3) @@ -5482,7 +5453,7 @@ jump to state 2 (@samp{exp: go to state 2}). @example state 2 - $axiom -> exp . $ (rule 0) + $accept -> exp . $ (rule 0) exp -> exp . '+' exp (rule 1) exp -> exp . '-' exp (rule 2) exp -> exp . '*' exp (rule 3) @@ -5509,7 +5480,7 @@ state}: @example state 3 - $axiom -> exp $ . (rule 0) + $accept -> exp $ . (rule 0) $default accept @end example @@ -6049,7 +6020,7 @@ would instead be named @file{foo_tab.c}. @table @code @item @@$ In an action, the location of the left-hand side of the rule. - @xref{Locations, , Locations Overview}. +@xref{Locations, , Locations Overview}. @item @@@var{n} In an action, the location of the @var{n}-th symbol of the right-hand @@ -6063,6 +6034,20 @@ In an action, the semantic value of the left-hand side of the rule. In an action, the semantic value of the @var{n}-th symbol of the right-hand side of the rule. @xref{Actions}. +@item $accept +The predefined nonterminal whose only rule is @samp{$accept: @var{start} +$end}, where @var{start} is the start symbol. @xref{Start Decl, , The +Start-Symbol}. It cannot be used in the grammar. + +@item $end +The predefined token marking the end of the token stream. It cannot be +used in the grammar. + +@item $undefined +The predefined token onto which all undefined values returned by +@code{yylex} are mapped. It cannot be used in the grammar, rather, use +@code{error}. + @item error A token name reserved for error recovery. This token may be used in grammar rules so as to allow the Bison parser to recognize an error in @@ -6201,7 +6186,7 @@ Equip the parser for debugging. @xref{Decl Summary}. Bison declaration to create a header file meant for the scanner. @xref{Decl Summary}. -@item %dprec +@item %dprec Bison declaration to assign a precedence to a rule that is used at parse time to resolve reduce/reduce conflicts. @xref{GLR Parsers}. @@ -6226,7 +6211,7 @@ Bison declaration to assign left associativity to token(s). @item %merge Bison declaration to assign a merging function to a rule. If there is a -reduce/reduce conflict with a rule having the same merging function, the +reduce/reduce conflict with a rule having the same merging function, the function is applied to the two semantic values to get a single result. @xref{GLR Parsers}. @@ -6344,7 +6329,7 @@ rules. @xref{Algorithm, ,The Bison Parser Algorithm }. @item Generalized LR (GLR) A parsing algorithm that can handle all context-free grammars, including those -that are not LALR(1). It resolves situations that Bison's usual LALR(1) +that are not LALR(1). It resolves situations that Bison's usual LALR(1) algorithm cannot by effectively splitting off multiple parsers, trying all possible parsers, and discarding those that fail in the light of additional right context. @xref{Generalized LR Parsing, ,Generalized LR Parsing}.