X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/62ab6972e8338613d09562166b4b4fa5f48693a4..dd3127cf7a8534b1689d19452e310138d8db784f:/doc/bison.info-3
diff --git a/doc/bison.info-3 b/doc/bison.info-3
index 06106f47..7640c5c0 100644
--- a/doc/bison.info-3
+++ b/doc/bison.info-3
@@ -1,5 +1,5 @@
-Ceci est le fichier Info bison.info, produit par Makeinfo version 4.0 à
-partir bison.texinfo.
+Ceci est le fichier Info bison.info, produit par Makeinfo version 4.0b
+à partir bison.texinfo.
 
 START-INFO-DIR-ENTRY
 * bison: (bison). GNU Project parser generator (yacc replacement).
@@ -8,7 +8,7 @@ END-INFO-DIR-ENTRY
    This file documents the Bison parser generator.
 
 Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999,
-2000 Free Software Foundation, Inc.
+2000, 2001 Free Software Foundation, Inc.
 
    Permission is granted to make and distribute verbatim copies of this
 manual provided the copyright notice and this permission notice are
@@ -52,7 +52,7 @@ N to use in `$N'.
 set its value with an assignment to `$$', and actions later in the rule
 can refer to the value using `$N'. Since there is no symbol to name the
 action, there is no way to declare a data type for the value in
-advance, so you must use the `$<...>' construct to specify a data type
+advance, so you must use the `$<...>N' construct to specify a data type
 each time you refer to this value.
 
    There is no way to set the value of the entire rule with a mid-rule
@@ -158,7 +158,143 @@ converted to an end-of-rule action in this way, and this is what Bison
 actually does to implement mid-rule actions.
 
-File: bison.info, Node: Declarations, Next: Multiple Parsers, Prev: Semantics, Up: Grammar File
+File: bison.info, Node: Locations, Next: Declarations, Prev: Semantics, Up: Grammar File
+
+Tracking Locations
+==================
+
+   Though grammar rules and semantic actions are enough to write a fully
+functional parser, it can be useful to process some additionnal
+informations, especially symbol locations.
+
+   The way locations are handled is defined by providing a data type,
+and actions to take when rules are matched.
+
+* Menu:
+
+* Location Type::               Specifying a data type for locations.
+* Actions and Locations::       Using locations in actions.
+* Location Default Action::     Defining a general way to compute locations.
+
+
+File: bison.info, Node: Location Type, Next: Actions and Locations, Up: Locations
+
+Data Type of Locations
+----------------------
+
+   Defining a data type for locations is much simpler than for semantic
+values, since all tokens and groupings always use the same type.
+
+   The type of locations is specified by defining a macro called
+`YYLTYPE'. When `YYLTYPE' is not defined, Bison uses a default
+structure type with four members:
+
+     struct
+     {
+       int first_line;
+       int first_column;
+       int last_line;
+       int last_column;
+     }
+
+
+File: bison.info, Node: Actions and Locations, Next: Location Default Action, Prev: Location Type, Up: Locations
+
+Actions and Locations
+---------------------
+
+   Actions are not only useful for defining language semantics, but
+also for describing the behavior of the output parser with locations.
+
+   The most obvious way for building locations of syntactic groupings
+is very similar to the way semantic values are computed. In a given
+rule, several constructs can be used to access the locations of the
+elements being matched. The location of the Nth component of the right
+hand side is `@N', while the location of the left hand side grouping is
+`@$'.
+
+   Here is a basic example using the default data type for locations:
+
+     exp:    ...
+             | exp '/' exp
+                 {
+                   @$.first_column = @1.first_column;
+                   @$.first_line = @1.first_line;
+                   @$.last_column = @3.last_column;
+                   @$.last_line = @3.last_line;
+                   if ($3)
+                     $$ = $1 / $3;
+                   else
+                     {
+                       $$ = 1;
+                       printf("Division by zero, l%d,c%d-l%d,c%d",
+                              @3.first_line, @3.first_column,
+                              @3.last_line, @3.last_column);
+                     }
+                 }
+
+   As for semantic values, there is a default action for locations that
+is run each time a rule is matched. It sets the beginning of `@$' to the
+beginning of the first symbol, and the end of `@$' to the end of the
+last symbol.
+
+   With this default action, the location tracking can be fully
+automatic. The example above simply rewrites this way:
+
+     exp:    ...
+             | exp '/' exp
+                 {
+                   if ($3)
+                     $$ = $1 / $3;
+                   else
+                     {
+                       $$ = 1;
+                       printf("Division by zero, l%d,c%d-l%d,c%d",
+                              @3.first_line, @3.first_column,
+                              @3.last_line, @3.last_column);
+                     }
+                 }
+
+
+File: bison.info, Node: Location Default Action, Prev: Actions and Locations, Up: Locations
+
+Default Action for Locations
+----------------------------
+
+   Actually, actions are not the best place to compute locations. Since
+locations are much more general than semantic values, there is room in
+the output parser to redefine the default action to take for each rule.
+The `YYLLOC_DEFAULT' macro is called each time a rule is matched,
+before the associated action is run.
+
+   Most of the time, this macro is general enough to suppress location
+dedicated code from semantic actions.
+
+   The `YYLLOC_DEFAULT' macro takes three parameters. The first one is
+the location of the grouping (the result of the computation). The
+second one is an array holding locations of all right hand side
+elements of the rule being matched. The last one is the size of the
+right hand side rule.
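Reviewer sketch (not part of the patch): the three-parameter contract described in the added text can be exercised outside a generated parser. The names `loc_t`, `LOC_DEFAULT` and `span` below are hypothetical stand-ins for illustration, not Bison identifiers, and the sketch assumes the result starts out as a copy of the first right hand side component, so only its end needs updating.

```c
/* Mirrors the four-member default location structure from the text.
   `loc_t' is a hypothetical name; Bison's own type is YYLTYPE. */
typedef struct {
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} loc_t;

/* Hypothetical stand-in for a three-parameter location macro.
   Wrapped in do/while so it behaves as a single statement.
   Assumption: Current already holds a copy of Rhs[1]. */
#define LOC_DEFAULT(Current, Rhs, N)                  \
  do {                                                \
    (Current).last_line = (Rhs)[N].last_line;         \
    (Current).last_column = (Rhs)[N].last_column;     \
  } while (0)

/* Compute the location of a grouping from its n components.
   rhs[0] is unused so that valid indexes run from 1 to n. */
loc_t span(const loc_t *rhs, int n)
{
  loc_t current = rhs[1];       /* start of the grouping = start of component 1 */
  LOC_DEFAULT(current, rhs, n); /* end of the grouping = end of component n */
  return current;
}
```

Leaving slot 0 of the array unused is a design choice of this sketch; it keeps the indexes aligned with the `@1' ... `@N' notation used in actions.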
+
+   By default, it is defined this way:
+
+     #define YYLLOC_DEFAULT(Current, Rhs, N)          \
+       Current.last_line   = Rhs[N].last_line;        \
+       Current.last_column = Rhs[N].last_column;
+
+   When defining `YYLLOC_DEFAULT', you should consider that:
+
+   * All arguments are free of side-effects. However, only the first
+     one (the result) should be modified by `YYLLOC_DEFAULT'.
+
+   * Before `YYLLOC_DEFAULT' is executed, the output parser sets `@$'
+     to `@1'.
+
+   * For consistency with semantic actions, valid indexes for the
+     location array range from 1 to N.
+
+
+File: bison.info, Node: Declarations, Next: Multiple Parsers, Prev: Locations, Up: Grammar File
 
 Bison Declarations
 ==================
@@ -790,16 +926,18 @@ File: bison.info, Node: Token Positions, Next: Pure Calling, Prev: Token Valu
 Textual Positions of Tokens
 ---------------------------
 
-   If you are using the `@N'-feature (*note Special Features for Use in
-Actions: Action Features.) in actions to keep track of the textual
-locations of tokens and groupings, then you must provide this
-information in `yylex'. The function `yyparse' expects to find the
-textual location of a token just parsed in the global variable
-`yylloc'. So `yylex' must store the proper data in that variable. The
-value of `yylloc' is a structure and you need only initialize the
-members that are going to be used by the actions. The four members are
-called `first_line', `first_column', `last_line' and `last_column'.
-Note that the use of this feature makes the parser noticeably slower.
+   If you are using the `@N'-feature (*note Tracking Locations:
+Locations.) in actions to keep track of the textual locations of tokens
+and groupings, then you must provide this information in `yylex'. The
+function `yyparse' expects to find the textual location of a token just
+parsed in the global variable `yylloc'. So `yylex' must store the
+proper data in that variable.
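Reviewer sketch (not part of the patch): one way a hand-written scanner might record the position of each token before returning it. Everything here is an assumption made for illustration: `scanner_t`, `advance`, `next_token` and `cur_loc` are hypothetical names (`cur_loc` plays the role the text assigns to `yylloc'), lines and columns are 1-based, and `last_column' is the column of the token's final character.

```c
#include <ctype.h>

/* Same four members as the default location structure in the text. */
typedef struct {
  int first_line;
  int first_column;
  int last_line;
  int last_column;
} loc_t;

/* Hypothetical scanner state: remaining input plus current position. */
typedef struct {
  const char *p;
  int line;
  int column;
} scanner_t;

/* Stand-in for the global location variable the parser would read. */
static loc_t cur_loc;

/* Consume one character and keep line/column in step with it. */
static void advance(scanner_t *s)
{
  if (*s->p == '\n') {
    s->line++;
    s->column = 1;
  } else {
    s->column++;
  }
  s->p++;
}

/* Return 0 at end of input, otherwise the first character of the token.
   A run of digits forms one token; any other character is a token by
   itself. The token's location is stored in cur_loc before returning. */
int next_token(scanner_t *s)
{
  while (*s->p == ' ' || *s->p == '\t' || *s->p == '\n')
    advance(s);
  if (*s->p == '\0')
    return 0;

  int kind = (unsigned char) *s->p;
  cur_loc.first_line = s->line;
  cur_loc.first_column = s->column;
  do {
    /* Remember where the most recently consumed character sits. */
    cur_loc.last_line = s->line;
    cur_loc.last_column = s->column;
    advance(s);
  } while (isdigit(kind) && isdigit((unsigned char) *s->p));
  return kind;
}
```

A scanner that only needs line numbers could fill in just `first_line' and `last_line'; the inclusive end column is merely the convention chosen for this sketch.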
+
+   By default, the value of `yylloc' is a structure and you need only
+initialize the members that are going to be used by the actions. The
+four members are called `first_line', `first_column', `last_line' and
+`last_column'. Note that the use of this feature makes the parser
+noticeably slower.
 
    The data type of `yylloc' has the name `YYLTYPE'.
 
@@ -1023,25 +1161,15 @@ useful in actions.
      errors. This is useful primarily in error rules. *Note Error
      Recovery::.
 
-`@N'
-     Acts like a structure variable containing information on the line
-     numbers and column numbers of the Nth component of the current
-     rule. The structure has four members, like this:
+`@$'
+     Acts like a structure variable containing information on the
+     textual position of the grouping made by the current rule. *Note
+     Tracking Locations: Locations.
 
-          struct {
-            int first_line, last_line;
-            int first_column, last_column;
-          };
-
-     Thus, to get the starting line number of the third component, you
-     would use `@3.first_line'.
-
-     In order for the members of this structure to contain valid
-     information, you must make `yylex' supply this information about
-     each token. If you need only certain members, then `yylex' need
-     only fill in those members.
-
-     The use of this feature makes the parser noticeably slower.
+`@N'
+     Acts like a structure variable containing information on the
+     textual position of the Nth component of the current rule. *Note
+     Tracking Locations: Locations.
 
 
 File: bison.info, Node: Algorithm, Next: Error Recovery, Prev: Interface, Up: Top
@@ -1150,141 +1278,3 @@ sequence.
    The current look-ahead token is stored in the variable `yychar'.
 *Note Special Features for Use in Actions: Action Features.
 
-
-File: bison.info, Node: Shift/Reduce, Next: Precedence, Prev: Look-Ahead, Up: Algorithm
-
-Shift/Reduce Conflicts
-======================
-
-   Suppose we are parsing a language which has if-then and if-then-else
-statements, with a pair of rules like this:
-
-     if_stmt:
-               IF expr THEN stmt
-             | IF expr THEN stmt ELSE stmt
-             ;
-
-Here we assume that `IF', `THEN' and `ELSE' are terminal symbols for
-specific keyword tokens.
-
-   When the `ELSE' token is read and becomes the look-ahead token, the
-contents of the stack (assuming the input is valid) are just right for
-reduction by the first rule. But it is also legitimate to shift the
-`ELSE', because that would lead to eventual reduction by the second
-rule.
-
-   This situation, where either a shift or a reduction would be valid,
-is called a "shift/reduce conflict". Bison is designed to resolve
-these conflicts by choosing to shift, unless otherwise directed by
-operator precedence declarations. To see the reason for this, let's
-contrast it with the other alternative.
-
-   Since the parser prefers to shift the `ELSE', the result is to attach
-the else-clause to the innermost if-statement, making these two inputs
-equivalent:
-
-     if x then if y then win (); else lose;
-
-     if x then do; if y then win (); else lose; end;
-
-   But if the parser chose to reduce when possible rather than shift,
-the result would be to attach the else-clause to the outermost
-if-statement, making these two inputs equivalent:
-
-     if x then if y then win (); else lose;
-
-     if x then do; if y then win (); end; else lose;
-
-   The conflict exists because the grammar as written is ambiguous:
-either parsing of the simple nested if-statement is legitimate. The
-established convention is that these ambiguities are resolved by
-attaching the else-clause to the innermost if-statement; this is what
-Bison accomplishes by choosing to shift rather than reduce. (It would
-ideally be cleaner to write an unambiguous grammar, but that is very
-hard to do in this case.) This particular ambiguity was first
-encountered in the specifications of Algol 60 and is called the
-"dangling `else'" ambiguity.
-
-   To avoid warnings from Bison about predictable, legitimate
-shift/reduce conflicts, use the `%expect N' declaration. There will be
-no warning as long as the number of shift/reduce conflicts is exactly N.
-*Note Suppressing Conflict Warnings: Expect Decl.
-
-   The definition of `if_stmt' above is solely to blame for the
-conflict, but the conflict does not actually appear without additional
-rules. Here is a complete Bison input file that actually manifests the
-conflict:
-
-     %token IF THEN ELSE variable
-     %%
-     stmt:     expr
-             | if_stmt
-             ;
-
-     if_stmt:
-               IF expr THEN stmt
-             | IF expr THEN stmt ELSE stmt
-             ;
-
-     expr:     variable
-             ;
-
-
-File: bison.info, Node: Precedence, Next: Contextual Precedence, Prev: Shift/Reduce, Up: Algorithm
-
-Operator Precedence
-===================
-
-   Another situation where shift/reduce conflicts appear is in
-arithmetic expressions. Here shifting is not always the preferred
-resolution; the Bison declarations for operator precedence allow you to
-specify when to shift and when to reduce.
-
-* Menu:
-
-* Why Precedence::              An example showing why precedence is needed.
-* Using Precedence::            How to specify precedence in Bison grammars.
-* Precedence Examples::         How these features are used in the previous example.
-* How Precedence::              How they work.
-
-
-File: bison.info, Node: Why Precedence, Next: Using Precedence, Up: Precedence
-
-When Precedence is Needed
--------------------------
-
-   Consider the following ambiguous grammar fragment (ambiguous because
-the input `1 - 2 * 3' can be parsed in two different ways):
-
-     expr:     expr '-' expr
-             | expr '*' expr
-             | expr '<' expr
-             | '(' expr ')'
-             ...
-             ;
-
-Suppose the parser has seen the tokens `1', `-' and `2'; should it
-reduce them via the rule for the subtraction operator? It depends on
-the next token. Of course, if the next token is `)', we must reduce;
-shifting is invalid because no single rule can reduce the token
-sequence `- 2 )' or anything starting with that. But if the next token
-is `*' or `<', we have a choice: either shifting or reduction would
-allow the parse to complete, but with different results.
-
-   To decide which one Bison should do, we must consider the results.
-If the next operator token OP is shifted, then it must be reduced first
-in order to permit another opportunity to reduce the difference. The
-result is (in effect) `1 - (2 OP 3)'. On the other hand, if the
-subtraction is reduced before shifting OP, the result is
-`(1 - 2) OP 3'. Clearly, then, the choice of shift or reduce should
-depend on the relative precedence of the operators `-' and OP: `*'
-should be shifted first, but not `<'.
-
-   What about input such as `1 - 2 - 5'; should this be `(1 - 2) - 5'
-or should it be `1 - (2 - 5)'? For most operators we prefer the
-former, which is called "left association". The latter alternative,
-"right association", is desirable for assignment operators. The choice
-of left or right association is a matter of whether the parser chooses
-to shift or reduce when the stack contains `1 - 2' and the look-ahead
-token is `-': shifting makes right-associativity.
-