Commit | Line | Data |
---|---|---|
1e24cc5b | 1 | This is bison.info, produced by makeinfo version 4.0 from bison.texinfo. |
705db0b5 AD |
2 | |
3 | START-INFO-DIR-ENTRY | |
4 | * bison: (bison). GNU Project parser generator (yacc replacement). | |
5 | END-INFO-DIR-ENTRY | |
6 | ||
7 | This file documents the Bison parser generator. | |
8 | ||
9 | Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999, | |
10 | 2000 Free Software Foundation, Inc. | |
11 | ||
12 | Permission is granted to make and distribute verbatim copies of this | |
13 | manual provided the copyright notice and this permission notice are | |
14 | preserved on all copies. | |
15 | ||
16 | Permission is granted to copy and distribute modified versions of | |
17 | this manual under the conditions for verbatim copying, provided also | |
18 | that the sections entitled "GNU General Public License" and "Conditions | |
19 | for Using Bison" are included exactly as in the original, and provided | |
20 | that the entire resulting derived work is distributed under the terms | |
21 | of a permission notice identical to this one. | |
22 | ||
23 | Permission is granted to copy and distribute translations of this | |
24 | manual into another language, under the above conditions for modified | |
25 | versions, except that the sections entitled "GNU General Public | |
26 | License", "Conditions for Using Bison" and this permission notice may be | |
27 | included in translations approved by the Free Software Foundation | |
28 | instead of in the original English. | |
29 | ||
6deb4447 AD |
30 | \1f |
31 | File: bison.info, Node: Using Precedence, Next: Precedence Examples, Prev: Why Precedence, Up: Precedence | |
32 | ||
33 | Specifying Operator Precedence | |
34 | ------------------------------ | |
35 | ||
36 | Bison allows you to specify these choices with the operator | |
37 | precedence declarations `%left' and `%right'. Each such declaration | |
38 | contains a list of tokens, which are operators whose precedence and | |
39 | associativity are being declared. The `%left' declaration makes all | |
40 | those operators left-associative and the `%right' declaration makes | |
41 | them right-associative. A third alternative is `%nonassoc', which | |
42 | declares that it is a syntax error to find the same operator twice "in a | |
43 | row". | |
44 | ||
45 | The relative precedence of different operators is controlled by the | |
46 | order in which they are declared. The first `%left' or `%right' | |
47 | declaration in the file declares the operators whose precedence is | |
48 | lowest, the next such declaration declares the operators whose | |
49 | precedence is a little higher, and so on. | |
50 | ||
51 | \1f | |
52 | File: bison.info, Node: Precedence Examples, Next: How Precedence, Prev: Using Precedence, Up: Precedence | |
53 | ||
54 | Precedence Examples | |
55 | ------------------- | |
56 | ||
57 | In our example, we would want the following declarations: | |
58 | ||
59 | %left '<' | |
60 | %left '-' | |
61 | %left '*' | |
62 | ||
63 | In a more complete example, which supports other operators as well, | |
64 | we would declare them in groups of equal precedence. For example, | |
65 | `'+'' is declared with `'-'': | |
66 | ||
67 | %left '<' '>' '=' NE LE GE | |
68 | %left '+' '-' | |
69 | %left '*' '/' | |
70 | ||
71 | (Here `NE' and so on stand for the operators for "not equal" and so on. | |
72 | We assume that these tokens are more than one character long and | |
73 | therefore are represented by names, not character literals.) | |
74 | ||
705db0b5 AD |
75 | \1f |
76 | File: bison.info, Node: How Precedence, Prev: Precedence Examples, Up: Precedence | |
77 | ||
78 | How Precedence Works | |
79 | -------------------- | |
80 | ||
81 | The first effect of the precedence declarations is to assign | |
82 | precedence levels to the terminal symbols declared. The second effect | |
83 | is to assign precedence levels to certain rules: each rule gets its | |
84 | precedence from the last terminal symbol mentioned in the components. | |
85 | (You can also specify explicitly the precedence of a rule. *Note | |
86 | Context-Dependent Precedence: Contextual Precedence.) | |
87 | ||
88 | Finally, the resolution of conflicts works by comparing the | |
89 | precedence of the rule being considered with that of the look-ahead | |
90 | token. If the token's precedence is higher, the choice is to shift. | |
91 | If the rule's precedence is higher, the choice is to reduce. If they | |
92 | have equal precedence, the choice is made based on the associativity of | |
93 | that precedence level. The verbose output file made by `-v' (*note | |
94 | Invoking Bison: Invocation.) says how each conflict was resolved. | |
95 | ||
96 | Not all rules and not all tokens have precedence. If either the | |
97 | rule or the look-ahead token has no precedence, then the default is to | |
98 | shift. | |
99 | ||
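
The comparison described above can be sketched in C. This is only an illustration of the decision rule, not Bison's own code: Bison applies the rule while it generates the parser tables, and the names `resolve_conflict', `enum assoc' and `enum decision', as well as the convention that precedence 0 means "no precedence", are assumptions made for this sketch.

```c
#include <assert.h>

enum assoc { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONE };
enum decision { DO_SHIFT, DO_REDUCE, DO_ERROR };

/* Hypothetical helper.  A precedence of 0 means "none declared";
   larger numbers mean higher precedence.  `level_assoc' is the
   associativity of the shared level and is consulted only when
   the two precedences are equal.  */
enum decision
resolve_conflict (int rule_prec, int token_prec, enum assoc level_assoc)
{
  assert (rule_prec >= 0 && token_prec >= 0);

  /* Either side lacks a precedence: the default is to shift.  */
  if (rule_prec == 0 || token_prec == 0)
    return DO_SHIFT;

  if (token_prec > rule_prec)
    return DO_SHIFT;
  if (rule_prec > token_prec)
    return DO_REDUCE;

  /* Equal precedence: associativity decides.  */
  switch (level_assoc)
    {
    case ASSOC_LEFT:
      return DO_REDUCE;         /* `a - b - c' groups as `(a - b) - c' */
    case ASSOC_RIGHT:
      return DO_SHIFT;          /* `a = b = c' groups as `a = (b = c)' */
    default:
      return DO_ERROR;          /* `%nonassoc': a syntax error */
    }
}
```

With the declarations from the earlier example, the rule for `exp '-' exp' (level 2) facing a look-ahead `'*'' (level 3) gives a shift, while the same rule facing another `'-'' gives a reduce because the level is left-associative.
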
100 | \1f | |
101 | File: bison.info, Node: Contextual Precedence, Next: Parser States, Prev: Precedence, Up: Algorithm | |
102 | ||
103 | Context-Dependent Precedence | |
104 | ============================ | |
105 | ||
106 | Often the precedence of an operator depends on the context. This | |
107 | sounds outlandish at first, but it is really very common. For example, | |
108 | a minus sign typically has a very high precedence as a unary operator, | |
109 | and a somewhat lower precedence (lower than multiplication) as a binary | |
110 | operator. | |
111 | ||
112 | The Bison precedence declarations, `%left', `%right' and | |
113 | `%nonassoc', can only be used once for a given token; so a token has | |
114 | only one precedence declared in this way. For context-dependent | |
115 | precedence, you need to use an additional mechanism: the `%prec' | |
116 | modifier for rules. | |
117 | ||
118 | The `%prec' modifier declares the precedence of a particular rule by | |
119 | specifying a terminal symbol whose precedence should be used for that | |
120 | rule. It's not necessary for that symbol to appear otherwise in the | |
121 | rule. The modifier's syntax is: | |
122 | ||
123 | %prec TERMINAL-SYMBOL | |
124 | ||
125 | and it is written after the components of the rule. Its effect is to | |
126 | assign the rule the precedence of TERMINAL-SYMBOL, overriding the | |
127 | precedence that would be deduced for it in the ordinary way. The | |
128 | altered rule precedence then affects how conflicts involving that rule | |
129 | are resolved (*note Operator Precedence: Precedence.). | |
130 | ||
131 | Here is how `%prec' solves the problem of unary minus. First, | |
132 | declare a precedence for a fictitious terminal symbol named `UMINUS'. | |
133 | There are no tokens of this type, but the symbol serves to stand for its | |
134 | precedence: | |
135 | ||
136 | ... | |
137 | %left '+' '-' | |
138 | %left '*' | |
139 | %left UMINUS | |
140 | ||
141 | Now the precedence of `UMINUS' can be used in specific rules: | |
142 | ||
143 | exp: ... | |
144 | | exp '-' exp | |
145 | ... | |
146 | | '-' exp %prec UMINUS | |
147 | ||
148 | \1f | |
149 | File: bison.info, Node: Parser States, Next: Reduce/Reduce, Prev: Contextual Precedence, Up: Algorithm | |
150 | ||
151 | Parser States | |
152 | ============= | |
153 | ||
154 | The function `yyparse' is implemented using a finite-state machine. | |
155 | The values pushed on the parser stack are not simply token type codes; | |
156 | they represent the entire sequence of terminal and nonterminal symbols | |
157 | at or near the top of the stack. The current state collects all the | |
158 | information about previous input which is relevant to deciding what to | |
159 | do next. | |
160 | ||
161 | Each time a look-ahead token is read, the current parser state | |
162 | together with the type of look-ahead token are looked up in a table. | |
163 | This table entry can say, "Shift the look-ahead token." In this case, | |
164 | it also specifies the new parser state, which is pushed onto the top of | |
165 | the parser stack. Or it can say, "Reduce using rule number N." This | |
166 | means that a certain number of tokens or groupings are taken off the | |
167 | top of the stack, and replaced by one grouping. In other words, that | |
168 | number of states are popped from the stack, and one new state is pushed. | |
169 | ||
170 | There is one other alternative: the table can say that the | |
171 | look-ahead token is erroneous in the current state. This causes error | |
172 | processing to begin (*note Error Recovery::). | |
173 | ||
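
The table lookup described above can be illustrated with a hand-built toy parser. The sketch below is not generated by Bison and does not use Bison's table layout; the grammar (`s: '(' s ')' | 'x'`), the tables and every name in it are assumptions chosen to keep the example small. It shows the three possible table entries: shift, reduce, and error.

```c
#include <assert.h>
#include <stddef.h>

enum token { T_LPAREN, T_RPAREN, T_X, T_END, T_BAD };

#define ACCEPT 99
#define STACK_MAX 64

/* action[state][token]: n > 0 shifts and goes to state n,
   n < 0 reduces by rule -n, 0 is an error, ACCEPT accepts.
   Rule 1 is  s: '(' s ')'  and rule 2 is  s: 'x'.  */
static const int action[6][4] = {
  /*        (   )   x   end */
  /* 0 */ { 2,  0,  3,  0 },
  /* 1 */ { 0,  0,  0,  ACCEPT },
  /* 2 */ { 2,  0,  3,  0 },
  /* 3 */ { 0, -2,  0, -2 },
  /* 4 */ { 0,  5,  0,  0 },
  /* 5 */ { 0, -1,  0, -1 }
};

/* State to enter after reducing to `s' with the given state on top.  */
static const int goto_s[6] = { 1, -1, 4, -1, -1, -1 };

/* Number of right-hand-side symbols of each rule (index 0 unused).  */
static const int rule_len[3] = { 0, 3, 1 };

static enum token
classify (char c)
{
  switch (c)
    {
    case '(':  return T_LPAREN;
    case ')':  return T_RPAREN;
    case 'x':  return T_X;
    case '\0': return T_END;
    default:   return T_BAD;
    }
}

/* Returns 0 on success, 1 on a syntax error, 2 on stack overflow.  */
int
toy_parse (const char *input)
{
  int stack[STACK_MAX];         /* a stack of states, not of tokens */
  int sp = 0;
  size_t pos = 0;

  assert (input != NULL);
  stack[0] = 0;
  for (;;)
    {
      enum token tok = classify (input[pos]);
      int act;

      if (tok == T_BAD)
        return 1;
      act = action[stack[sp]][tok];
      if (act == ACCEPT)
        return 0;
      if (act > 0)
        {
          /* Shift: push the new state and consume the token.  */
          if (sp + 1 >= STACK_MAX)
            return 2;
          stack[++sp] = act;
          pos++;
        }
      else if (act < 0)
        {
          /* Reduce: pop one state per component, push one new state.  */
          int next;
          sp -= rule_len[-act];
          next = goto_s[stack[sp]];
          stack[++sp] = next;
        }
      else
        return 1;               /* look-ahead is erroneous in this state */
    }
}
```

For the input `(x)' the state stack goes through 0; 0 2; 0 2 3; then rule 2 is reduced to give 0 2 4; the close-parenthesis is shifted to give 0 2 4 5; and rule 1 is reduced to give 0 1, where the end of input is accepted.
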
174 | \1f | |
175 | File: bison.info, Node: Reduce/Reduce, Next: Mystery Conflicts, Prev: Parser States, Up: Algorithm | |
176 | ||
177 | Reduce/Reduce Conflicts | |
178 | ======================= | |
179 | ||
180 | A reduce/reduce conflict occurs if there are two or more rules that | |
181 | apply to the same sequence of input. This usually indicates a serious | |
182 | error in the grammar. | |
183 | ||
184 | For example, here is an erroneous attempt to define a sequence of | |
185 | zero or more `word' groupings. | |
186 | ||
187 | sequence: /* empty */ | |
188 | { printf ("empty sequence\n"); } | |
189 | | maybeword | |
190 | | sequence word | |
191 | { printf ("added word %s\n", $2); } | |
192 | ; | |
193 | ||
194 | maybeword: /* empty */ | |
195 | { printf ("empty maybeword\n"); } | |
196 | | word | |
197 | { printf ("single word %s\n", $1); } | |
198 | ; | |
199 | ||
200 | The error is an ambiguity: there is more than one way to parse a single | |
201 | `word' into a `sequence'. It could be reduced to a `maybeword' and | |
202 | then into a `sequence' via the second rule. Alternatively, | |
203 | nothing-at-all could be reduced into a `sequence' via the first rule, | |
204 | and this could be combined with the `word' using the third rule for | |
205 | `sequence'. | |
206 | ||
207 | There is also more than one way to reduce nothing-at-all into a | |
208 | `sequence'. This can be done directly via the first rule, or | |
209 | indirectly via `maybeword' and then the second rule. | |
210 | ||
211 | You might think that this is a distinction without a difference, | |
212 | because it does not change whether any particular input is valid or | |
213 | not. But it does affect which actions are run. One parsing order runs | |
214 | the second rule's action; the other runs the first rule's action and | |
215 | the third rule's action. In this example, the output of the program | |
216 | changes. | |
217 | ||
218 | Bison resolves a reduce/reduce conflict by choosing to use the rule | |
219 | that appears first in the grammar, but it is very risky to rely on | |
220 | this. Every reduce/reduce conflict must be studied and usually | |
221 | eliminated. Here is the proper way to define `sequence': | |
222 | ||
223 | sequence: /* empty */ | |
224 | { printf ("empty sequence\n"); } | |
225 | | sequence word | |
226 | { printf ("added word %s\n", $2); } | |
227 | ; | |
228 | ||
229 | Here is another common error that yields a reduce/reduce conflict: | |
230 | ||
231 | sequence: /* empty */ | |
232 | | sequence words | |
233 | | sequence redirects | |
234 | ; | |
235 | ||
236 | words: /* empty */ | |
237 | | words word | |
238 | ; | |
239 | ||
240 | redirects: /* empty */ | |
241 | | redirects redirect | |
242 | ; | |
243 | ||
244 | The intention here is to define a sequence which can contain either | |
245 | `word' or `redirect' groupings. The individual definitions of | |
246 | `sequence', `words' and `redirects' are error-free, but the three | |
247 | together make a subtle ambiguity: even an empty input can be parsed in | |
248 | infinitely many ways! | |
249 | ||
250 | Consider: nothing-at-all could be a `words'. Or it could be two | |
251 | `words' in a row, or three, or any number. It could equally well be a | |
252 | `redirects', or two, or any number. Or it could be a `words' followed | |
253 | by three `redirects' and another `words'. And so on. | |
254 | ||
255 | Here are two ways to correct these rules. First, to make it a | |
256 | single level of sequence: | |
257 | ||
258 | sequence: /* empty */ | |
259 | | sequence word | |
260 | | sequence redirect | |
261 | ; | |
262 | ||
263 | Second, to prevent either a `words' or a `redirects' from being | |
264 | empty: | |
265 | ||
266 | sequence: /* empty */ | |
267 | | sequence words | |
268 | | sequence redirects | |
269 | ; | |
270 | ||
271 | words: word | |
272 | | words word | |
273 | ; | |
274 | ||
275 | redirects: redirect | |
276 | | redirects redirect | |
277 | ; | |
278 | ||
279 | \1f | |
280 | File: bison.info, Node: Mystery Conflicts, Next: Stack Overflow, Prev: Reduce/Reduce, Up: Algorithm | |
281 | ||
282 | Mysterious Reduce/Reduce Conflicts | |
283 | ================================== | |
284 | ||
285 | Sometimes reduce/reduce conflicts can occur that don't look | |
286 | warranted. Here is an example: | |
287 | ||
288 | %token ID | |
289 | ||
290 | %% | |
291 | def: param_spec return_spec ',' | |
292 | ; | |
293 | param_spec: | |
294 | type | |
295 | | name_list ':' type | |
296 | ; | |
297 | return_spec: | |
298 | type | |
299 | | name ':' type | |
300 | ; | |
301 | type: ID | |
302 | ; | |
303 | name: ID | |
304 | ; | |
305 | name_list: | |
306 | name | |
307 | | name ',' name_list | |
308 | ; | |
309 | ||
310 | It would seem that this grammar can be parsed with only a single | |
311 | token of look-ahead: when a `param_spec' is being read, an `ID' is a | |
312 | `name' if a comma or colon follows, or a `type' if another `ID' | |
313 | follows. In other words, this grammar is LR(1). | |
314 | ||
315 | However, Bison, like most parser generators, cannot actually handle | |
316 | all LR(1) grammars. In this grammar, two contexts, that after an `ID' | |
317 | at the beginning of a `param_spec' and likewise at the beginning of a | |
318 | `return_spec', are similar enough that Bison assumes they are the same. | |
319 | They appear similar because the same set of rules would be active--the | |
320 | rule for reducing to a `name' and that for reducing to a `type'. Bison | |
321 | is unable to determine at that stage of processing that the rules would | |
322 | require different look-ahead tokens in the two contexts, so it makes a | |
323 | single parser state for them both. Combining the two contexts causes a | |
324 | conflict later. In parser terminology, this occurrence means that the | |
325 | grammar is not LALR(1). | |
326 | ||
327 | In general, it is better to fix deficiencies than to document them. | |
328 | But this particular deficiency is intrinsically hard to fix; parser | |
329 | generators that can handle LR(1) grammars are hard to write and tend to | |
330 | produce parsers that are very large. In practice, Bison is more useful | |
331 | as it is now. | |
332 | ||
333 | When the problem arises, you can often fix it by identifying the two | |
334 | parser states that are being confused, and adding something to make them | |
335 | look distinct. In the above example, adding one rule to `return_spec' | |
336 | as follows makes the problem go away: | |
337 | ||
338 | %token BOGUS | |
339 | ... | |
340 | %% | |
341 | ... | |
342 | return_spec: | |
343 | type | |
344 | | name ':' type | |
345 | /* This rule is never used. */ | |
346 | | ID BOGUS | |
347 | ; | |
348 | ||
349 | This corrects the problem because it introduces the possibility of an | |
350 | additional active rule in the context after the `ID' at the beginning of | |
351 | `return_spec'. This rule is not active in the corresponding context in | |
352 | a `param_spec', so the two contexts receive distinct parser states. As | |
353 | long as the token `BOGUS' is never generated by `yylex', the added rule | |
354 | cannot alter the way actual input is parsed. | |
355 | ||
356 | In this particular example, there is another way to solve the | |
357 | problem: rewrite the rule for `return_spec' to use `ID' directly | |
358 | instead of via `name'. This also causes the two confusing contexts to | |
359 | have different sets of active rules, because the one for `return_spec' | |
360 | activates the altered rule for `return_spec' rather than the one for | |
361 | `name'. | |
362 | ||
363 | param_spec: | |
364 | type | |
365 | | name_list ':' type | |
366 | ; | |
367 | return_spec: | |
368 | type | |
369 | | ID ':' type | |
370 | ; | |
371 | ||
372 | \1f | |
373 | File: bison.info, Node: Stack Overflow, Prev: Mystery Conflicts, Up: Algorithm | |
374 | ||
375 | Stack Overflow, and How to Avoid It | |
376 | =================================== | |
377 | ||
378 | The Bison parser stack can overflow if too many tokens are shifted | |
379 | and not reduced. When this happens, the parser function `yyparse' | |
380 | returns a nonzero value, pausing only to call `yyerror' to report the | |
381 | overflow. | |
382 | ||
383 | By defining the macro `YYMAXDEPTH', you can control how deep the | |
384 | parser stack can become before a stack overflow occurs. Define the | |
385 | macro with a value that is an integer. This value is the maximum number | |
386 | of tokens that can be shifted (and not reduced) before overflow. It | |
387 | must be a constant expression whose value is known at compile time. | |
388 | ||
389 | The stack space allowed is not necessarily allocated. If you | |
390 | specify a large value for `YYMAXDEPTH', the parser actually allocates a | |
391 | small stack at first, and then makes it bigger by stages as needed. | |
392 | This increasing allocation happens automatically and silently. | |
393 | Therefore, you do not need to make `YYMAXDEPTH' painfully small merely | |
394 | to save space for ordinary inputs that do not need much stack. | |
395 | ||
396 | The default value of `YYMAXDEPTH', if you do not define it, is 10000. | |
397 | ||
398 | You can control how much stack is allocated initially by defining the | |
399 | macro `YYINITDEPTH'. This value too must be a compile-time constant | |
400 | integer. The default is 200. | |
401 | ||
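
The growth "by stages" described above might look roughly like the following C sketch. It is not the code of the Bison parser skeleton: the doubling policy, the structure and the names `state_stack' and `stack_push' are assumptions, and `INIT_DEPTH' and `MAX_DEPTH' merely mirror the default values of `YYINITDEPTH' and `YYMAXDEPTH' quoted above.

```c
#include <assert.h>
#include <stdlib.h>

#define INIT_DEPTH 200          /* same value as the YYINITDEPTH default */
#define MAX_DEPTH 10000         /* same value as the YYMAXDEPTH default */

struct state_stack
{
  int *items;
  size_t size;                  /* entries in use */
  size_t capacity;              /* entries allocated so far */
};

/* Push STATE, growing the stack if needed.  Returns 0 on success and
   1 on overflow (MAX_DEPTH reached) or allocation failure.  */
int
stack_push (struct state_stack *s, int state)
{
  assert (s != NULL);
  if (s->size == s->capacity)
    {
      size_t new_cap;
      int *p;

      if (s->capacity >= MAX_DEPTH)
        return 1;               /* stack overflow */

      /* Assumed policy: start small, then double up to the limit.  */
      new_cap = s->capacity ? s->capacity * 2 : INIT_DEPTH;
      if (new_cap > MAX_DEPTH)
        new_cap = MAX_DEPTH;

      p = realloc (s->items, new_cap * sizeof *p);
      if (p == NULL)
        return 1;
      s->items = p;
      s->capacity = new_cap;
    }
  s->items[s->size++] = state;
  return 0;
}
```

An input that never needs more than 200 entries never triggers a reallocation in this sketch, which is why a generous maximum costs nothing for ordinary inputs.
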
402 | \1f | |
403 | File: bison.info, Node: Error Recovery, Next: Context Dependency, Prev: Algorithm, Up: Top | |
404 | ||
405 | Error Recovery | |
406 | ************** | |
407 | ||
408 | It is not usually acceptable to have a program terminate on a parse | |
409 | error. For example, a compiler should recover sufficiently to parse the | |
410 | rest of the input file and check it for errors; a calculator should | |
411 | accept another expression. | |
412 | ||
413 | In a simple interactive command parser where each input is one line, | |
414 | it may be sufficient to allow `yyparse' to return 1 on error and have | |
415 | the caller ignore the rest of the input line when that happens (and | |
416 | then call `yyparse' again). But this is inadequate for a compiler, | |
417 | because it forgets all the syntactic context leading up to the error. | |
418 | A syntax error deep within a function in the compiler input should not | |
419 | cause the compiler to treat the following line like the beginning of a | |
420 | source file. | |
421 | ||
422 | You can define how to recover from a syntax error by writing rules to | |
423 | recognize the special token `error'. This is a terminal symbol that is | |
424 | always defined (you need not declare it) and reserved for error | |
425 | handling. The Bison parser generates an `error' token whenever a | |
426 | syntax error happens; if you have provided a rule to recognize this | |
427 | token in the current context, the parse can continue. | |
428 | ||
429 | For example: | |
430 | ||
431 | stmnts: /* empty string */ | |
432 | | stmnts '\n' | |
433 | | stmnts exp '\n' | |
434 | | stmnts error '\n' | |
435 | ||
436 | The fourth rule in this example says that an error followed by a | |
437 | newline makes a valid addition to any `stmnts'. | |
438 | ||
439 | What happens if a syntax error occurs in the middle of an `exp'? The | |
440 | error recovery rule, interpreted strictly, applies to the precise | |
441 | sequence of a `stmnts', an `error' and a newline. If an error occurs in | |
442 | the middle of an `exp', there will probably be some additional tokens | |
443 | and subexpressions on the stack after the last `stmnts', and there will | |
444 | be tokens to read before the next newline. So the rule is not | |
445 | applicable in the ordinary way. | |
446 | ||
447 | But Bison can force the situation to fit the rule, by discarding | |
448 | part of the semantic context and part of the input. First it discards | |
449 | states and objects from the stack until it gets back to a state in | |
450 | which the `error' token is acceptable. (This means that the | |
451 | subexpressions already parsed are discarded, back to the last complete | |
452 | `stmnts'.) At this point the `error' token can be shifted. Then, if | |
453 | the old look-ahead token is not acceptable to be shifted next, the | |
454 | parser reads tokens and discards them until it finds a token which is | |
455 | acceptable. In this example, Bison reads and discards input until the | |
456 | next newline so that the fourth rule can apply. | |
457 | ||
458 | The choice of error rules in the grammar is a choice of strategies | |
459 | for error recovery. A simple and useful strategy is simply to skip the | |
460 | rest of the current input line or current statement if an error is | |
461 | detected: | |
462 | ||
463 | stmnt: error ';' /* on error, skip until ';' is read */ | |
464 | ||
465 | It is also useful to recover to the matching close-delimiter of an | |
466 | opening-delimiter that has already been parsed. Otherwise the | |
467 | close-delimiter will probably appear to be unmatched, and generate | |
468 | another, spurious error message: | |
469 | ||
470 | primary: '(' expr ')' | |
471 | | '(' error ')' | |
472 | ... | |
473 | ; | |
474 | ||
475 | Error recovery strategies are necessarily guesses. When they guess | |
476 | wrong, one syntax error often leads to another. In the above example, | |
477 | the error recovery rule guesses that an error is due to bad input | |
478 | within one `stmnt'. Suppose that instead a spurious semicolon is | |
479 | inserted in the middle of a valid `stmnt'. After the error recovery | |
480 | rule recovers from the first error, another syntax error will be found | |
481 | straightaway, since the text following the spurious semicolon is also | |
482 | an invalid `stmnt'. | |
483 | ||
484 | To prevent an outpouring of error messages, the parser will output | |
485 | no error message for another syntax error that happens shortly after | |
486 | the first; only after three consecutive input tokens have been | |
487 | successfully shifted will error messages resume. | |
488 | ||
489 | Note that rules which accept the `error' token may have actions, just | |
490 | as any other rules can. | |
491 | ||
492 | You can make error messages resume immediately by using the macro | |
493 | `yyerrok' in an action. If you do this in the error rule's action, no | |
494 | error messages will be suppressed. This macro requires no arguments; | |
495 | `yyerrok;' is a valid C statement. | |
496 | ||
497 | The previous look-ahead token is reanalyzed immediately after an | |
498 | error. If this is unacceptable, then the macro `yyclearin' may be used | |
499 | to clear this token. Write the statement `yyclearin;' in the error | |
500 | rule's action. | |
501 | ||
502 | For example, suppose that on a parse error, an error handling | |
503 | routine is called that advances the input stream to some point where | |
504 | parsing should once again commence. The next symbol returned by the | |
505 | lexical scanner is probably correct. The previous look-ahead token | |
506 | ought to be discarded with `yyclearin;'. | |
507 | ||
508 | The macro `YYRECOVERING' stands for an expression that has the value | |
509 | 1 when the parser is recovering from a syntax error, and 0 the rest of | |
510 | the time. A value of 1 indicates that error messages are currently | |
511 | suppressed for new syntax errors. | |
512 | ||
513 | \1f | |
514 | File: bison.info, Node: Context Dependency, Next: Debugging, Prev: Error Recovery, Up: Top | |
515 | ||
516 | Handling Context Dependencies | |
517 | ***************************** | |
518 | ||
519 | The Bison paradigm is to parse tokens first, then group them into | |
520 | larger syntactic units. In many languages, the meaning of a token is | |
521 | affected by its context. Although this violates the Bison paradigm, | |
522 | certain techniques (known as "kludges") may enable you to write Bison | |
523 | parsers for such languages. | |
524 | ||
525 | * Menu: | |
526 | ||
527 | * Semantic Tokens:: Token parsing can depend on the semantic context. | |
528 | * Lexical Tie-ins:: Token parsing can depend on the syntactic context. | |
529 | * Tie-in Recovery:: Lexical tie-ins have implications for how | |
530 | error recovery rules must be written. | |
531 | ||
532 | (Actually, "kludge" means any technique that gets its job done but is | |
533 | neither clean nor robust.) | |
534 | ||
535 | \1f | |
536 | File: bison.info, Node: Semantic Tokens, Next: Lexical Tie-ins, Up: Context Dependency | |
537 | ||
538 | Semantic Info in Token Types | |
539 | ============================ | |
540 | ||
541 | The C language has a context dependency: the way an identifier is | |
542 | used depends on what its current meaning is. For example, consider | |
543 | this: | |
544 | ||
545 | foo (x); | |
546 | ||
547 | This looks like a function call statement, but if `foo' is a typedef | |
548 | name, then this is actually a declaration of `x'. How can a Bison | |
549 | parser for C decide how to parse this input? | |
550 | ||
551 | The method used in GNU C is to have two different token types, | |
552 | `IDENTIFIER' and `TYPENAME'. When `yylex' finds an identifier, it | |
553 | looks up the current declaration of the identifier in order to decide | |
554 | which token type to return: `TYPENAME' if the identifier is declared as | |
555 | a typedef, `IDENTIFIER' otherwise. | |
556 | ||
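
The lookup that `yylex' performs can be sketched as follows. This is not the code used in GNU C: the fixed-size table, the token code values and the names `declare_typedef' and `classify_identifier' are all assumptions made for illustration. In a real parser the token codes come from the Bison-generated definitions, and the table would be the compiler's symbol table.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes; a real parser takes them from Bison.  */
enum { IDENTIFIER = 258, TYPENAME = 259 };

#define MAX_TYPEDEFS 32

/* Names currently declared as typedefs.  Only the pointers are
   stored, so each string must outlive its table entry.  */
static const char *typedef_names[MAX_TYPEDEFS];
static int n_typedefs;

/* Record NAME as a typedef.  Returns 0 on success, 1 if the table
   is full.  */
int
declare_typedef (const char *name)
{
  assert (name != NULL);
  if (n_typedefs == MAX_TYPEDEFS)
    return 1;
  typedef_names[n_typedefs++] = name;
  return 0;
}

/* Decide which token type the lexer should return for NAME.  */
int
classify_identifier (const char *name)
{
  int i;

  assert (name != NULL);
  for (i = 0; i < n_typedefs; i++)
    if (strcmp (typedef_names[i], name) == 0)
      return TYPENAME;
  return IDENTIFIER;
}
```

The action for a typedef declaration would call something like `declare_typedef', so that later occurrences of the same name reach the parser as `TYPENAME'.
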
557 | The grammar rules can then express the context dependency by the | |
558 | choice of token type to recognize. `IDENTIFIER' is accepted as an | |
559 | expression, but `TYPENAME' is not. `TYPENAME' can start a declaration, | |
560 | but `IDENTIFIER' cannot. In contexts where the meaning of the | |
561 | identifier is _not_ significant, such as in declarations that can | |
562 | shadow a typedef name, either `TYPENAME' or `IDENTIFIER' is | |
563 | accepted--there is one rule for each of the two token types. | |
564 | ||
565 | This technique is simple to use if the decision of which kinds of | |
566 | identifiers to allow is made at a place close to where the identifier is | |
567 | parsed. But in C this is not always so: C allows a declaration to | |
568 | redeclare a typedef name provided an explicit type has been specified | |
569 | earlier: | |
570 | ||
571 | typedef int foo, bar, lose; | |
572 | static foo (bar); /* redeclare `bar' as static variable */ | |
573 | static int foo (lose); /* redeclare `foo' as function */ | |
574 | ||
575 | Unfortunately, the name being declared is separated from the | |
576 | declaration construct itself by a complicated syntactic structure--the | |
577 | "declarator". | |
578 | ||
579 | As a result, part of the Bison parser for C needs to be duplicated, | |
580 | with all the nonterminal names changed: once for parsing a declaration | |
581 | in which a typedef name can be redefined, and once for parsing a | |
582 | declaration in which that can't be done. Here is a part of the | |
583 | duplication, with actions omitted for brevity: | |
584 | ||
585 | initdcl: | |
586 | declarator maybeasm '=' | |
587 | init | |
588 | | declarator maybeasm | |
589 | ; | |
590 | ||
591 | notype_initdcl: | |
592 | notype_declarator maybeasm '=' | |
593 | init | |
594 | | notype_declarator maybeasm | |
595 | ; | |
596 | ||
597 | Here `initdcl' can redeclare a typedef name, but `notype_initdcl' | |
598 | cannot. The distinction between `declarator' and `notype_declarator' | |
599 | is the same sort of thing. | |
600 | ||
601 | There is some similarity between this technique and a lexical tie-in | |
602 | (described next), in that information which alters the lexical analysis | |
603 | is changed during parsing by other parts of the program. The | |
604 | difference is that here the information is global, and is used for other | |
605 | purposes in the program. A true lexical tie-in has a special-purpose | |
606 | flag controlled by the syntactic context. | |
607 | ||
608 | \1f | |
609 | File: bison.info, Node: Lexical Tie-ins, Next: Tie-in Recovery, Prev: Semantic Tokens, Up: Context Dependency | |
610 | ||
611 | Lexical Tie-ins | |
612 | =============== | |
613 | ||
614 | One way to handle context-dependency is the "lexical tie-in": a flag | |
615 | which is set by Bison actions, whose purpose is to alter the way tokens | |
616 | are parsed. | |
617 | ||
618 | For example, suppose we have a language vaguely like C, but with a | |
619 | special construct `hex (HEX-EXPR)'. After the keyword `hex' comes an | |
620 | expression in parentheses in which all integers are hexadecimal. In | |
621 | particular, the token `a1b' must be treated as an integer rather than | |
622 | as an identifier if it appears in that context. Here is how you can do | |
623 | it: | |
624 | ||
625 | %{ | |
626 | int hexflag; | |
627 | %} | |
628 | %% | |
629 | ... | |
630 | expr: IDENTIFIER | |
631 | | constant | |
632 | | HEX '(' | |
633 | { hexflag = 1; } | |
634 | expr ')' | |
635 | { hexflag = 0; | |
636 | $$ = $4; } | |
637 | | expr '+' expr | |
638 | { $$ = make_sum ($1, $3); } | |
639 | ... | |
640 | ; | |
641 | ||
642 | constant: | |
643 | INTEGER | |
644 | | STRING | |
645 | ; | |
646 | ||
647 | Here we assume that `yylex' looks at the value of `hexflag'; when it is | |
648 | nonzero, all integers are parsed in hexadecimal, and tokens starting | |
649 | with letters are parsed as integers if possible. | |
650 | ||
651 | The declaration of `hexflag' shown in the C declarations section of | |
652 | the parser file is needed to make it accessible to the actions (*note | |
653 | The C Declarations Section: C Declarations.). You must also write the | |
654 | code in `yylex' to obey the flag. | |
655 | ||
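
The part of `yylex' that obeys the flag could be written along these lines. This is a sketch under stated assumptions rather than a complete lexer: the function `classify_word', the token code values and the `TOK_BAD' result are invented for the example, and the flag is passed as a parameter here instead of being read from the global `hexflag' so that the function stands on its own.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token codes; a real parser takes them from Bison.  */
enum { TOK_BAD = 0, IDENTIFIER = 258, INTEGER = 259 };

/* Classify the word TEXT, already isolated by the lexer.  When
   HEXFLAG is nonzero, integers are read in hexadecimal, and a word
   that starts with a letter is an integer if it consists entirely
   of hexadecimal digits.  On INTEGER, the value is stored in *VALUE.  */
int
classify_word (const char *text, int hexflag, long *value)
{
  char *end;
  long n;

  assert (text != NULL && text[0] != '\0' && value != NULL);

  if (hexflag)
    {
      n = strtol (text, &end, 16);
      if (end != text && *end == '\0')
        {
          *value = n;
          return INTEGER;
        }
    }

  if (isalpha ((unsigned char) text[0]) || text[0] == '_')
    return IDENTIFIER;

  n = strtol (text, &end, 10);
  if (end != text && *end == '\0')
    {
      *value = n;
      return INTEGER;
    }
  return TOK_BAD;               /* e.g. `12ab' outside a hex construct */
}
```

With the flag set, `a1b' is returned as the integer 0xa1b; with the flag clear, the same text is an identifier.
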
656 | \1f | |
657 | File: bison.info, Node: Tie-in Recovery, Prev: Lexical Tie-ins, Up: Context Dependency | |
658 | ||
659 | Lexical Tie-ins and Error Recovery | |
660 | ================================== | |
661 | ||
662 | Lexical tie-ins make strict demands on any error recovery rules you | |
663 | have. *Note Error Recovery::. | |
664 | ||
665 | The reason for this is that the purpose of an error recovery rule is | |
666 | to abort the parsing of one construct and resume in some larger | |
667 | construct. For example, in C-like languages, a typical error recovery | |
668 | rule is to skip tokens until the next semicolon, and then start a new | |
669 | statement, like this: | |
670 | ||
671 | stmt: expr ';' | |
672 | | IF '(' expr ')' stmt { ... } | |
673 | ... | |
674 | | error ';' | |
675 | { hexflag = 0; } | |
676 | ; | |
677 | ||
678 | If there is a syntax error in the middle of a `hex (EXPR)' | |
679 | construct, this error rule will apply, and then the action for the | |
680 | completed `hex (EXPR)' will never run. So `hexflag' would remain set | |
681 | for the entire rest of the input, or until the next `hex' keyword, | |
682 | causing identifiers to be misinterpreted as integers. | |
683 | ||
684 | To avoid this problem the error recovery rule itself clears | |
685 | `hexflag'. | |
686 | ||
687 | There may also be an error recovery rule that works within | |
688 | expressions. For example, there could be a rule which applies within | |
689 | parentheses and skips to the close-parenthesis: | |
690 | ||
691 | expr: ... | |
692 | | '(' expr ')' | |
693 | { $$ = $2; } | |
694 | | '(' error ')' | |
695 | ... | |
696 | ||
697 | If this rule acts within the `hex' construct, it is not going to | |
698 | abort that construct (since it applies to an inner level of parentheses | |
699 | within the construct). Therefore, it should not clear the flag: the | |
700 | rest of the `hex' construct should be parsed with the flag still in | |
701 | effect. | |
702 | ||
703 | What if there is an error recovery rule which might abort out of the | |
704 | `hex' construct or might not, depending on circumstances? There is no | |
705 | way you can write the action to determine whether a `hex' construct is | |
706 | being aborted or not. So if you are using a lexical tie-in, you had | |
707 | better make sure your error recovery rules are not of this kind. Each | |
708 | rule must be such that you can be sure that it always will, or always | |
709 | won't, have to clear the flag. | |
710 | ||
711 | \1f | |
712 | File: bison.info, Node: Debugging, Next: Invocation, Prev: Context Dependency, Up: Top | |
713 | ||
714 | Debugging Your Parser | |
715 | ********************* | |
716 | ||
717 | If a Bison grammar compiles properly but doesn't do what you want | |
718 | when it runs, the `yydebug' parser-trace feature can help you figure | |
719 | out why. | |
720 | ||
721 | To enable compilation of trace facilities, you must define the macro | |
722 | `YYDEBUG' when you compile the parser. You could use `-DYYDEBUG=1' as | |
723 | a compiler option or you could put `#define YYDEBUG 1' in the C | |
724 | declarations section of the grammar file (*note The C Declarations | |
725 | Section: C Declarations.). Alternatively, use the `-t' option when you | |
726 | run Bison (*note Invoking Bison: Invocation.). We suggest that you always | |
727 | define `YYDEBUG' so that debugging is always possible. | |
728 | ||
729 | The trace facility uses `stderr', so you must add | |
730 | `#include <stdio.h>' to the C declarations section unless it is already | |
731 | there. | |
732 | ||
733 | Once you have compiled the program with trace facilities, the way to | |
734 | request a trace is to store a nonzero value in the variable `yydebug'. | |
735 | You can do this by making the C code do it (in `main', perhaps), or you | |
736 | can alter the value with a C debugger. | |
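
One way to have the C code store a nonzero value in `yydebug' is sketched below. It assumes a hypothetical command-line option `--trace' and a hypothetical helper `enable_trace_from_args'; neither is provided by Bison. The definition of `yydebug' is a stand-in so that the fragment compiles on its own; in a real program the variable is defined by the parser file and would be declared with `extern int yydebug;'.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the variable defined by the Bison-generated parser
   file when trace facilities are compiled in.  */
int yydebug = 0;

/* Turn tracing on if the (hypothetical) option `--trace' appears
   among the command-line arguments.  Returns the resulting value
   of `yydebug'.  */
int
enable_trace_from_args (int argc, char **argv)
{
  int i;

  assert (argv != NULL);
  for (i = 1; i < argc; i++)
    if (strcmp (argv[i], "--trace") == 0)
      yydebug = 1;
  return yydebug;
}
```

`main' would call such a function before calling `yyparse'.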
737 | ||
738 | Each step taken by the parser when `yydebug' is nonzero produces a | |
739 | line or two of trace information, written on `stderr'. The trace | |
740 | messages tell you these things: | |
741 | ||
742 | * Each time the parser calls `yylex', what kind of token was read. | |
743 | ||
744 | * Each time a token is shifted, the depth and complete contents of | |
745 | the state stack (*note Parser States::). | |
746 | ||
747 | * Each time a rule is reduced, which rule it is, and the complete | |
748 | contents of the state stack afterward. | |
749 | ||
750 | To make sense of this information, it helps to refer to the listing | |
751 | file produced by the Bison `-v' option (*note Invoking Bison: | |
752 | Invocation.). This file shows the meaning of each state in terms of | |
753 | positions in various rules, and also what each state will do with each | |
754 | possible input token. As you read the successive trace messages, you | |
755 | can see that the parser is functioning according to its specification | |
756 | in the listing file. Eventually you will arrive at the place where | |
757 | something undesirable happens, and you will see which parts of the | |
758 | grammar are to blame. | |
759 | ||
760 | The parser file is a C program and you can use C debuggers on it, | |
761 | but it's not easy to interpret what it is doing. The parser function | |
762 | is a finite-state machine interpreter, and aside from the actions it | |
763 | executes the same code over and over. Only the values of variables | |
764 | show where in the grammar it is working. | |
765 | ||
766 | The debugging information normally gives the token type of each token | |
767 | read, but not its semantic value. You can optionally define a macro | |
768 | named `YYPRINT' to provide a way to print the value. If you define | |
769 | `YYPRINT', it should take three arguments. The parser will pass a | |
770 | standard I/O stream, the numeric code for the token type, and the token | |
771 | value (from `yylval'). | |
772 | ||
773 | Here is an example of `YYPRINT' suitable for the multi-function | |
774 | calculator (*note Declarations for `mfcalc': Mfcalc Decl.): | |
775 | ||
776 | #define YYPRINT(file, type, value) yyprint (file, type, value) | |
777 | ||
778 | static void | |
779 | yyprint (FILE *file, int type, YYSTYPE value) | |
780 | { | |
781 | if (type == VAR) | |
782 | fprintf (file, " %s", value.tptr->name); | |
783 | else if (type == NUM) | |
784 | fprintf (file, " %g", value.val); | |
785 | } | |
786 | ||
787 | \1f | |
788 | File: bison.info, Node: Invocation, Next: Table of Symbols, Prev: Debugging, Up: Top | |
789 | ||
790 | Invoking Bison | |
791 | ************** | |
792 | ||
793 | The usual way to invoke Bison is as follows: | |
794 | ||
795 | bison INFILE | |
796 | ||
797 | Here INFILE is the grammar file name, which usually ends in `.y'. | |
798 | The parser file's name is made by replacing the `.y' with `.tab.c'. | |
799 | Thus, the command `bison foo.y' yields `foo.tab.c', and the command | |
800 | `bison hack/foo.y' yields `hack/foo.tab.c'. | |
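
The naming rule just described can be sketched in C as follows. The function `parser_file_name' is hypothetical and only mirrors the rule stated here; it is not how Bison itself computes the name. As an assumption of this sketch, a grammar file name that does not end in `.y' simply has `.tab.c' appended.

```c
#include <assert.h>
#include <string.h>

/* Write into OUT the parser file name derived from INFILE by
   replacing a trailing `.y' with `.tab.c'.  Returns 0 on success,
   or -1 if OUT (of size OUTSIZE) is too small.  */
int
parser_file_name (const char *infile, char *out, size_t outsize)
{
  size_t len, stem;

  assert (infile != NULL && out != NULL);
  len = strlen (infile);
  stem = len;
  if (len >= 2 && strcmp (infile + len - 2, ".y") == 0)
    stem = len - 2;               /* drop the `.y' suffix */
  if (stem + strlen (".tab.c") + 1 > outsize)
    return -1;
  memcpy (out, infile, stem);
  strcpy (out + stem, ".tab.c");
  return 0;
}
```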
801 | ||
802 | * Menu: | |
803 | ||
804 | * Bison Options:: All the options described in detail, | |
805 | in alphabetical order by short options. | |
806 | * Environment Variables:: Variables which affect Bison execution. | |
807 | * Option Cross Key:: Alphabetical list of long options. | |
808 | * VMS Invocation:: Bison command syntax on VMS. | |
809 | ||
810 | \1f | |
811 | File: bison.info, Node: Bison Options, Next: Environment Variables, Up: Invocation | |
812 | ||
813 | Bison Options | |
814 | ============= | |
815 | ||
816 | Bison supports both traditional single-letter options and mnemonic | |
817 | long option names. Long option names are indicated with `--' instead of | |
818 | `-'. Abbreviations for option names are allowed as long as they are | |
819 | unique. When a long option takes an argument, like `--file-prefix', | |
820 | connect the option name and the argument with `='. | |
821 | ||
822 | Here is a list of options that can be used with Bison, alphabetized | |
823 | by short option. It is followed by a cross key alphabetized by long | |
824 | option. | |
825 | ||
826 | Operation modes: | |
827 | `-h' | |
828 | `--help' | |
829 | Print a summary of the command-line options to Bison and exit. | |
830 | ||
831 | `-V' | |
832 | `--version' | |
833 | Print the version number of Bison and exit. | |
834 | ||
835 | `-y' | |
836 | `--yacc' | |
837 | `--fixed-output-files' | |
838 | Equivalent to `-o y.tab.c'; the parser output file is called | |
839 | `y.tab.c', and the other outputs are called `y.output' and | |
840 | `y.tab.h'. The purpose of this option is to imitate Yacc's output | |
841 | file name conventions. Thus, the following shell script can | |
842 | substitute for Yacc: | |
843 | ||
844 | bison -y "$@" | |
845 | ||
846 | Tuning the parser: | |
847 | ||
cd5bd6ac AD |
848 | `-S FILE' |
849 | `--skeleton=FILE' | |
850 | Specify the skeleton to use. You probably don't need this option | |
851 | unless you are developing Bison. | |
852 | ||
705db0b5 AD |
853 | `-t' |
854 | `--debug' | |
6deb4447 AD |
855 | Output a definition of the macro `YYDEBUG' into the parser file, so |
856 | that the debugging facilities are compiled. *Note Debugging Your | |
857 | Parser: Debugging. | |
705db0b5 AD |
858 | |
859 | `--locations' | |
860 | Pretend that `%locations' was specified. *Note Decl Summary::. | |
861 | ||
862 | `-p PREFIX' | |
863 | `--name-prefix=PREFIX' | |
864 | Rename the external symbols used in the parser so that they start | |
865 | with PREFIX instead of `yy'. The precise list of symbols renamed | |
866 | is `yyparse', `yylex', `yyerror', `yynerrs', `yylval', `yychar' | |
867 | and `yydebug'. | |
868 | ||
869 | For example, if you use `-p c', the names become `cparse', `clex', | |
870 | and so on. | |
871 | ||
872 | *Note Multiple Parsers in the Same Program: Multiple Parsers. | |
873 | ||
874 | `-l' | |
875 | `--no-lines' | |
876 | Don't put any `#line' preprocessor commands in the parser file. | |
877 | Ordinarily Bison puts them in the parser file so that the C | |
878 | compiler and debuggers will associate errors with your source | |
879 | file, the grammar file. This option causes them to associate | |
880 | errors with the parser file, treating it as an independent source | |
881 | file in its own right. | |
882 | ||
883 | `-n' | |
884 | `--no-parser' | |
6deb4447 | 885 | Pretend that `%no_parser' was specified. *Note Decl Summary::. |
705db0b5 AD |
886 | |
887 | `-r' | |
888 | `--raw' | |
889 | Pretend that `%raw' was specified. *Note Decl Summary::. | |
890 | ||
891 | `-k' | |
892 | `--token-table' | |
893 | Pretend that `%token_table' was specified. *Note Decl Summary::. | |
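
The renaming requested with `-p PREFIX' replaces the leading `yy' of each listed symbol with PREFIX. The C sketch below illustrates only that naming rule, with a hypothetical helper `rename_symbol'; it does not show how Bison carries out the renaming.

```c
#include <assert.h>
#include <string.h>

/* Write into OUT the name SYMBOL would have under `-p PREFIX',
   i.e. with its leading `yy' replaced by PREFIX.  Returns 0 on
   success, or -1 if SYMBOL does not start with `yy' or OUT (of
   size OUTSIZE) is too small.  */
int
rename_symbol (const char *prefix, const char *symbol,
               char *out, size_t outsize)
{
  assert (prefix != NULL && symbol != NULL && out != NULL);
  if (strncmp (symbol, "yy", 2) != 0)
    return -1;
  if (strlen (prefix) + strlen (symbol + 2) + 1 > outsize)
    return -1;
  strcpy (out, prefix);
  strcat (out, symbol + 2);
  return 0;
}
```

With the prefix `c', this maps `yyparse' to `cparse' and `yylex' to `clex', matching the example given for `-p c'.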
894 | ||
895 | Adjust the output: | |
896 | ||
897 | `-d' | |
898 | `--defines' | |
6deb4447 AD |
899 | Pretend that `%defines' was specified, i.e., write an extra output |
900 | file containing macro definitions for the token type names defined | |
901 | in the grammar and the semantic value type `YYSTYPE', as well as a | |
902 | few `extern' variable declarations. *Note Decl Summary::. | |
705db0b5 AD |
903 | |
904 | `-b PREFIX' | |
905 | `--file-prefix=PREFIX' | |
906 | Specify a prefix to use for all Bison output file names. The | |
907 | names are chosen as if the input file were named `PREFIX.y'. | |
908 | ||
909 | `-v' | |
910 | `--verbose' | |
6deb4447 AD |
911 | Pretend that `%verbose' was specified, i.e., write an extra output |
912 | file containing verbose descriptions of the grammar and parser. | |
913 | *Note Decl Summary::, for more. | |
705db0b5 AD |
914 | |
915 | `-o OUTFILE' | |
916 | `--output-file=OUTFILE' | |
917 | Specify the name OUTFILE for the parser file. | |
918 | ||
919 | The other output files' names are constructed from OUTFILE as | |
920 | described under the `-v' and `-d' options. | |
921 | ||
922 | \1f | |
923 | File: bison.info, Node: Environment Variables, Next: Option Cross Key, Prev: Bison Options, Up: Invocation | |
924 | ||
925 | Environment Variables | |
926 | ===================== | |
927 | ||
928 | Here is a list of environment variables which affect the way Bison | |
929 | runs. | |
930 | ||
931 | `BISON_SIMPLE' | |
932 | `BISON_HAIRY' | |
933 | Much of the parser generated by Bison is copied verbatim from a | |
934 | file called `bison.simple'. If Bison cannot find that file, or if | |
935 | you would like to direct Bison to use a different copy, setting the | |
936 | environment variable `BISON_SIMPLE' to the path of the file will | |
937 | cause Bison to use that copy instead. | |
938 | ||
939 | When the `%semantic_parser' declaration is used, Bison copies from | |
940 | a file called `bison.hairy' instead. The location of this file can | |
941 | also be specified or overridden in a similar fashion, with the | |
942 | `BISON_HAIRY' environment variable. | |
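
The override-with-fallback behavior described for these variables can be pictured with the C sketch below. The helper `skeleton_path' and the default path passed to it are hypothetical; the sketch is not Bison's own lookup code. Treating an empty value the same as an unset variable is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Return the value of the environment variable ENV_NAME if it is
   set and nonempty, and DEFAULT_PATH otherwise.  */
const char *
skeleton_path (const char *env_name, const char *default_path)
{
  const char *value;

  assert (env_name != NULL && default_path != NULL);
  value = getenv (env_name);
  return (value != NULL && value[0] != '\0') ? value : default_path;
}
```

A call such as `skeleton_path ("BISON_SIMPLE", default_path)', where `default_path' stands for whatever location the program would otherwise use, returns the user's override when one is set.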
943 | ||
944 | \1f | |
945 | File: bison.info, Node: Option Cross Key, Next: VMS Invocation, Prev: Environment Variables, Up: Invocation | |
946 | ||
947 | Option Cross Key | |
948 | ================ | |
949 | ||
950 | Here is a list of options, alphabetized by long option, to help you | |
951 | find the corresponding short option. | |
952 | ||
953 | --debug                       -t | |
954 | --defines                     -d | |
955 | --file-prefix=PREFIX          -b PREFIX | |
956 | --fixed-output-files --yacc   -y | |
957 | --help                        -h | |
958 | --name-prefix=PREFIX          -p PREFIX | |
959 | --no-lines                    -l | |
960 | --no-parser                   -n | |
961 | --output-file=OUTFILE         -o OUTFILE | |
962 | --raw                         -r | |
963 | --token-table                 -k | |
964 | --verbose                     -v | |
965 | --version                     -V | |
966 | ||
967 | \1f | |
968 | File: bison.info, Node: VMS Invocation, Prev: Option Cross Key, Up: Invocation | |
969 | ||
970 | Invoking Bison under VMS | |
971 | ======================== | |
972 | ||
973 | The command line syntax for Bison on VMS is a variant of the usual | |
974 | Bison command syntax--adapted to fit VMS conventions. | |
975 | ||
976 | To find the VMS equivalent for any Bison option, start with the long | |
977 | option, and substitute a `/' for the leading `--', and substitute a `_' | |
978 | for each `-' in the name of the long option. For example, the | |
979 | following invocation under VMS: | |
980 | ||
981 | bison /debug/name_prefix=bar foo.y | |
982 | ||
983 | is equivalent to the following command under POSIX: | |
984 | ||
985 | bison --debug --name-prefix=bar foo.y | |
986 | ||
987 | The VMS file system does not permit filenames such as `foo.tab.c'. | |
988 | In the above example, the output file would instead be named | |
989 | `foo_tab.c'. | |
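
The substitution rule for VMS options can be sketched in C as follows. The function `vms_option' is hypothetical and merely applies the rule stated in this section. Only the option name is translated; the argument after `=' is copied unchanged, which is this sketch's reading of "in the name of the long option".

```c
#include <assert.h>
#include <string.h>

/* Translate the long option LONGOPT (e.g. `--name-prefix=bar') into
   its VMS form (`/name_prefix=bar') in OUT.  Returns 0 on success,
   or -1 if LONGOPT does not start with `--' or OUT (of size
   OUTSIZE) is too small.  */
int
vms_option (const char *longopt, char *out, size_t outsize)
{
  size_t i;
  int in_name = 1;

  assert (longopt != NULL && out != NULL);
  if (strncmp (longopt, "--", 2) != 0)
    return -1;
  /* One `/', the rest of the option, and the terminating NUL.  */
  if (strlen (longopt + 2) + 2 > outsize)
    return -1;
  out[0] = '/';
  for (i = 2; longopt[i] != '\0'; i++)
    {
      char c = longopt[i];
      if (c == '=')
        in_name = 0;              /* the argument follows */
      out[i - 1] = (in_name && c == '-') ? '_' : c;
    }
  out[i - 1] = '\0';
  return 0;
}
```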
990 | ||
991 | \1f | |
992 | File: bison.info, Node: Table of Symbols, Next: Glossary, Prev: Invocation, Up: Top | |
993 | ||
994 | Bison Symbols | |
995 | ************* | |
996 | ||
997 | `error' | |
998 | A token name reserved for error recovery. This token may be used | |
999 | in grammar rules so as to allow the Bison parser to recognize an | |
1000 | error in the grammar without halting the process. In effect, a | |
1001 | sentence containing an error may be recognized as valid. On a | |
1002 | parse error, the token `error' becomes the current look-ahead | |
1003 | token. Actions corresponding to `error' are then executed, and | |
1004 | the look-ahead token is reset to the token that originally caused | |
1005 | the violation. *Note Error Recovery::. | |
1006 | ||
1007 | `YYABORT' | |
1008 | Macro to pretend that an unrecoverable syntax error has occurred, | |
1009 | by making `yyparse' return 1 immediately. The error reporting | |
1010 | function `yyerror' is not called. *Note The Parser Function | |
1011 | `yyparse': Parser Function. | |
1012 | ||
1013 | `YYACCEPT' | |
1014 | Macro to pretend that a complete utterance of the language has been | |
1015 | read, by making `yyparse' return 0 immediately. *Note The Parser | |
1016 | Function `yyparse': Parser Function. | |
1017 | ||
1018 | `YYBACKUP' | |
1019 | Macro to discard a value from the parser stack and fake a | |
1020 | look-ahead token. *Note Special Features for Use in Actions: | |
1021 | Action Features. | |
1022 | ||
1023 | `YYERROR' | |
1024 | Macro to pretend that a syntax error has just been detected: call | |
1025 | `yyerror' and then perform normal error recovery if possible | |
1026 | (*note Error Recovery::), or (if recovery is impossible) make | |
1027 | `yyparse' return 1. *Note Error Recovery::. | |
1028 | ||
1029 | `YYERROR_VERBOSE' | |
1030 | Macro that you define with `#define' in the Bison declarations | |
1031 | section to request verbose, specific error message strings when | |
1032 | `yyerror' is called. | |
1033 | ||
1034 | `YYINITDEPTH' | |
1035 | Macro for specifying the initial size of the parser stack. *Note | |
1036 | Stack Overflow::. | |
1037 | ||
1038 | `YYLEX_PARAM' | |
1039 | Macro for specifying an extra argument (or list of extra | |
1040 | arguments) for `yyparse' to pass to `yylex'. *Note Calling | |
1041 | Conventions for Pure Parsers: Pure Calling. | |
1042 | ||
1043 | `YYLTYPE' | |
1044 | Macro for the data type of `yylloc'; a structure with four | |
1045 | members. *Note Textual Positions of Tokens: Token Positions. | |
1046 | ||
1047 | `yyltype' | |
1048 | Default value for `YYLTYPE'. | |
1049 | ||
1050 | `YYMAXDEPTH' | |
1051 | Macro for specifying the maximum size of the parser stack. *Note | |
1052 | Stack Overflow::. | |
1053 | ||
1054 | `YYPARSE_PARAM' | |
1055 | Macro for specifying the name of a parameter that `yyparse' should | |
1056 | accept. *Note Calling Conventions for Pure Parsers: Pure Calling. | |
1057 | ||
1058 | `YYRECOVERING' | |
1059 | Macro whose value indicates whether the parser is recovering from a | |
1060 | syntax error. *Note Special Features for Use in Actions: Action | |
1061 | Features. | |
1062 | ||
1063 | `YYSTYPE' | |
1064 | Macro for the data type of semantic values; `int' by default. | |
1065 | *Note Data Types of Semantic Values: Value Type. | |
1066 | ||
1067 | `yychar' | |
1068 | External integer variable that contains the integer value of the | |
1069 | current look-ahead token. (In a pure parser, it is a local | |
1070 | variable within `yyparse'.) Error-recovery rule actions may | |
1071 | examine this variable. *Note Special Features for Use in Actions: | |
1072 | Action Features. | |
1073 | ||
1074 | `yyclearin' | |
1075 | Macro used in error-recovery rule actions. It clears the previous | |
1076 | look-ahead token. *Note Error Recovery::. | |
1077 | ||
1078 | `yydebug' | |
1079 | External integer variable set to zero by default. If `yydebug' is | |
1080 | given a nonzero value, the parser will output information on input | |
1081 | symbols and parser action. *Note Debugging Your Parser: Debugging. | |
1082 | ||
1083 | `yyerrok' | |
1084 | Macro to cause the parser to recover immediately to its normal mode | |
1085 | after a parse error. *Note Error Recovery::. | |
1086 | ||
1087 | `yyerror' | |
1088 | User-supplied function to be called by `yyparse' on error. The | |
1089 | function receives one argument, a pointer to a character string | |
1090 | containing an error message. *Note The Error Reporting Function | |
1091 | `yyerror': Error Reporting. | |
1092 | ||
1093 | `yylex' | |
1094 | User-supplied lexical analyzer function, called with no arguments | |
1095 | to get the next token. *Note The Lexical Analyzer Function | |
1096 | `yylex': Lexical. | |
1097 | ||
1098 | `yylval' | |
1099 | External variable in which `yylex' should place the semantic value | |
1100 | associated with a token. (In a pure parser, it is a local | |
1101 | variable within `yyparse', and its address is passed to `yylex'.) | |
1102 | *Note Semantic Values of Tokens: Token Values. | |
1103 | ||
1104 | `yylloc' | |
1105 | External variable in which `yylex' should place the line and column | |
1106 | numbers associated with a token. (In a pure parser, it is a local | |
1107 | variable within `yyparse', and its address is passed to `yylex'.) | |
1108 | You can ignore this variable if you don't use the `@' feature in | |
1109 | the grammar actions. *Note Textual Positions of Tokens: Token | |
1110 | Positions. | |
1111 | ||
1112 | `yynerrs' | |
1113 | Global variable which Bison increments each time there is a parse | |
1114 | error. (In a pure parser, it is a local variable within | |
1115 | `yyparse'.) *Note The Error Reporting Function `yyerror': Error | |
1116 | Reporting. | |
1117 | ||
1118 | `yyparse' | |
1119 | The parser function produced by Bison; call this function to start | |
1120 | parsing. *Note The Parser Function `yyparse': Parser Function. | |
1121 | ||
6deb4447 AD |
1122 | `%debug' |
1123 | Equip the parser for debugging. *Note Decl Summary::. | |
1124 | ||
1125 | `%defines' | |
1126 | Bison declaration to create a header file meant for the scanner. | |
1127 | *Note Decl Summary::. | |
1128 | ||
705db0b5 AD |
1129 | `%left' |
1130 | Bison declaration to assign left associativity to token(s). *Note | |
1131 | Operator Precedence: Precedence Decl. | |
1132 | ||
1133 | `%no_lines' | |
1134 | Bison declaration to avoid generating `#line' directives in the | |
1135 | parser file. *Note Decl Summary::. | |
1136 | ||
1137 | `%nonassoc' | |
1138 | Bison declaration to assign non-associativity to token(s). *Note | |
1139 | Operator Precedence: Precedence Decl. | |
1140 | ||
1141 | `%prec' | |
1142 | Bison declaration to assign a precedence to a specific rule. | |
1143 | *Note Context-Dependent Precedence: Contextual Precedence. | |
1144 | ||
1145 | `%pure_parser' | |
1146 | Bison declaration to request a pure (reentrant) parser. *Note A | |
1147 | Pure (Reentrant) Parser: Pure Decl. | |
1148 | ||
1149 | `%raw' | |
1150 | Bison declaration to use Bison internal token code numbers in token | |
1151 | tables instead of the usual Yacc-compatible token code numbers. | |
1152 | *Note Decl Summary::. | |
1153 | ||
1154 | `%right' | |
1155 | Bison declaration to assign right associativity to token(s). | |
1156 | *Note Operator Precedence: Precedence Decl. | |
1157 | ||
1158 | `%start' | |
1159 | Bison declaration to specify the start symbol. *Note The | |
1160 | Start-Symbol: Start Decl. | |
1161 | ||
1162 | `%token' | |
1163 | Bison declaration to declare token(s) without specifying | |
1164 | precedence. *Note Token Type Names: Token Decl. | |
1165 | ||
1166 | `%token_table' | |
1167 | Bison declaration to include a token name table in the parser file. | |
1168 | *Note Decl Summary::. | |
1169 | ||
1170 | `%type' | |
1171 | Bison declaration to declare nonterminals. *Note Nonterminal | |
1172 | Symbols: Type Decl. | |
1173 | ||
1174 | `%union' | |
1175 | Bison declaration to specify several possible data types for | |
1176 | semantic values. *Note The Collection of Value Types: Union Decl. | |
1177 | ||
1178 | These are the punctuation and delimiters used in Bison input: | |
1179 | ||
1180 | `%%' | |
1181 | Delimiter used to separate the grammar rule section from the Bison | |
1182 | declarations section or the additional C code section. *Note The | |
1183 | Overall Layout of a Bison Grammar: Grammar Layout. | |
1184 | ||
1185 | `%{ %}' | |
1186 | All code listed between `%{' and `%}' is copied directly to the | |
1187 | output file uninterpreted. Such code forms the "C declarations" | |
1188 | section of the input file. *Note Outline of a Bison Grammar: | |
1189 | Grammar Outline. | |
1190 | ||
1191 | `/*...*/' | |
1192 | Comment delimiters, as in C. | |
1193 | ||
1194 | `:' | |
1195 | Separates a rule's result from its components. *Note Syntax of | |
1196 | Grammar Rules: Rules. | |
1197 | ||
1198 | `;' | |
1199 | Terminates a rule. *Note Syntax of Grammar Rules: Rules. | |
1200 | ||
1201 | `|' | |
1202 | Separates alternate rules for the same result nonterminal. *Note | |
1203 | Syntax of Grammar Rules: Rules. | |
1204 |