Here is a simple C function subdivided into tokens:
-@ifinfo
@example
int /* @r{keyword `int'} */
square (int x) /* @r{identifier, open-paren, keyword `int',}
@r{identifier, semicolon} */
@} /* @r{close-brace} */
@end example
-@end ifinfo
-@ifnotinfo
-@example
-int /* @r{keyword `int'} */
-square (int x) /* @r{identifier, open-paren, keyword `int', identifier, close-paren} */
-@{ /* @r{open-brace} */
- return x * x; /* @r{keyword `return', identifier, asterisk, identifier, semicolon} */
-@} /* @r{close-brace} */
-@end example
-@end ifnotinfo
The syntactic groupings of C include the expression, the statement, the
declaration, and the function definition. These are represented in the
parse. Finally, the base of the temporary stack used during an exploratory
parse is a pointer into the normal parser state stack so that the stack is
never physically copied. In our experience, the performance penalty of LAC
-has proven insignificant for practical grammars.
+has proved insignificant for practical grammars.
@end itemize
While the LAC algorithm shares techniques that have been recognized in the
For example:
@example
-stmnts:
+stmts:
/* empty string */
-| stmnts '\n'
-| stmnts exp '\n'
-| stmnts error '\n'
+| stmts '\n'
+| stmts exp '\n'
+| stmts error '\n'
@end example
The fourth rule in this example says that an error followed by a newline
-makes a valid addition to any @code{stmnts}.
+makes a valid addition to any @code{stmts}.
What happens if a syntax error occurs in the middle of an @code{exp}? The
error recovery rule, interpreted strictly, applies to the precise sequence
-of a @code{stmnts}, an @code{error} and a newline. If an error occurs in
+of a @code{stmts}, an @code{error} and a newline. If an error occurs in
the middle of an @code{exp}, there will probably be some additional tokens
-and subexpressions on the stack after the last @code{stmnts}, and there
+and subexpressions on the stack after the last @code{stmts}, and there
will be tokens to read before the next newline. So the rule is not
applicable in the ordinary way.
the semantic context and part of the input. First it discards states
and objects from the stack until it gets back to a state in which the
@code{error} token is acceptable. (This means that the subexpressions
-already parsed are discarded, back to the last complete @code{stmnts}.)
+already parsed are discarded, back to the last complete @code{stmts}.)
At this point the @code{error} token can be shifted. Then, if the old
lookahead token is not acceptable to be shifted next, the parser reads
tokens and discards them until it finds a token which is acceptable. In
the current input line or current statement if an error is detected:
@example
-stmnt: error ';' /* On error, skip until ';' is read. */
+stmt: error ';' /* On error, skip until ';' is read. */
@end example
It is also useful to recover to the matching close-delimiter of an
Error recovery strategies are necessarily guesses. When they guess wrong,
one syntax error often leads to another. In the above example, the error
recovery rule guesses that an error is due to bad input within one
-@code{stmnt}. Suppose that instead a spurious semicolon is inserted in the
-middle of a valid @code{stmnt}. After the error recovery rule recovers
+@code{stmt}. Suppose that instead a spurious semicolon is inserted in the
+middle of a valid @code{stmt}. After the error recovery rule recovers
from the first error, another syntax error will be found straightaway,
since the text following the spurious semicolon is also an invalid
-@code{stmnt}.
+@code{stmt}.
To prevent an outpouring of error messages, the parser will output no error
message for another syntax error that happens shortly after the first; only
order of the output and the exact presentation might vary, but the
interpretation is the same.
-The first section includes details on conflicts that were solved thanks
-to precedence and/or associativity:
-
-@example
-Conflict in state 8 between rule 2 and token '+' resolved as reduce.
-Conflict in state 8 between rule 2 and token '-' resolved as reduce.
-Conflict in state 8 between rule 2 and token '*' resolved as shift.
-@exdent @dots{}
-@end example
-
-@noindent
-The next section lists states that still have conflicts.
-
-@example
-State 8 conflicts: 1 shift/reduce
-State 9 conflicts: 1 shift/reduce
-State 10 conflicts: 1 shift/reduce
-State 11 conflicts: 4 shift/reduce
-@end example
-
@noindent
@cindex token, useless
@cindex useless token
@cindex useless nonterminal
@cindex rule, useless
@cindex useless rule
-The next section reports useless tokens, nonterminal and rules. Useless
-nonterminals and rules are removed in order to produce a smaller parser,
-but useless tokens are preserved, since they might be used by the
-scanner (note the difference between ``useless'' and ``unused''
-below):
+The first section reports useless tokens, nonterminals and rules. Useless
+nonterminals and rules are removed in order to produce a smaller parser, but
+useless tokens are preserved, since they might be used by the scanner (note
+the difference between ``useless'' and ``unused'' below):
@example
-Nonterminals useless in grammar:
+Nonterminals useless in grammar
useless
-Terminals unused in grammar:
+Terminals unused in grammar
STR
-Rules useless in grammar:
-#6 useless: STR;
+Rules useless in grammar
+ 6 useless: STR
@end example
@noindent
-The next section reproduces the exact grammar that Bison used:
+The next section lists states that still have conflicts.
+
+@example
+State 8 conflicts: 1 shift/reduce
+State 9 conflicts: 1 shift/reduce
+State 10 conflicts: 1 shift/reduce
+State 11 conflicts: 4 shift/reduce
+@end example
+
+@noindent
+Then Bison reproduces the exact grammar it used:
@example
Grammar
- Number, Line, Rule
- 0 5 $accept -> exp $end
- 1 5 exp -> exp '+' exp
- 2 6 exp -> exp '-' exp
- 3 7 exp -> exp '*' exp
- 4 8 exp -> exp '/' exp
- 5 9 exp -> NUM
+ 0 $accept: exp $end
+
+ 1 exp: exp '+' exp
+ 2 | exp '-' exp
+ 3 | exp '*' exp
+ 4 | exp '/' exp
+ 5 | NUM
@end example
@noindent
'/' (47) 4
error (256)
NUM (258) 5
+STR (259)
@end group
@group
Nonterminals, with rules where they appear
-$accept (8)
+$accept (9)
on left: 0
-exp (9)
+exp (10)
on left: 1 2 3 4 5, on right: 0 1 2 3 4
@end group
@end example
@example
state 0
- $accept -> . exp $ (rule 0)
+ 0 $accept: . exp $end
- NUM shift, and go to state 1
+ NUM shift, and go to state 1
- exp go to state 2
+ exp go to state 2
@end example
This reads as follows: ``state 0 corresponds to being at the very
@example
state 0
- $accept -> . exp $ (rule 0)
- exp -> . exp '+' exp (rule 1)
- exp -> . exp '-' exp (rule 2)
- exp -> . exp '*' exp (rule 3)
- exp -> . exp '/' exp (rule 4)
- exp -> . NUM (rule 5)
+ 0 $accept: . exp $end
+ 1 exp: . exp '+' exp
+ 2 | . exp '-' exp
+ 3 | . exp '*' exp
+ 4 | . exp '/' exp
+ 5 | . NUM
- NUM shift, and go to state 1
+ NUM shift, and go to state 1
- exp go to state 2
+ exp go to state 2
@end example
@noindent
-In the state 1...
+In the state 1@dots{}
@example
state 1
- exp -> NUM . (rule 5)
+ 5 exp: NUM .
- $default reduce using rule 5 (exp)
+ $default reduce using rule 5 (exp)
@end example
@noindent
@example
state 2
- $accept -> exp . $ (rule 0)
- exp -> exp . '+' exp (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 0 $accept: exp . $end
+ 1 exp: exp . '+' exp
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
- $ shift, and go to state 3
- '+' shift, and go to state 4
- '-' shift, and go to state 5
- '*' shift, and go to state 6
- '/' shift, and go to state 7
+ $end shift, and go to state 3
+ '+' shift, and go to state 4
+ '-' shift, and go to state 5
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
@end example
@noindent
In state 2, the automaton can only shift a symbol. For instance,
-because of the item @samp{exp -> exp . '+' exp}, if the lookahead is
+because of the item @samp{exp: exp . '+' exp}, if the lookahead is
@samp{+} it is shifted onto the parse stack, and the automaton
-jumps to state 4, corresponding to the item @samp{exp -> exp '+' . exp}.
+jumps to state 4, corresponding to the item @samp{exp: exp '+' . exp}.
Since there is no default action, any lookahead not listed triggers a syntax
error.
@example
state 3
- $accept -> exp $ . (rule 0)
+ 0 $accept: exp $end .
- $default accept
+ $default accept
@end example
@noindent
-the initial rule is completed (the start symbol and the end
-of input were read), the parsing exits successfully.
+the initial rule is completed (the start symbol and the end-of-input were
+read), the parsing exits successfully.
The interpretation of states 4 to 7 is straightforward, and is left to
the reader.
@example
state 4
- exp -> exp '+' . exp (rule 1)
+ 1 exp: exp '+' . exp
+
+ NUM shift, and go to state 1
- NUM shift, and go to state 1
+ exp go to state 8
- exp go to state 8
state 5
- exp -> exp '-' . exp (rule 2)
+ 2 exp: exp '-' . exp
- NUM shift, and go to state 1
+ NUM shift, and go to state 1
+
+ exp go to state 9
- exp go to state 9
state 6
- exp -> exp '*' . exp (rule 3)
+ 3 exp: exp '*' . exp
+
+ NUM shift, and go to state 1
- NUM shift, and go to state 1
+ exp go to state 10
- exp go to state 10
state 7
- exp -> exp '/' . exp (rule 4)
+ 4 exp: exp '/' . exp
- NUM shift, and go to state 1
+ NUM shift, and go to state 1
- exp go to state 11
+ exp go to state 11
@end example
As was announced in beginning of the report, @samp{State 8 conflicts:
@example
state 8
- exp -> exp . '+' exp (rule 1)
- exp -> exp '+' exp . (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 1 exp: exp . '+' exp
+ 1 | exp '+' exp .
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
- '*' shift, and go to state 6
- '/' shift, and go to state 7
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
- '/' [reduce using rule 1 (exp)]
- $default reduce using rule 1 (exp)
+ '/' [reduce using rule 1 (exp)]
+ $default reduce using rule 1 (exp)
@end example
Indeed, there are two actions associated to the lookahead @samp{/}:
Because in deterministic parsing a single decision can be made, Bison
arbitrarily chose to disable the reduction, see @ref{Shift/Reduce, ,
-Shift/Reduce Conflicts}. Discarded actions are reported in between
+Shift/Reduce Conflicts}. Discarded actions are reported between
square brackets.
Note that all the previous states had a single possible action: either
@example
state 8
- exp -> exp . '+' exp (rule 1)
- exp -> exp '+' exp . [$, '+', '-', '/'] (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 1 exp: exp . '+' exp
+ 1 | exp '+' exp . [$end, '+', '-', '/']
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
- '*' shift, and go to state 6
- '/' shift, and go to state 7
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
- '/' [reduce using rule 1 (exp)]
- $default reduce using rule 1 (exp)
+ '/' [reduce using rule 1 (exp)]
+ $default reduce using rule 1 (exp)
@end example
+Note however that while @samp{NUM + NUM / NUM} is ambiguous (which results in
+the conflicts on @samp{/}), @samp{NUM + NUM * NUM} is not: the conflict was
+solved thanks to associativity and precedence directives. If invoked with
+@option{--report=solved}, Bison includes information about the solved
+conflicts in the report:
+
+@example
+Conflict between rule 1 and token '+' resolved as reduce (%left '+').
+Conflict between rule 1 and token '-' resolved as reduce (%left '-').
+Conflict between rule 1 and token '*' resolved as shift ('+' < '*').
+@end example
+
+
The remaining states are similar:
@example
@group
state 9
- exp -> exp . '+' exp (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp '-' exp . (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 1 exp: exp . '+' exp
+ 2 | exp . '-' exp
+ 2 | exp '-' exp .
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
- '*' shift, and go to state 6
- '/' shift, and go to state 7
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
- '/' [reduce using rule 2 (exp)]
- $default reduce using rule 2 (exp)
+ '/' [reduce using rule 2 (exp)]
+ $default reduce using rule 2 (exp)
@end group
@group
state 10
- exp -> exp . '+' exp (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp '*' exp . (rule 3)
- exp -> exp . '/' exp (rule 4)
+ 1 exp: exp . '+' exp
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 3 | exp '*' exp .
+ 4 | exp . '/' exp
- '/' shift, and go to state 7
+ '/' shift, and go to state 7
- '/' [reduce using rule 3 (exp)]
- $default reduce using rule 3 (exp)
+ '/' [reduce using rule 3 (exp)]
+ $default reduce using rule 3 (exp)
@end group
@group
state 11
- exp -> exp . '+' exp (rule 1)
- exp -> exp . '-' exp (rule 2)
- exp -> exp . '*' exp (rule 3)
- exp -> exp . '/' exp (rule 4)
- exp -> exp '/' exp . (rule 4)
+ 1 exp: exp . '+' exp
+ 2 | exp . '-' exp
+ 3 | exp . '*' exp
+ 4 | exp . '/' exp
+ 4 | exp '/' exp .
- '+' shift, and go to state 4
- '-' shift, and go to state 5
- '*' shift, and go to state 6
- '/' shift, and go to state 7
+ '+' shift, and go to state 4
+ '-' shift, and go to state 5
+ '*' shift, and go to state 6
+ '/' shift, and go to state 7
- '+' [reduce using rule 4 (exp)]
- '-' [reduce using rule 4 (exp)]
- '*' [reduce using rule 4 (exp)]
- '/' [reduce using rule 4 (exp)]
- $default reduce using rule 4 (exp)
+ '+' [reduce using rule 4 (exp)]
+ '-' [reduce using rule 4 (exp)]
+ '*' [reduce using rule 4 (exp)]
+ '/' [reduce using rule 4 (exp)]
+ $default reduce using rule 4 (exp)
@end group
@end example
@c LocalWords: NUM exp subsubsection kbd Ctrl ctype EOF getchar isdigit nonfree
@c LocalWords: ungetc stdin scanf sc calc ulator ls lm cc NEG prec yyerrok rr
@c LocalWords: longjmp fprintf stderr yylloc YYLTYPE cos ln Stallman Destructor
-@c LocalWords: symrec val tptr FNCT fnctptr func struct sym enum
+@c LocalWords: symrec val tptr FNCT fnctptr func struct sym enum IEC syntaxes
@c LocalWords: fnct putsym getsym fname arith fncts atan ptr malloc sizeof Lex
@c LocalWords: strlen strcpy fctn strcmp isalpha symbuf realloc isalnum DOTDOT
@c LocalWords: ptypes itype YYPRINT trigraphs yytname expseq vindex dtype Unary
@c LocalWords: strncmp intval tindex lvalp locp llocp typealt YYBACKUP subrange
@c LocalWords: YYEMPTY YYEOF YYRECOVERING yyclearin GE def UMINUS maybeword loc
@c LocalWords: Johnstone Shamsa Sadaf Hussain Tomita TR uref YYMAXDEPTH inline
-@c LocalWords: YYINITDEPTH stmnts ref stmnt initdcl maybeasm notype Lookahead
+@c LocalWords: YYINITDEPTH stmts ref initdcl maybeasm notype Lookahead yyoutput
@c LocalWords: hexflag STR exdent itemset asis DYYDEBUG YYFPRINTF args Autoconf
@c LocalWords: infile ypp yxx outfile itemx tex leaderfill Troubleshouting sqrt
@c LocalWords: hbox hss hfill tt ly yyin fopen fclose ofirst gcc ll lookahead
@c LocalWords: nbar yytext fst snd osplit ntwo strdup AST Troublereporting th
@c LocalWords: YYSTACK DVI fdl printindex IELR nondeterministic nonterminals ps
@c LocalWords: subexpressions declarator nondeferred config libintl postfix LAC
-@c LocalWords: preprocessor nonpositive unary nonnumeric typedef extern rhs
-@c LocalWords: yytokentype destructor multicharacter nonnull EBCDIC
+@c LocalWords: preprocessor nonpositive unary nonnumeric typedef extern rhs sr
+@c LocalWords: yytokentype destructor multicharacter nonnull EBCDIC nterm LR's
@c LocalWords: lvalue nonnegative XNUM CHR chr TAGLESS tagless stdout api TOK
-@c LocalWords: destructors Reentrancy nonreentrant subgrammar nonassociative
+@c LocalWords: destructors Reentrancy nonreentrant subgrammar nonassociative Ph
@c LocalWords: deffnx namespace xml goto lalr ielr runtime lex yacc yyps env
@c LocalWords: yystate variadic Unshift NLS gettext po UTF Automake LOCALEDIR
@c LocalWords: YYENABLE bindtextdomain Makefile DEFS CPPFLAGS DBISON DeRemer
-@c LocalWords: autoreconf Pennello multisets nondeterminism Generalised baz
+@c LocalWords: autoreconf Pennello multisets nondeterminism Generalised baz ACM
@c LocalWords: redeclare automata Dparse localedir datadir XSLT midrule Wno
-@c LocalWords: Graphviz multitable headitem hh basename Doxygen fno
+@c LocalWords: Graphviz multitable headitem hh basename Doxygen fno filename
@c LocalWords: doxygen ival sval deftypemethod deallocate pos deftypemethodx
@c LocalWords: Ctor defcv defcvx arg accessors arithmetics CPP ifndef CALCXX
@c LocalWords: lexer's calcxx bool LPAREN RPAREN deallocation cerrno climits
@c LocalWords: cstdlib Debian undef yywrap unput noyywrap nounput zA yyleng
-@c LocalWords: errno strtol ERANGE str strerror iostream argc argv Javadoc
+@c LocalWords: errno strtol ERANGE str strerror iostream argc argv Javadoc PSLR
@c LocalWords: bytecode initializers superclass stype ASTNode autoboxing nls
@c LocalWords: toString deftypeivar deftypeivarx deftypeop YYParser strictfp
@c LocalWords: superclasses boolean getErrorVerbose setErrorVerbose deftypecv
@c LocalWords: getDebugStream setDebugStream getDebugLevel setDebugLevel url
@c LocalWords: bisonVersion deftypecvx bisonSkeleton getStartPos getEndPos
-@c LocalWords: getLVal defvar deftypefn deftypefnx gotos msgfmt Corbett
-@c LocalWords: subdirectory Solaris nonassociativity
+@c LocalWords: getLVal defvar deftypefn deftypefnx gotos msgfmt Corbett LALR's
+@c LocalWords: subdirectory Solaris nonassociativity perror schemas Malloy
+@c LocalWords: Scannerless ispell american
@c Local Variables:
@c ispell-dictionary: "american"