From: Joel E. Denny
Date: Sun, 7 Nov 2010 21:01:56 +0000 (-0500)
Subject: yysyntax_error: fix for consistent error with lookahead.
X-Git-Tag: v2.5_rc1~54
X-Git-Url: https://git.saurik.com/bison.git/commitdiff_plain/095a1d11ca90852ad224bbf9823bf88b7fff9a57

yysyntax_error: fix for consistent error with lookahead.

* NEWS (2.5): Document.
* data/yacc.c (yysyntax_error): In a verbose syntax error
message while in a consistent state with a default action (which
must be an error action given that yysyntax_error is being
invoked), continue to drop the expected token list, but don't
drop the unexpected token unless there actually is no lookahead.
Moreover, handle that internally instead of returning 1 to tell
the caller to do it.  With that meaning of 1 gone, renumber
return codes more usefully.
(yyparse, yypush_parse): Update yysyntax_error usage.  Most
importantly, set yytoken to YYEMPTY when there's no lookahead.
* data/glr.c (yyreportSyntaxError): As in yacc.c, don't drop the
unexpected token unless there actually is no lookahead.
* data/lalr1.cc (yy::parser::parse): If there's no lookahead,
set yytoken to yyempty_ before invoking yysyntax_error_.
(yy::parser::yysyntax_error_): Again, don't drop the unexpected
token unless there actually is no lookahead.
* data/lalr1.java (YYParser::parse): If there's no lookahead,
set yytoken to yyempty_ before invoking yysyntax_error.
(YYParser::yysyntax_error): Again, don't drop the unexpected
token unless there actually is no lookahead.
* tests/conflicts.at (%error-verbose and consistent
errors): Extend test group to further reveal how the previous
use of the simple "syntax error" message was too general.  Test
yacc.c, glr.c, lalr1.cc, and lalr1.java.  No longer an expected
failure.
* tests/java.at (AT_JAVA_COMPILE, AT_JAVA_PARSER_CHECK): Move
to...
* tests/local.at: ... here.
(_AT_BISON_OPTION_PUSHDEFS): Push AT_SKEL_JAVA_IF definition.
(AT_BISON_OPTION_POPDEFS): Pop it.
(AT_FULL_COMPILE): Extend to handle Java.
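The message-selection rule described above can be sketched in isolation. This is a minimal illustration, not the skeleton code: `sketch_format`, `sketch_syntax_error`, and their parameters are hypothetical names, a NULL `unexpected` stands in for `yytoken == YYEMPTY`, and `default_action` stands in for `yypact_value_is_default (yyn)`. It also assumes that more than four expected tokens cause the expected list to be dropped, which the hunks in this patch do not show in full. Only the 0 and 1 return codes are modeled.

```c
#include <stddef.h>
#include <string.h>

enum { ARGS_MAX = 5 };  /* plays the role of YYERROR_VERBOSE_ARGS_MAXIMUM */

/* Pick the message template for COUNT reported tokens.  Case 0 is the
   plain message used when there is no lookahead at all.  */
static const char *
sketch_format (int count)
{
  switch (count)
    {
    case 0: return "syntax error";
    case 1: return "syntax error, unexpected %s";
    case 2: return "syntax error, unexpected %s, expecting %s";
    case 3: return "syntax error, unexpected %s, expecting %s or %s";
    case 4: return "syntax error, unexpected %s, expecting %s or %s or %s";
    default:
      return "syntax error, unexpected %s, expecting %s or %s or %s or %s";
    }
}

/* Write the message into BUF.  Return 0 on success.  Return 1 if BUF is
   too small, after storing a sufficient size in *NEEDED.  */
static int
sketch_syntax_error (char *buf, size_t buf_size, size_t *needed,
                     const char *unexpected, int default_action,
                     const char *const *expected, int n_expected)
{
  const char *arg[ARGS_MAX];
  const char *fmt;
  const char *p;
  char *out = buf;
  size_t size;
  int count = 0;
  int used = 0;
  int i;

  if (unexpected != NULL)            /* a lookahead is present */
    {
      arg[count++] = unexpected;     /* always report it */
      if (!default_action)           /* otherwise nothing is expected */
        for (i = 0; i < n_expected; ++i)
          {
            if (count == ARGS_MAX)   /* too many: drop the expected list */
              {
                count = 1;
                break;
              }
            arg[count++] = expected[i];
          }
    }

  /* Upper bound on the size: template plus every argument plus NUL.  */
  fmt = sketch_format (count);
  size = strlen (fmt) + 1;
  for (i = 0; i < count; ++i)
    size += strlen (arg[i]);
  *needed = size;
  if (buf_size < size)
    return 1;

  /* Expand each "%s" by hand, in the spirit of avoiding sprintf.  */
  for (p = fmt; *p != '\0'; )
    if (p[0] == '%' && p[1] == 's' && used < count)
      {
        size_t n = strlen (arg[used]);
        memcpy (out, arg[used], n);
        out += n;
        ++used;
        p += 2;
      }
    else
      *out++ = *p++;
  *out = '\0';
  return 0;
}
```

A caller that gets 1 back can allocate `*needed` bytes and call again, which is roughly the retry that the updated `yyerrlab` code performs with `yymsg_alloc`.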
(cherry picked from commit d2060f0634f4adfb5db74cce540a9d27806091fe)

Conflicts:

	data/lalr1.cc
	data/lalr1.java
	src/parse-gram.c
	src/parse-gram.h
	tests/java.at
---

diff --git a/ChangeLog b/ChangeLog
index 2c65a385..0e711e5e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,39 @@
+2010-11-07 Joel E. Denny
+
+ yysyntax_error: fix for consistent error with lookahead.
+ * NEWS (2.5): Document.
+ * data/yacc.c (yysyntax_error): In a verbose syntax error
+ message while in a consistent state with a default action (which
+ must be an error action given that yysyntax_error is being
+ invoked), continue to drop the expected token list, but don't
+ drop the unexpected token unless there actually is no lookahead.
+ Moreover, handle that internally instead of returning 1 to tell
+ the caller to do it. With that meaning of 1 gone, renumber
+ return codes more usefully.
+ (yyparse, yypush_parse): Update yysyntax_error usage. Most
+ importantly, set yytoken to YYEMPTY when there's no lookahead.
+ * data/glr.c (yyreportSyntaxError): As in yacc.c, don't drop the
+ unexpected token unless there actually is no lookahead.
+ * data/lalr1.cc (yy::parser::parse): If there's no lookahead,
+ set yytoken to yyempty_ before invoking yysyntax_error_.
+ (yy::parser::yysyntax_error_): Again, don't drop the unexpected
+ token unless there actually is no lookahead.
+ * data/lalr1.java (YYParser::parse): If there's no lookahead,
+ set yytoken to yyempty_ before invoking yysyntax_error.
+ (YYParser::yysyntax_error): Again, don't drop the unexpected
+ token unless there actually is no lookahead.
+ * tests/conflicts.at (%error-verbose and consistent
+ errors): Extend test group to further reveal how the previous
+ use of the simple "syntax error" message was too general. Test
+ yacc.c, glr.c, lalr1.cc, and lalr1.java. No longer an expected
+ failure.
+ * tests/java.at (AT_JAVA_COMPILE, AT_JAVA_PARSER_CHECK): Move
+ to...
+ * tests/local.at: ... here.
+ (_AT_BISON_OPTION_PUSHDEFS): Push AT_SKEL_JAVA_IF definition.
+ (AT_BISON_OPTION_POPDEFS): Pop it.
+ (AT_FULL_COMPILE): Extend to handle Java.
+
 2010-11-07 Joel E. Denny
 
 yysyntax_error: more preparation for readability of next patch.
diff --git a/NEWS b/NEWS
index ee80d567..2f0711ea 100644
--- a/NEWS
+++ b/NEWS
@@ -164,14 +164,41 @@ Bison News
 Bison now warns when a character literal is not of length one. In some
 future release, Bison will report an error instead.
 
-** Verbose error messages fixed for nonassociative tokens.
-
- When %error-verbose is specified, syntax error messages produced by
- the generated parser include the unexpected token as well as a list of
- expected tokens. Previously, this list erroneously included tokens
- that would actually induce a syntax error because conflicts for them
- were resolved with %nonassoc. Such tokens are now properly omitted
- from the list.
+** Verbose syntax error message fixes:
+
+ When %error-verbose or `#define YYERROR_VERBOSE' is specified, syntax
+ error messages produced by the generated parser include the unexpected
+ token as well as a list of expected tokens. The effect of %nonassoc
+ on these verbose messages has been corrected in two ways, but
+ additional fixes are still being implemented:
+
+*** When %nonassoc is used, there can exist parser states that accept no
+ tokens, and so the parser does not always require a lookahead token
+ in order to detect a syntax error. Because no unexpected token or
+ expected tokens can then be reported, the verbose syntax error
+ message described above is suppressed, and the parser instead
+ reports the simpler message, "syntax error". Previously, this
+ suppression was sometimes erroneously triggered by %nonassoc when a
+ lookahead was actually required. Now verbose messages are
+ suppressed only when all previous lookaheads have already been
+ shifted or discarded.
+ +*** Previously, the list of expected tokens erroneously included tokens + that would actually induce a syntax error because conflicts for them + were resolved with %nonassoc in the current parser state. Such + tokens are now properly omitted from the list. + +*** Expected token lists are still often wrong due to state merging + (from LALR or IELR) and default reductions, which can both add and + subtract valid tokens. Canonical LR almost completely fixes this + problem by eliminating state merging and default reductions. + However, there is one minor problem left even when using canonical + LR and even after the fixes above. That is, if the resolution of a + conflict with %nonassoc appears in a later parser state than the one + at which some syntax error is discovered, the conflicted token is + still erroneously included in the expected token list. We are + currently working on a fix to eliminate this problem and to + eliminate the need for canonical LR. ** Destructor calls fixed for lookaheads altered in semantic actions. diff --git a/data/glr.c b/data/glr.c index 01222a5c..4b14b79f 100644 --- a/data/glr.c +++ b/data/glr.c @@ -2094,11 +2094,7 @@ yyreportSyntaxError (yyGLRStack* yystackp]b4_user_formals[) #if ! YYERROR_VERBOSE yyerror (]b4_lyyerror_args[YY_("syntax error")); #else - int yyn; - yyn = yypact[yystackp->yytops.yystates[0]->yylrState]; -if (YYPACT_NINF < yyn && yyn <= YYLAST) - { - yySymbol yytoken = YYTRANSLATE (yychar); + yySymbol yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar); size_t yysize0 = yytnamerr (NULL, yytokenName (yytoken)); size_t yysize = yysize0; size_t yysize1; @@ -2109,23 +2105,47 @@ if (YYPACT_NINF < yyn && yyn <= YYLAST) const char *yyformat = 0; /* Arguments of yyformat. */ char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM]; + /* Number of reported tokens (one for the "unexpected", one per + "expected"). 
*/ + int yycount = 0; + /* There are many possibilities here to consider: + - If this state is a consistent state with a default action, then + the only way this function was invoked is if the default action + is an error action. In that case, don't check for expected + tokens because there are none. + - The only way there can be no lookahead present (in yychar) is if + this state is a consistent state with a default action. Thus, + detecting the absence of a lookahead is sufficient to determine + that there is no unexpected or expected token to report. In that + case, just report a simple "syntax error". + - Don't assume there isn't a lookahead just because this state is a + consistent state with a default action. There might have been a + previous inconsistent state, consistent state with a non-default + action, or user semantic action that manipulated yychar. + - Of course, the expected token list depends on states to have + correct lookahead information, and it depends on the parser not + to perform extra reductions after fetching a lookahead from the + scanner and before detecting a syntax error. Thus, state merging + (from LALR or IELR) and default reductions corrupt the expected + token list. However, the list is correct for canonical LR with + one exception: it will still contain any token that will not be + accepted due to an error action in a later state. + */ + if (yytoken != YYEMPTY) + { + int yyn = yypact[yystackp->yytops.yystates[0]->yylrState]; + yyarg[yycount++] = yytokenName (yytoken); + if (!yypact_value_is_default (yyn)) + { /* Start YYX at -YYN if negative to avoid negative indexes in YYCHECK. In other words, skip the first -YYN actions for this state because they are default actions. */ int yyxbegin = yyn < 0 ? -yyn : 0; - /* Stay within bounds of both yycheck and yytname. */ int yychecklim = YYLAST - yyn + 1; int yyxend = yychecklim < YYNTOKENS ? 
yychecklim : YYNTOKENS; - - /* Number of reported tokens (one for the "unexpected", one per - "expected"). */ - int yycount = 0; int yyx; - - yyarg[yycount++] = yytokenName (yytoken); - for (yyx = yyxbegin; yyx < yyxend; ++yyx) if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR && !yytable_value_is_error (yytable[yyx + yyn])) @@ -2141,6 +2161,8 @@ if (YYPACT_NINF < yyn && yyn <= YYLAST) yysize_overflow |= yysize1 < yysize; yysize = yysize1; } + } + } switch (yycount) { @@ -2148,6 +2170,7 @@ if (YYPACT_NINF < yyn && yyn <= YYLAST) case N: \ yyformat = S; \ break + YYCASE_(0, YY_("syntax error")); YYCASE_(1, YY_("syntax error, unexpected %s")); YYCASE_(2, YY_("syntax error, unexpected %s, expecting %s")); YYCASE_(3, YY_("syntax error, unexpected %s, expecting %s or %s")); @@ -2188,9 +2211,6 @@ if (YYPACT_NINF < yyn && yyn <= YYLAST) yyerror (]b4_lyyerror_args[YY_("syntax error")); yyMemoryExhausted (yystackp); } - } -else - yyerror (]b4_lyyerror_args[YY_("syntax error")); #endif /* YYERROR_VERBOSE */ yynerrs += 1; } diff --git a/data/lalr1.cc b/data/lalr1.cc index 206aee99..33ab08ee 100644 --- a/data/lalr1.cc +++ b/data/lalr1.cc @@ -719,6 +719,8 @@ m4_ifdef([b4_lex_param], [, ]b4_lex_param))[; if (!yyerrstatus_) { ++yynerrs_; + if (yychar == yyempty_) + yytoken = yyempty_; error (yylloc, yysyntax_error_ (yystate, yytoken)); } @@ -848,26 +850,52 @@ b4_error_verbose_if([int yystate, int yytoken], [int, int])[) {]b4_error_verbose_if([[ std::string yyres; - int yyn = yypact_[yystate]; - if (yypact_ninf_ < yyn && yyn <= yylast_) + // Number of reported tokens (one for the "unexpected", one per + // "expected"). + size_t yycount = 0; + // Its maximum. + enum { YYERROR_VERBOSE_ARGS_MAXIMUM = 5 }; + // Arguments of yyformat. 
+ char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM]; + + /* There are many possibilities here to consider: + - If this state is a consistent state with a default action, then + the only way this function was invoked is if the default action + is an error action. In that case, don't check for expected + tokens because there are none. + - The only way there can be no lookahead present (in yytoken) is + if this state is a consistent state with a default action. + Thus, detecting the absence of a lookahead is sufficient to + determine that there is no unexpected or expected token to + report. In that case, just report a simple "syntax error". + - Don't assume there isn't a lookahead just because this state is + a consistent state with a default action. There might have + been a previous inconsistent state, consistent state with a + non-default action, or user semantic action that manipulated + yychar. + - Of course, the expected token list depends on states to have + correct lookahead information, and it depends on the parser not + to perform extra reductions after fetching a lookahead from the + scanner and before detecting a syntax error. Thus, state + merging (from LALR or IELR) and default reductions corrupt the + expected token list. However, the list is correct for + canonical LR with one exception: it will still contain any + token that will not be accepted due to an error action in a + later state. + */ + if (yytoken != yyempty_) { + yyarg[yycount++] = yytname_[yytoken]; + int yyn = yypact_[yystate]; + if (!yy_pact_value_is_default_ (yyn)) + { /* Start YYX at -YYN if negative to avoid negative indexes in YYCHECK. In other words, skip the first -YYN actions for this state because they are default actions. */ int yyxbegin = yyn < 0 ? -yyn : 0; - /* Stay within bounds of both yycheck and yytname. */ int yychecklim = yylast_ - yyn + 1; int yyxend = yychecklim < yyntokens_ ? 
yychecklim : yyntokens_; - - // Number of reported tokens (one for the "unexpected", one per - // "expected"). - size_t yycount = 0; - // Its maximum. - enum { YYERROR_VERBOSE_ARGS_MAXIMUM = 5 }; - // Arguments of yyformat. - char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM]; - yyarg[yycount++] = yytname_[yytoken]; for (int yyx = yyxbegin; yyx < yyxend; ++yyx) if (yycheck_[yyx + yyn] == yyx && yyx != yyterror_ && !yy_table_value_is_error_ (yytable_[yyx + yyn])) @@ -880,6 +908,8 @@ b4_error_verbose_if([int yystate, int yytoken], else yyarg[yycount++] = yytname_[yyx]; } + } + } char const* yyformat = 0; switch (yycount) @@ -888,6 +918,7 @@ b4_error_verbose_if([int yystate, int yytoken], case N: \ yyformat = S; \ break + YYCASE_(0, YY_("syntax error")); YYCASE_(1, YY_("syntax error, unexpected %s")); YYCASE_(2, YY_("syntax error, unexpected %s, expecting %s")); YYCASE_(3, YY_("syntax error, unexpected %s, expecting %s or %s")); @@ -895,6 +926,7 @@ b4_error_verbose_if([int yystate, int yytoken], YYCASE_(5, YY_("syntax error, unexpected %s, expecting %s or %s or %s or %s")); #undef YYCASE_ } + // Argument number. size_t yyi = 0; for (char const* yyp = yyformat; *yyp; ++yyp) @@ -905,9 +937,6 @@ b4_error_verbose_if([int yystate, int yytoken], } else yyres += *yyp; - } - else - yyres = YY_("syntax error"); return yyres;]], [[ return YY_("syntax error");]])[ } diff --git a/data/lalr1.java b/data/lalr1.java index 32462d78..76460ead 100644 --- a/data/lalr1.java +++ b/data/lalr1.java @@ -582,8 +582,10 @@ m4_popdef([b4_at_dollar])])dnl /* If not already recovering from an error, report this error. 
*/ if (yyerrstatus_ == 0) { - ++yynerrs_; - yyerror (]b4_locations_if([yylloc, ])[yysyntax_error (yystate, yytoken)); + ++yynerrs_; + if (yychar == yyempty_) + yytoken = yyempty_; + yyerror (]b4_locations_if([yylloc, ])[yysyntax_error (yystate, yytoken)); } ]b4_locations_if([yyerrloc = yylloc;])[ @@ -683,17 +685,52 @@ m4_popdef([b4_at_dollar])])dnl { if (errorVerbose) { - int yyn = yypact_[yystate]; - if (yypact_ninf_ < yyn && yyn <= yylast_) + /* There are many possibilities here to consider: + - Assume YYFAIL is not used. It's too flawed to consider. + See + + for details. YYERROR is fine as it does not invoke this + function. + - If this state is a consistent state with a default action, + then the only way this function was invoked is if the + default action is an error action. In that case, don't + check for expected tokens because there are none. + - The only way there can be no lookahead present (in tok) is + if this state is a consistent state with a default action. + Thus, detecting the absence of a lookahead is sufficient to + determine that there is no unexpected or expected token to + report. In that case, just report a simple "syntax error". + - Don't assume there isn't a lookahead just because this + state is a consistent state with a default action. There + might have been a previous inconsistent state, consistent + state with a non-default action, or user semantic action + that manipulated yychar. (However, yychar is currently out + of scope during semantic actions.) + - Of course, the expected token list depends on states to + have correct lookahead information, and it depends on the + parser not to perform extra reductions after fetching a + lookahead from the scanner and before detecting a syntax + error. Thus, state merging (from LALR or IELR) and default + reductions corrupt the expected token list. 
However, the + list is correct for canonical LR with one exception: it + will still contain any token that will not be accepted due + to an error action in a later state. + */ + if (tok != yyempty_) { - StringBuffer res; - + // FIXME: This method of building the message is not compatible + // with internationalization. + StringBuffer res = + new StringBuffer ("syntax error, unexpected "); + res.append (yytnamerr_ (yytname_[tok])); + int yyn = yypact_[yystate]; + if (!yy_pact_value_is_default_ (yyn)) + { /* Start YYX at -YYN if negative to avoid negative indexes in YYCHECK. In other words, skip the first -YYN actions for this state because they are default actions. */ int yyxbegin = yyn < 0 ? -yyn : 0; - /* Stay within bounds of both yycheck and yytname. */ int yychecklim = yylast_ - yyn + 1; int yyxend = yychecklim < yyntokens_ ? yychecklim : yyntokens_; @@ -702,11 +739,6 @@ m4_popdef([b4_at_dollar])])dnl if (yycheck_[x + yyn] == x && x != yyterror_ && !yy_table_value_is_error_ (yytable_[x + yyn])) ++count; - - // FIXME: This method of building the message is not compatible - // with internationalization. - res = new StringBuffer ("syntax error, unexpected "); - res.append (yytnamerr_ (yytname_[tok])); if (count < 5) { count = 0; @@ -718,6 +750,7 @@ m4_popdef([b4_at_dollar])])dnl res.append (yytnamerr_ (yytname_[x])); } } + } return res.toString (); } } diff --git a/data/yacc.c b/data/yacc.c index de3d2cdf..bf804a40 100644 --- a/data/yacc.c +++ b/data/yacc.c @@ -972,20 +972,13 @@ yytnamerr (char *yyres, const char *yystr) /* Copy into *YYMSG, which is of size *YYMSG_ALLOC, an error message about the unexpected token YYTOKEN while in state YYSTATE. - Return 0 if *YYMSG was successfully written. Return 1 if an ordinary - "syntax error" message will suffice instead. Return 2 if *YYMSG is - not large enough to hold the message. 
In the last case, also set - *YYMSG_ALLOC to either (a) the required number of bytes or (b) zero - if the required number of bytes is too large to store. */ + Return 0 if *YYMSG was successfully written. Return 1 if *YYMSG is + not large enough to hold the message. In that case, also set + *YYMSG_ALLOC to the required number of bytes. Return 2 if the + required number of bytes is too large to store. */ static int yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, int yystate, int yytoken) -{ - int yyn = yypact[yystate]; - - if (! (YYPACT_NINF < yyn && yyn <= YYLAST)) - return 1; - else { YYSIZE_T yysize0 = yytnamerr (0, yytname[yytoken]); YYSIZE_T yysize = yysize0; @@ -995,22 +988,51 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, const char *yyformat = 0; /* Arguments of yyformat. */ char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM]; + /* Number of reported tokens (one for the "unexpected", one per + "expected"). */ + int yycount = 0; + /* There are many possibilities here to consider: + - Assume YYFAIL is not used. It's too flawed to consider. See + + for details. YYERROR is fine as it does not invoke this + function. + - If this state is a consistent state with a default action, then + the only way this function was invoked is if the default action + is an error action. In that case, don't check for expected + tokens because there are none. + - The only way there can be no lookahead present (in yychar) is if + this state is a consistent state with a default action. Thus, + detecting the absence of a lookahead is sufficient to determine + that there is no unexpected or expected token to report. In that + case, just report a simple "syntax error". + - Don't assume there isn't a lookahead just because this state is a + consistent state with a default action. There might have been a + previous inconsistent state, consistent state with a non-default + action, or user semantic action that manipulated yychar. 
+ - Of course, the expected token list depends on states to have + correct lookahead information, and it depends on the parser not + to perform extra reductions after fetching a lookahead from the + scanner and before detecting a syntax error. Thus, state merging + (from LALR or IELR) and default reductions corrupt the expected + token list. However, the list is correct for canonical LR with + one exception: it will still contain any token that will not be + accepted due to an error action in a later state. + */ + if (yytoken != YYEMPTY) + { + int yyn = yypact[yystate]; + yyarg[yycount++] = yytname[yytoken]; + if (!yypact_value_is_default (yyn)) + { /* Start YYX at -YYN if negative to avoid negative indexes in YYCHECK. In other words, skip the first -YYN actions for this state because they are default actions. */ int yyxbegin = yyn < 0 ? -yyn : 0; - /* Stay within bounds of both yycheck and yytname. */ int yychecklim = YYLAST - yyn + 1; int yyxend = yychecklim < YYNTOKENS ? yychecklim : YYNTOKENS; - /* Number of reported tokens (one for the "unexpected", one per - "expected"). */ - int yycount = 0; int yyx; - - yyarg[yycount++] = yytname[yytoken]; - for (yyx = yyxbegin; yyx < yyxend; ++yyx) if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR && !yytable_value_is_error (yytable[yyx + yyn])) @@ -1023,14 +1045,13 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, } yyarg[yycount++] = yytname[yyx]; yysize1 = yysize + yytnamerr (0, yytname[yyx]); - if (! (yysize <= yysize1 && yysize1 <= YYSTACK_ALLOC_MAXIMUM)) - { - /* Overflow. */ - *yymsg_alloc = 0; - return 2; - } + if (! 
(yysize <= yysize1 + && yysize1 <= YYSTACK_ALLOC_MAXIMUM)) + return 2; yysize = yysize1; } + } + } switch (yycount) { @@ -1038,6 +1059,7 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, case N: \ yyformat = S; \ break + YYCASE_(0, YY_("syntax error")); YYCASE_(1, YY_("syntax error, unexpected %s")); YYCASE_(2, YY_("syntax error, unexpected %s, expecting %s")); YYCASE_(3, YY_("syntax error, unexpected %s, expecting %s or %s")); @@ -1048,11 +1070,7 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, yysize1 = yysize + yystrlen (yyformat); if (! (yysize <= yysize1 && yysize1 <= YYSTACK_ALLOC_MAXIMUM)) - { - /* Overflow. */ - *yymsg_alloc = 0; - return 2; - } + return 2; yysize = yysize1; if (*yymsg_alloc < yysize) @@ -1061,7 +1079,7 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, if (! (yysize <= *yymsg_alloc && *yymsg_alloc <= YYSTACK_ALLOC_MAXIMUM)) *yymsg_alloc = YYSTACK_ALLOC_MAXIMUM; - return 2; + return 1; } /* Avoid sprintf, as that infringes on the user's name space. @@ -1084,7 +1102,6 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, } return 0; } -} #endif /* YYERROR_VERBOSE */ @@ -1533,7 +1550,7 @@ yyreduce: yyerrlab: /* Make sure we have latest lookahead translation. See comments at user semantic actions for why this is necessary. */ - yytoken = YYTRANSLATE (yychar); + yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar); /* If not already recovering from an error, report this error. 
*/ if (!yyerrstatus) @@ -1549,7 +1566,7 @@ yyerrlab: int yysyntax_error_status = YYSYNTAX_ERROR; if (yysyntax_error_status == 0) yymsgp = yymsg; - else if (yysyntax_error_status == 2 && 0 < yymsg_alloc) + else if (yysyntax_error_status == 1) { if (yymsg != yymsgbuf) YYSTACK_FREE (yymsg); @@ -1558,6 +1575,7 @@ yyerrlab: { yymsg = yymsgbuf; yymsg_alloc = sizeof yymsgbuf; + yysyntax_error_status = 2; } else { diff --git a/src/parse-gram.c b/src/parse-gram.c index e16106d2..a06602de 100644 --- a/src/parse-gram.c +++ b/src/parse-gram.c @@ -1446,20 +1446,13 @@ yytnamerr (char *yyres, const char *yystr) /* Copy into *YYMSG, which is of size *YYMSG_ALLOC, an error message about the unexpected token YYTOKEN while in state YYSTATE. - Return 0 if *YYMSG was successfully written. Return 1 if an ordinary - "syntax error" message will suffice instead. Return 2 if *YYMSG is - not large enough to hold the message. In the last case, also set - *YYMSG_ALLOC to either (a) the required number of bytes or (b) zero - if the required number of bytes is too large to store. */ + Return 0 if *YYMSG was successfully written. Return 1 if *YYMSG is + not large enough to hold the message. In that case, also set + *YYMSG_ALLOC to the required number of bytes. Return 2 if the + required number of bytes is too large to store. */ static int yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, int yystate, int yytoken) -{ - int yyn = yypact[yystate]; - - if (! (YYPACT_NINF < yyn && yyn <= YYLAST)) - return 1; - else { YYSIZE_T yysize0 = yytnamerr (0, yytname[yytoken]); YYSIZE_T yysize = yysize0; @@ -1469,22 +1462,51 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, const char *yyformat = 0; /* Arguments of yyformat. */ char const *yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM]; + /* Number of reported tokens (one for the "unexpected", one per + "expected"). */ + int yycount = 0; + /* There are many possibilities here to consider: + - Assume YYFAIL is not used. It's too flawed to consider. 
See + + for details. YYERROR is fine as it does not invoke this + function. + - If this state is a consistent state with a default action, then + the only way this function was invoked is if the default action + is an error action. In that case, don't check for expected + tokens because there are none. + - The only way there can be no lookahead present (in yychar) is if + this state is a consistent state with a default action. Thus, + detecting the absence of a lookahead is sufficient to determine + that there is no unexpected or expected token to report. In that + case, just report a simple "syntax error". + - Don't assume there isn't a lookahead just because this state is a + consistent state with a default action. There might have been a + previous inconsistent state, consistent state with a non-default + action, or user semantic action that manipulated yychar. + - Of course, the expected token list depends on states to have + correct lookahead information, and it depends on the parser not + to perform extra reductions after fetching a lookahead from the + scanner and before detecting a syntax error. Thus, state merging + (from LALR or IELR) and default reductions corrupt the expected + token list. However, the list is correct for canonical LR with + one exception: it will still contain any token that will not be + accepted due to an error action in a later state. + */ + if (yytoken != YYEMPTY) + { + int yyn = yypact[yystate]; + yyarg[yycount++] = yytname[yytoken]; + if (!yypact_value_is_default (yyn)) + { /* Start YYX at -YYN if negative to avoid negative indexes in YYCHECK. In other words, skip the first -YYN actions for this state because they are default actions. */ int yyxbegin = yyn < 0 ? -yyn : 0; - /* Stay within bounds of both yycheck and yytname. */ int yychecklim = YYLAST - yyn + 1; int yyxend = yychecklim < YYNTOKENS ? yychecklim : YYNTOKENS; - /* Number of reported tokens (one for the "unexpected", one per - "expected"). 
*/ - int yycount = 0; int yyx; - - yyarg[yycount++] = yytname[yytoken]; - for (yyx = yyxbegin; yyx < yyxend; ++yyx) if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR && !yytable_value_is_error (yytable[yyx + yyn])) @@ -1497,14 +1519,13 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, } yyarg[yycount++] = yytname[yyx]; yysize1 = yysize + yytnamerr (0, yytname[yyx]); - if (! (yysize <= yysize1 && yysize1 <= YYSTACK_ALLOC_MAXIMUM)) - { - /* Overflow. */ - *yymsg_alloc = 0; - return 2; - } + if (! (yysize <= yysize1 + && yysize1 <= YYSTACK_ALLOC_MAXIMUM)) + return 2; yysize = yysize1; } + } + } switch (yycount) { @@ -1512,6 +1533,7 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, case N: \ yyformat = S; \ break + YYCASE_(0, YY_("syntax error")); YYCASE_(1, YY_("syntax error, unexpected %s")); YYCASE_(2, YY_("syntax error, unexpected %s, expecting %s")); YYCASE_(3, YY_("syntax error, unexpected %s, expecting %s or %s")); @@ -1522,11 +1544,7 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, yysize1 = yysize + yystrlen (yyformat); if (! (yysize <= yysize1 && yysize1 <= YYSTACK_ALLOC_MAXIMUM)) - { - /* Overflow. */ - *yymsg_alloc = 0; - return 2; - } + return 2; yysize = yysize1; if (*yymsg_alloc < yysize) @@ -1535,7 +1553,7 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, if (! (yysize <= *yymsg_alloc && *yymsg_alloc <= YYSTACK_ALLOC_MAXIMUM)) *yymsg_alloc = YYSTACK_ALLOC_MAXIMUM; - return 2; + return 1; } /* Avoid sprintf, as that infringes on the user's name space. @@ -1558,7 +1576,6 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg, } return 0; } -} #endif /* YYERROR_VERBOSE */ @@ -1733,7 +1750,7 @@ YYLTYPE yylloc; /* User initialization code. 
*/ -/* Line 1279 of yacc.c */ +/* Line 1296 of yacc.c */ #line 86 "parse-gram.y" { /* Bison's grammar can initial empty locations, hence a default @@ -1742,8 +1759,8 @@ YYLTYPE yylloc; boundary_set (&yylloc.end, current_file, 1, 1); } -/* Line 1279 of yacc.c */ -#line 1747 "parse-gram.c" +/* Line 1296 of yacc.c */ +#line 1764 "parse-gram.c" yylsp[0] = yylloc; goto yysetstate; @@ -1930,7 +1947,7 @@ yyreduce: { case 6: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 225 "parse-gram.y" { code_props plain_code; @@ -1945,14 +1962,14 @@ yyreduce: case 7: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 234 "parse-gram.y" { debug_flag = true; } break; case 8: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 236 "parse-gram.y" { muscle_percent_define_insert ((yyvsp[(2) - (3)].uniqstr), (yylsp[(2) - (3)]), (yyvsp[(3) - (3)].chars), @@ -1962,14 +1979,14 @@ yyreduce: case 9: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 240 "parse-gram.y" { defines_flag = true; } break; case 10: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 242 "parse-gram.y" { defines_flag = true; @@ -1979,42 +1996,42 @@ yyreduce: case 11: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 246 "parse-gram.y" { error_verbose = true; } break; case 12: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 247 "parse-gram.y" { expected_sr_conflicts = (yyvsp[(2) - (2)].integer); } break; case 13: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 248 "parse-gram.y" { expected_rr_conflicts = (yyvsp[(2) - (2)].integer); } break; case 14: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 249 "parse-gram.y" { spec_file_prefix = (yyvsp[(2) - (2)].chars); } break; case 15: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 250 "parse-gram.y" { spec_file_prefix = (yyvsp[(3) - (3)].chars); } break; case 16: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 252 "parse-gram.y" { nondeterministic_parser = true; 
@@ -2024,7 +2041,7 @@ yyreduce: case 17: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 257 "parse-gram.y" { code_props action; @@ -2038,77 +2055,77 @@ yyreduce: case 18: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 265 "parse-gram.y" { language_argmatch ((yyvsp[(2) - (2)].chars), grammar_prio, (yylsp[(1) - (2)])); } break; case 19: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 266 "parse-gram.y" { add_param ("lex_param", (yyvsp[(2) - (2)].code), (yylsp[(2) - (2)])); } break; case 20: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 267 "parse-gram.y" { locations_flag = true; } break; case 21: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 268 "parse-gram.y" { spec_name_prefix = (yyvsp[(2) - (2)].chars); } break; case 22: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 269 "parse-gram.y" { spec_name_prefix = (yyvsp[(3) - (3)].chars); } break; case 23: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 270 "parse-gram.y" { no_lines_flag = true; } break; case 24: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 271 "parse-gram.y" { nondeterministic_parser = true; } break; case 25: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 272 "parse-gram.y" { spec_outfile = (yyvsp[(2) - (2)].chars); } break; case 26: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 273 "parse-gram.y" { spec_outfile = (yyvsp[(3) - (3)].chars); } break; case 27: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 274 "parse-gram.y" { add_param ("parse_param", (yyvsp[(2) - (2)].code), (yylsp[(2) - (2)])); } break; case 28: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 276 "parse-gram.y" { /* %pure-parser is deprecated in favor of `%define api.pure', so use @@ -2128,14 +2145,14 @@ yyreduce: case 29: -/* Line 1492 of yacc.c */ +/* Line 1509 of yacc.c */ #line 290 "parse-gram.y" { version_check (&(yylsp[(2) - (2)]), (yyvsp[(2) - (2)].chars)); } break; case 30: 
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 292 "parse-gram.y"
     {
       char const *skeleton_user = (yyvsp[(2) - (2)].chars);
@@ -2164,28 +2181,28 @@ yyreduce:
   case 31:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 315 "parse-gram.y"
     { token_table_flag = true; }
     break;
   case 32:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 316 "parse-gram.y"
     { report_flag |= report_states; }
     break;
   case 33:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 317 "parse-gram.y"
     { yacc_flag = true; }
     break;
   case 37:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 325 "parse-gram.y"
     {
       grammar_start_symbol_set ((yyvsp[(2) - (2)].symbol), (yylsp[(2) - (2)]));
@@ -2194,7 +2211,7 @@ yyreduce:
   case 38:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 329 "parse-gram.y"
     {
       symbol_list *list;
@@ -2206,7 +2223,7 @@ yyreduce:
   case 39:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 336 "parse-gram.y"
     {
       symbol_list *list;
@@ -2218,7 +2235,7 @@ yyreduce:
   case 40:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 343 "parse-gram.y"
     {
       default_prec = true;
@@ -2227,7 +2244,7 @@ yyreduce:
   case 41:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 347 "parse-gram.y"
     {
       default_prec = false;
@@ -2236,7 +2253,7 @@ yyreduce:
   case 42:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 351 "parse-gram.y"
     {
       /* Do not invoke muscle_percent_code_grow here since it invokes
@@ -2248,7 +2265,7 @@ yyreduce:
   case 43:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 358 "parse-gram.y"
     {
       muscle_percent_code_grow ((yyvsp[(2) - (3)].uniqstr), (yylsp[(2) - (3)]), (yyvsp[(3) - (3)].chars), (yylsp[(3) - (3)]));
@@ -2258,21 +2275,21 @@ yyreduce:
   case 44:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 372 "parse-gram.y"
     {}
     break;
   case 45:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 373 "parse-gram.y"
     { muscle_code_grow ("union_name", (yyvsp[(1) - (1)].uniqstr), (yylsp[(1) -
(1)])); }
     break;
   case 46:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 378 "parse-gram.y"
     {
       union_seen = true;
@@ -2283,14 +2300,14 @@ yyreduce:
   case 47:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 389 "parse-gram.y"
     { current_class = nterm_sym; }
     break;
   case 48:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 390 "parse-gram.y"
     {
       current_class = unknown_sym;
@@ -2300,14 +2317,14 @@ yyreduce:
   case 49:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 394 "parse-gram.y"
     { current_class = token_sym; }
     break;
   case 50:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 395 "parse-gram.y"
     {
       current_class = unknown_sym;
@@ -2317,7 +2334,7 @@ yyreduce:
   case 51:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 400 "parse-gram.y"
     {
       symbol_list *list;
@@ -2330,7 +2347,7 @@ yyreduce:
   case 52:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 411 "parse-gram.y"
     {
       symbol_list *list;
@@ -2347,126 +2364,126 @@ yyreduce:
   case 53:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 425 "parse-gram.y"
     { (yyval.assoc) = left_assoc; }
     break;
   case 54:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 426 "parse-gram.y"
     { (yyval.assoc) = right_assoc; }
     break;
   case 55:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 427 "parse-gram.y"
     { (yyval.assoc) = non_assoc; }
     break;
   case 56:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 431 "parse-gram.y"
     { current_type = NULL; }
     break;
   case 57:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 432 "parse-gram.y"
     { current_type = (yyvsp[(1) - (1)].uniqstr); tag_seen = true; }
     break;
   case 58:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 438 "parse-gram.y"
     { (yyval.list) = symbol_list_sym_new ((yyvsp[(1) - (1)].symbol), (yylsp[(1) - (1)])); }
     break;
   case 59:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 440 "parse-gram.y"
     { (yyval.list) = symbol_list_prepend ((yyvsp[(1) - (2)].list),
symbol_list_sym_new ((yyvsp[(2) - (2)].symbol), (yylsp[(2) - (2)]))); }
     break;
   case 60:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 444 "parse-gram.y"
     { (yyval.symbol) = (yyvsp[(1) - (1)].symbol); }
     break;
   case 61:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 445 "parse-gram.y"
     {
       (yyval.symbol) = (yyvsp[(1) - (2)].symbol);
       symbol_user_token_number_set ((yyvsp[(1) - (2)].symbol), (yyvsp[(2) - (2)].integer), (yylsp[(2) - (2)]));
     }
     break;
   case 62:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 451 "parse-gram.y"
     { (yyval.list) = symbol_list_sym_new ((yyvsp[(1) - (1)].symbol), (yylsp[(1) - (1)])); }
     break;
   case 63:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 453 "parse-gram.y"
     { (yyval.list) = symbol_list_prepend ((yyvsp[(1) - (2)].list), symbol_list_sym_new ((yyvsp[(2) - (2)].symbol), (yylsp[(2) - (2)]))); }
     break;
   case 64:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 457 "parse-gram.y"
     { (yyval.list) = (yyvsp[(1) - (1)].list); }
     break;
   case 65:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 458 "parse-gram.y"
     { (yyval.list) = symbol_list_prepend ((yyvsp[(1) - (2)].list), (yyvsp[(2) - (2)].list)); }
     break;
   case 66:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 462 "parse-gram.y"
     { (yyval.list) = symbol_list_sym_new ((yyvsp[(1) - (1)].symbol), (yylsp[(1) - (1)])); }
     break;
   case 67:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 463 "parse-gram.y"
     { (yyval.list) = symbol_list_type_new ((yyvsp[(1) - (1)].uniqstr), (yylsp[(1) - (1)])); }
     break;
   case 68:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 464 "parse-gram.y"
     { (yyval.list) = symbol_list_default_tagged_new ((yylsp[(1) - (1)])); }
     break;
   case 69:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 465 "parse-gram.y"
     { (yyval.list) = symbol_list_default_tagless_new ((yylsp[(1) - (1)])); }
     break;
   case 70:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 471 "parse-gram.y"
     {
       current_type = (yyvsp[(1) - (1)].uniqstr);
@@ -2476,7 +2493,7 @@ yyreduce:
   case 71:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 476 "parse-gram.y"
     {
       symbol_class_set ((yyvsp[(1) - (1)].symbol), current_class, (yylsp[(1) - (1)]), true);
@@ -2486,7 +2503,7 @@ yyreduce:
   case 72:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 481 "parse-gram.y"
     {
       symbol_class_set ((yyvsp[(1) - (2)].symbol), current_class, (yylsp[(1) - (2)]), true);
@@ -2497,7 +2514,7 @@ yyreduce:
   case 73:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 487 "parse-gram.y"
     {
       symbol_class_set ((yyvsp[(1) - (2)].symbol), current_class, (yylsp[(1) - (2)]), true);
@@ -2508,7 +2525,7 @@ yyreduce:
   case 74:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 493 "parse-gram.y"
     {
       symbol_class_set ((yyvsp[(1) - (3)].symbol), current_class, (yylsp[(1) - (3)]), true);
@@ -2520,7 +2537,7 @@ yyreduce:
   case 81:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 523 "parse-gram.y"
     {
       yyerrok;
@@ -2529,7 +2546,7 @@ yyreduce:
   case 82:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 529 "parse-gram.y"
     { current_lhs = (yyvsp[(1) - (2)].symbol); current_lhs_location = (yylsp[(1) - (2)]);
     current_lhs_named_ref = (yyvsp[(2) - (2)].named_ref); }
@@ -2537,21 +2554,21 @@ yyreduce:
   case 84:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 534 "parse-gram.y"
     { grammar_current_rule_end ((yylsp[(1) - (1)])); }
     break;
   case 85:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 535 "parse-gram.y"
     { grammar_current_rule_end ((yylsp[(3) - (3)])); }
     break;
   case 87:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 541 "parse-gram.y"
     { grammar_current_rule_begin (current_lhs, current_lhs_location,
     current_lhs_named_ref); }
@@ -2559,77 +2576,77 @@ yyreduce:
   case 88:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 544 "parse-gram.y"
     { grammar_current_rule_symbol_append ((yyvsp[(2) - (3)].symbol), (yylsp[(2) - (3)]), (yyvsp[(3) -
(3)].named_ref)); }
     break;
   case 89:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 546 "parse-gram.y"
     { grammar_current_rule_action_append ((yyvsp[(2) - (3)].code), (yylsp[(2) - (3)]), (yyvsp[(3) - (3)].named_ref)); }
     break;
   case 90:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 548 "parse-gram.y"
     { grammar_current_rule_prec_set ((yyvsp[(3) - (3)].symbol), (yylsp[(3) - (3)])); }
     break;
   case 91:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 550 "parse-gram.y"
     { grammar_current_rule_dprec_set ((yyvsp[(3) - (3)].integer), (yylsp[(3) - (3)])); }
     break;
   case 92:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 552 "parse-gram.y"
     { grammar_current_rule_merge_set ((yyvsp[(3) - (3)].uniqstr), (yylsp[(3) - (3)])); }
     break;
   case 93:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 556 "parse-gram.y"
     { (yyval.named_ref) = 0; }
     break;
   case 94:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 558 "parse-gram.y"
     { (yyval.named_ref) = named_ref_new((yyvsp[(1) - (1)].uniqstr), (yylsp[(1) - (1)])); }
     break;
   case 96:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 570 "parse-gram.y"
     { (yyval.uniqstr) = uniqstr_new ((yyvsp[(1) - (1)].chars)); }
     break;
   case 97:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 575 "parse-gram.y"
     { (yyval.chars) = ""; }
     break;
   case 98:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 576 "parse-gram.y"
     { (yyval.chars) = (yyvsp[(1) - (1)].uniqstr); }
     break;
   case 100:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 587 "parse-gram.y"
     {
       code_props plain_code;
@@ -2643,14 +2660,14 @@ yyreduce:
   case 101:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 607 "parse-gram.y"
     { (yyval.symbol) = symbol_from_uniqstr ((yyvsp[(1) - (1)].uniqstr), (yylsp[(1) - (1)])); }
     break;
   case 102:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 609 "parse-gram.y"
     {
       (yyval.symbol) = symbol_get (char_name ((yyvsp[(1) - (1)].character)),
(yylsp[(1) - (1)]));
@@ -2661,14 +2678,14 @@ yyreduce:
   case 103:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 617 "parse-gram.y"
     { (yyval.symbol) = symbol_from_uniqstr ((yyvsp[(1) - (1)].uniqstr), (yylsp[(1) - (1)])); }
     break;
   case 106:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 629 "parse-gram.y"
     {
       (yyval.symbol) = symbol_get (quotearg_style (c_quoting_style, (yyvsp[(1) - (1)].chars)), (yylsp[(1) - (1)]));
@@ -2678,7 +2695,7 @@ yyreduce:
   case 108:
-/* Line 1492 of yacc.c */
+/* Line 1509 of yacc.c */
 #line 638 "parse-gram.y"
     {
       code_props plain_code;
@@ -2692,8 +2709,8 @@ yyreduce:
-/* Line 1492 of yacc.c */
-#line 2697 "parse-gram.c"
+/* Line 1509 of yacc.c */
+#line 2714 "parse-gram.c"
       default: break;
     }
   /* User semantic actions sometimes alter yychar, and that requires
@@ -2737,7 +2754,7 @@ yyreduce:
 yyerrlab:
   /* Make sure we have latest lookahead translation. See comments at
      user semantic actions for why this is necessary. */
-  yytoken = YYTRANSLATE (yychar);
+  yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
   /* If not already recovering from an error, report this error.
*/
   if (!yyerrstatus)
@@ -2753,7 +2770,7 @@ yyerrlab:
         int yysyntax_error_status = YYSYNTAX_ERROR;
         if (yysyntax_error_status == 0)
           yymsgp = yymsg;
-        else if (yysyntax_error_status == 2 && 0 < yymsg_alloc)
+        else if (yysyntax_error_status == 1)
           {
             if (yymsg != yymsgbuf)
               YYSTACK_FREE (yymsg);
@@ -2762,6 +2779,7 @@ yyerrlab:
               {
                 yymsg = yymsgbuf;
                 yymsg_alloc = sizeof yymsgbuf;
+                yysyntax_error_status = 2;
               }
             else
               {
@@ -2928,7 +2946,7 @@ yyreturn:
-/* Line 1729 of yacc.c */
+/* Line 1747 of yacc.c */
 #line 648 "parse-gram.y"
diff --git a/src/parse-gram.h b/src/parse-gram.h
index 34d0c187..eba94171 100644
--- a/src/parse-gram.h
+++ b/src/parse-gram.h
@@ -161,7 +161,7 @@ typedef union YYSTYPE
 {
-/* Line 1730 of yacc.c */
+/* Line 1748 of yacc.c */
 #line 94 "parse-gram.y"
   symbol *symbol;
@@ -176,7 +176,7 @@ typedef union YYSTYPE
-/* Line 1730 of yacc.c */
+/* Line 1748 of yacc.c */
 #line 181 "parse-gram.h"
 } YYSTYPE;
 # define YYSTYPE_IS_TRIVIAL 1
diff --git a/tests/conflicts.at b/tests/conflicts.at
index c6671de7..1d4eb04e 100644
--- a/tests/conflicts.at
+++ b/tests/conflicts.at
@@ -147,15 +147,30 @@ AT_SETUP([[%error-verbose and consistent errors]])
 m4_pushdef([AT_CONSISTENT_ERRORS_CHECK], [
-AT_DATA_GRAMMAR([input.y],
-[[%code {
+AT_BISON_OPTION_PUSHDEFS([$1])
+
+m4_pushdef([AT_YYLEX_PROTOTYPE],
+[AT_SKEL_CC_IF([[int yylex (yy::parser::semantic_type *lvalp)]],
+               [[int yylex (YYSTYPE *lvalp)]])])
+
+AT_SKEL_JAVA_IF([AT_DATA], [AT_DATA_GRAMMAR])([input.y],
+[AT_SKEL_JAVA_IF([[
+
+%code imports {
+  import java.io.IOException;
+}]], [[
+
+%code {]AT_SKEL_CC_IF([[
+  #include ]], [[
   #include
   #include
-  int yylex (void);
-  void yyerror (char const *);
+  void yyerror (char const *msg);]])[
+  ]AT_YYLEX_PROTOTYPE[;
   #define USE(Var)
 }
+]AT_SKEL_CC_IF([[%defines]], [[%define api.pure]])])[
+
 ]$1[
 %error-verbose
@@ -164,63 +179,193 @@ AT_DATA_GRAMMAR([input.y],
 ]$2[
-%%
+]AT_SKEL_JAVA_IF([[%code lexer {]], [[%%]])[
-int
-yylex (void)
+/*--------.
+| yylex.
|
+`--------*/]AT_SKEL_JAVA_IF([[
+
+public String input = "]$3[";
+public int index = 0;
+public int yylex ()
+{
+  if (index < input.length ())
+    return input.charAt (index++);
+  else
+    return 0;
+}
+public Object getLVal ()
+{
+  return new Integer(1);
+}]], [[
+
+]AT_YYLEX_PROTOTYPE[
 {
   static char const *input = "]$3[";
-  yylval = 1;
+  *lvalp = 1;
   return *input++;
+}]])[
+
+/*----------.
+| yyerror. |
+`----------*/]AT_SKEL_JAVA_IF([[
+
+public void yyerror (String msg)
+{
+  System.err.println (msg);
 }
+};
+
+%%]], [AT_SKEL_CC_IF([[
+
+void
+yy::parser::error (const yy::location &, std::string const &msg)
+{
+  std::cerr << msg << std::endl;
+}]], [[
+
 void
 yyerror (char const *msg)
 {
   fprintf (stderr, "%s\n", msg);
-}
+}]])])[
+
+/*-------.
+| main. |
+`-------*/]AT_SKEL_JAVA_IF([[
+
+class input
+{
+  public static void main (String args[]) throws IOException
+  {
+    YYParser p = new YYParser ();
+    p.parse ();
+  }
+}]], [AT_SKEL_CC_IF([[
+
+int
+main (void)
+{
+  yy::parser parser;
+  return parser.parse ();
+}]], [[
 int
 main (void)
 {
   return yyparse ();
-}
+}]])])[
 ]])
-AT_BISON_CHECK([[-o input.c input.y]])
-AT_COMPILE([[input]])
+
+AT_FULL_COMPILE([[input]])
 m4_pushdef([AT_EXPECTING], [m4_if($5, [ab], [[, expecting 'a' or 'b']],
                                   $5, [a], [[, expecting 'a']],
                                   $5, [b], [[, expecting 'b']])])
-AT_PARSER_CHECK([[./input]], [[1]], [[]],
+AT_SKEL_JAVA_IF([AT_JAVA_PARSER_CHECK([[input]], [[0]]],
+                [AT_PARSER_CHECK([[./input]], [[1]]]),
+[[]],
 [[syntax error, unexpected ]$4[]AT_EXPECTING[
 ]])
 m4_popdef([AT_EXPECTING])
+m4_popdef([AT_YYLEX_PROTOTYPE])
+AT_BISON_OPTION_POPDEFS
 ])
+m4_pushdef([AT_PREVIOUS_STATE_GRAMMAR],
+[[%nonassoc 'a';
+
+start: consistent-error-on-a-a 'a' ;
+
+consistent-error-on-a-a:
+    'a' default-reduction
+  | 'a' default-reduction 'a'
+  | 'a' shift
+  ;
+
+default-reduction: /*empty*/ ;
+shift: 'b' ;
+
+// Provide another context in which all rules are useful so that this
+// test case looks a little more realistic.
+start: 'b' consistent-error-on-a-a 'c' ;
+]])
+
+m4_pushdef([AT_PREVIOUS_STATE_INPUT], [[a]])
+
+# Unfortunately, no expected tokens are reported even though 'b' can be
+# accepted. Nevertheless, the main point of this test is to make sure
+# that at least the unexpected token is reported. In a previous version
+# of Bison, it wasn't reported because the error is detected in a
+# consistent state with an error action, and that case always triggered
+# the simple "syntax error" message.
+#
+# The point isn't to test IELR here, but state merging happens to
+# complicate this example.
+AT_CONSISTENT_ERRORS_CHECK([[%define lr.type ielr]],
+                           [AT_PREVIOUS_STATE_GRAMMAR],
+                           [AT_PREVIOUS_STATE_INPUT],
+                           [[$end]], [[none]])
+AT_CONSISTENT_ERRORS_CHECK([[%define lr.type ielr
+                             %glr-parser]],
+                           [AT_PREVIOUS_STATE_GRAMMAR],
+                           [AT_PREVIOUS_STATE_INPUT],
+                           [[$end]], [[none]])
+AT_CONSISTENT_ERRORS_CHECK([[%define lr.type ielr
+                             %language "c++"]],
+                           [AT_PREVIOUS_STATE_GRAMMAR],
+                           [AT_PREVIOUS_STATE_INPUT],
+                           [[$end]], [[none]])
+AT_CONSISTENT_ERRORS_CHECK([[%define lr.type ielr
+                             %language "java"]],
+                           [AT_PREVIOUS_STATE_GRAMMAR],
+                           [AT_PREVIOUS_STATE_INPUT],
+                           [[end of input]], [[none]])
+
+# Even canonical LR doesn't foresee the error for 'a'!
+AT_CONSISTENT_ERRORS_CHECK([[%define lr.type ielr
+                             %define lr.default-reductions consistent]],
+                           [AT_PREVIOUS_STATE_GRAMMAR],
+                           [AT_PREVIOUS_STATE_INPUT],
+                           [[$end]], [[ab]])
+AT_CONSISTENT_ERRORS_CHECK([[%define lr.type ielr
+                             %define lr.default-reductions accepting]],
+                           [AT_PREVIOUS_STATE_GRAMMAR],
+                           [AT_PREVIOUS_STATE_INPUT],
+                           [[$end]], [[ab]])
+AT_CONSISTENT_ERRORS_CHECK([[%define lr.type canonical-lr]],
+                           [AT_PREVIOUS_STATE_GRAMMAR],
+                           [AT_PREVIOUS_STATE_INPUT],
+                           [[$end]], [[ab]])
+
+m4_popdef([AT_PREVIOUS_STATE_GRAMMAR])
+m4_popdef([AT_PREVIOUS_STATE_INPUT])
+
 m4_pushdef([AT_USER_ACTION_GRAMMAR],
 [[%nonassoc 'a';
-// If yylval=0 here, then we know that the 'a' destructor is being
-// invoked incorrectly for the 'b' set in the semantic action below.
-// All 'a' tokens are returned by yylex, which sets yylval=1.
+// If $$ = 0 here, then we know that the 'a' destructor is being invoked
+// incorrectly for the 'b' set in the semantic action below. All 'a'
+// tokens are returned by yylex, which sets $$ = 1.
 %destructor {
   if (!$$)
     fprintf (stderr, "Wrong destructor.\n");
 } 'a';
-// The lookahead assigned by the semantic action isn't needed before
-// either error action is encountered. In a previous version of Bison,
-// this was a problem as it meant yychar was not translated into yytoken
-// before either error action. The second error action thus invoked a
+// Rather than depend on an inconsistent state to induce reading a
+// lookahead as in the previous grammar, just assign the lookahead in a
+// semantic action. That lookahead isn't needed before either error
+// action is encountered. In a previous version of Bison, this was a
+// problem as it meant yychar was not translated into yytoken before
+// either error action. The second error action thus invoked a
 // destructor that it selected according to the incorrect yytoken.
The
 // first error action would have reported an incorrect unexpected token
-// except that, due to another bug, the unexpected token is not reported
-// at all because the error action is the default action in a consistent
-// state. That bug still needs to be fixed.
+// except that, due to the bug described in the previous grammar, the
+// unexpected token was not reported at all.
 start: error-reduce consistent-error 'a' { USE ($][3); } ;
 error-reduce:
@@ -247,13 +392,16 @@ start: 'b' consistent-error 'b' ;
 ]])
 m4_pushdef([AT_USER_ACTION_INPUT], [[aa]])
-# See comments in grammar for why this test doesn't succeed.
-AT_XFAIL_IF([[:]])
-
 AT_CONSISTENT_ERRORS_CHECK([[]],
                            [AT_USER_ACTION_GRAMMAR],
                            [AT_USER_ACTION_INPUT],
                            [['b']], [[none]])
+AT_CONSISTENT_ERRORS_CHECK([[%glr-parser]],
+                           [AT_USER_ACTION_GRAMMAR],
+                           [AT_USER_ACTION_INPUT],
+                           [['b']], [[none]])
+# No C++ or Java test because yychar cannot be manipulated by users.
+
 AT_CONSISTENT_ERRORS_CHECK([[%define lr.default-reductions consistent]],
                            [AT_USER_ACTION_GRAMMAR],
                            [AT_USER_ACTION_INPUT],
diff --git a/tests/java.at b/tests/java.at
index e35588f0..577a84a2 100644
--- a/tests/java.at
+++ b/tests/java.at
@@ -219,24 +219,6 @@ m4_define([AT_DATA_JAVA_CALC_Y],
 ])
-
-# AT_JAVA_COMPILE(SOURCE)
-# -----------------------
-# Compile SOURCES into Java class files. Skip the test if java or javac is
-# not installed.
-m4_define([AT_JAVA_COMPILE],
-[AT_CHECK([test -n "$CONF_JAVA" || exit 77
-test -n "$CONF_JAVAC" || exit 77])
-AT_CHECK([$SHELL ../../../javacomp.sh $1],
-         0, [ignore], [ignore])])
-
-
-# AT_JAVA_PARSER_CHECK(COMMAND, EXIT-STATUS, EXPOUT, EXPERR, [PRE])
-# -----------------------------------------------------------------
-m4_define([AT_JAVA_PARSER_CHECK],
-[AT_CHECK([$5 $SHELL ../../../javaexec.sh $1], [$2], [$3], [$4])])
-
-
 # _AT_CHECK_JAVA_CALC_ERROR(BISON-OPTIONS, INPUT,
 #                           [VERBOSE-AND-LOCATED-ERROR-MESSAGE])
 # ---------------------------------------------------------
diff --git a/tests/local.at b/tests/local.at
index 0e39479a..7ad75c3d 100644
--- a/tests/local.at
+++ b/tests/local.at
@@ -80,6 +80,8 @@ m4_pushdef([AT_DEFINES_IF],
 [m4_bmatch([$3], [%defines], [$1], [$2])])
 m4_pushdef([AT_SKEL_CC_IF],
 [m4_bmatch([$3], [%language "[Cc]\+\+"\|%skeleton "[a-z0-9]+\.cc"], [$1], [$2])])
+m4_pushdef([AT_SKEL_JAVA_IF],
+[m4_bmatch([$3], [%language "[Jj][Aa][Vv][Aa]"\|%skeleton "[a-z0-9]+\.java"], [$1], [$2])])
 m4_pushdef([AT_GLR_IF],
 [m4_bmatch([$3], [%glr-parser\|%skeleton "glr\.], [$1], [$2])])
 m4_pushdef([AT_LALR1_CC_IF],
@@ -184,6 +186,7 @@ m4_popdef([AT_LEXPARAM_IF])
 m4_popdef([AT_YACC_IF])
 m4_popdef([AT_GLR_IF])
 m4_popdef([AT_SKEL_CC_IF])
+m4_popdef([AT_SKEL_JAVA_IF])
 m4_popdef([AT_GLR_CC_IF])
 m4_popdef([AT_LALR1_CC_IF])
 m4_popdef([AT_DEFINES_IF])
@@ -399,19 +402,38 @@ AT_CHECK([$BISON_CXX_WORKS], 0, ignore, ignore)
 AT_CHECK([$CXX $CXXFLAGS $CPPFLAGS m4_bmatch([$1], [[.]], [], [$LDFLAGS ])-o $1 m4_default([$2], [$1.cc])[]m4_bmatch([$1], [[.]], [], [ $LIBS])],
          0, [ignore], [ignore])])
+# AT_JAVA_COMPILE(SOURCES)
+# ------------------------
+# Compile SOURCES into Java class files. Skip the test if java or javac
+# is not installed.
+m4_define([AT_JAVA_COMPILE],
+[AT_KEYWORDS(java)
+AT_CHECK([[test -n "$CONF_JAVA" || exit 77
+           test -n "$CONF_JAVAC" || exit 77]])
+AT_CHECK([[$SHELL ../../../javacomp.sh ]$1],
+         [[0]], [ignore], [ignore])])
 # AT_FULL_COMPILE(OUTPUT, [OTHER])
 # --------------------------------
-# Compile OUTPUT.y to OUTPUT.c or OUTPUT.cc, and compile it to OUTPUT.
-# If OTHER is specified, compile OUTPUT-OTHER.c or OUTPUT-OTHER.cc to OUTPUT
-# along with it.
-# Relies on AT_SKEL_CC_IF.
-m4_define([AT_FULL_COMPILE],
-[AT_SKEL_CC_IF(
-  [AT_BISON_CHECK([-o $1.cc $1.y])
-   AT_COMPILE_CXX([$1]m4_ifval($2, [, [$1.cc $1-$2.cc]]))],
-  [AT_BISON_CHECK([-o $1.c $1.y])
-   AT_COMPILE([$1]m4_ifval($2, [, [$1.c $1-$2.c]]))])
+# Compile OUTPUT.y to OUTPUT.c, OUTPUT.cc, or OUTPUT.java, and then
+# compile it to OUTPUT or OUTPUT.class. If OTHER is specified, compile
+# OUTPUT-OTHER.c, OUTPUT-OTHER.cc, or OUTPUT-OTHER.java to OUTPUT or
+# OUTPUT.java along with it. Relies on AT_SKEL_CC_IF and
+# AT_SKEL_JAVA_IF.
+m4_define([AT_FULL_COMPILE], [
+  AT_SKEL_JAVA_IF([
+    AT_BISON_CHECK([[-o ]$1[.java ]$1[.y]])
+    AT_JAVA_COMPILE([$1[.java]]m4_ifval($2,
+      [[$1[.java ]$1[-]$2[.java]]]))
+  ], [
+    AT_SKEL_CC_IF([
+      AT_BISON_CHECK([[-o ]$1[.cc ]$1[.y]])
+      AT_COMPILE_CXX([$1]m4_ifval($2, [, [$1[.cc ]$1[-]$2[.cc]]]))
+    ], [
+      AT_BISON_CHECK([[-o ]$1[.c ]$1[.y]])
+      AT_COMPILE([$1]m4_ifval($2, [, [$1[.c ]$1[-]$2[.c]]]))
+    ])
+  ])
 ])
@@ -425,6 +447,11 @@ m4_define([AT_FULL_COMPILE],
 m4_define([AT_PARSER_CHECK],
 [AT_CHECK([$5 $PREPARSER $1], [$2], [$3], [$4])])
+# AT_JAVA_PARSER_CHECK(COMMAND, EXIT-STATUS, EXPOUT, EXPERR, [PRE])
+# -----------------------------------------------------------------
+m4_define([AT_JAVA_PARSER_CHECK],
+[AT_CHECK([$5[ $SHELL ../../../javaexec.sh ]$1], [$2], [$3], [$4])])
+
 # AT_TEST_TABLES_AND_PARSE(TITLE, COND-VALUE, TEST-SPEC,
 #                          DECLS, GRAMMAR, INPUT,
 #                          BISON-STDERR, TABLES-OR-LAST-STATE,