X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/052826fdd1832fac13c0e7dcb150154cfb22db4f..231897ad21e478b0b0821636aa49fe49ca0f3301:/doc/bison.texinfo diff --git a/doc/bison.texinfo b/doc/bison.texinfo index 03da6c8f..3ed6b7f5 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -284,6 +284,8 @@ Invoking Bison Frequently Asked Questions * Parser Stack Overflow:: Breaking the Stack Limits +* How Can I Reset @code{yyparse}:: @code{yyparse} Keeps some State +* Strings are Destroyed:: @code{yylval} Loses Track of Strings Copying This Manual @@ -6352,6 +6354,8 @@ are addressed. @menu * Parser Stack Overflow:: Breaking the Stack Limits +* How Can I Reset @code{yyparse}:: @code{yyparse} Keeps some State +* Strings are Destroyed:: @code{yylval} Loses Track of Strings @end menu @node Parser Stack Overflow @@ -6365,6 +6369,148 @@ message. What can I do? This question is already addressed elsewhere, @xref{Recursion, ,Recursive Rules}. +@node How Can I Reset @code{yyparse} +@section How Can I Reset @code{yyparse} + +The following phenomenon gives raise to several incarnations, +resulting in the following typical questions: + +@display +I invoke @code{yyparse} several times, and on correct input it works +properly; but when a parse error is found, all the other calls fail +too. How can I reset @code{yyparse}'s error flag? +@end display + +@noindent +or + +@display +My parser includes support for a @samp{#include} like feature, in +which case I run @code{yyparse} from @code{yyparse}. This fails +although I did specify I needed a @code{%pure-parser}. +@end display + +These problems are not related to Bison itself, but with the Lex +generated scanners. Because these scanners use large buffers for +speed, they might not notice a change of input file. As a +demonstration, consider the following source file, +@file{first-line.l}: + +@verbatim +%{ +#include +#include +%} +%% +.*\n ECHO; return 1; +%% +int +yyparse (const char *file) +{ + yyin = fopen (file, "r"); + if (!yyin) + exit (2); + /* One token only. */ + yylex (); + if (!fclose (yyin)) + exit (3); + return 0; +} + +int +main () +{ + yyparse ("input"); + yyparse ("input"); + return 0; +} +@end verbatim + +@noindent +If the file @file{input} contains + +@verbatim +input:1: Hello, +input:2: World! +@end verbatim + +@noindent +then instead of getting twice the first line, you get: + +@example +$ @kbd{flex -ofirst-line.c first-line.l} +$ @kbd{gcc -ofirst-line first-line.c -ll} +$ @kbd{./first-line} +input:1: Hello, +input:2: World! +@end example + +Therefore, whenever you change @code{yyin}, you must tell the Lex +generated scanner to discard its current buffer, and to switch to the +new one. This depends upon your implementation of Lex, see its +documentation for more. For instance, in the case of Flex, a simple +call @samp{yyrestart (yyin)} suffices after each change to +@code{yyin}. + +@node Strings are Destroyed +@section Strings are Destroyed + +@display +My parser seems to destroy old strings, or maybe it losses track of +them. Instead of reporting @samp{"foo", "bar"}, it reports +@samp{"bar", "bar"}, or even @samp{"foo\nbar", "bar"}. +@end display + +This error is probably the single most frequent ``bug report'' sent to +Bison lists, but is only concerned with a misunderstanding of the role +of scanner. Consider the following Lex code: + +@verbatim +%{ +#include +char *yylval = NULL; +%} +%% +.* yylval = yytext; return 1; +\n /* IGNORE */ +%% +int +main () +{ + /* Similar to using $1, $2 in a Bison action. */ + char *fst = (yylex (), yylval); + char *snd = (yylex (), yylval); + printf ("\"%s\", \"%s\"\n", fst, snd); + return 0; +} +@end verbatim + +If you compile and run this code, you get: + +@example +$ @kbd{flex -osplit-lines.c split-lines.l} +$ @kbd{gcc -osplit-lines split-lines.c -ll} +$ @kbd{printf 'one\ntwo\n' | ./split-lines} +"one +two", "two" +@end example + +@noindent +this is because @code{yytext} is a buffer provided for @emph{reading} +in the action, but if you want to keep it, you have to duplicate it +(e.g., using @code{strdup}). Note that the output may depend on how +your implementation of Lex handles @code{yytext}. For instance, when +given the Lex compatibility option @option{-l} (which triggers the +option @samp{%array}) Flex generates a different behavior: + +@example +$ @kbd{flex -l -osplit-lines.c split-lines.l} +$ @kbd{gcc -osplit-lines split-lines.c -ll} +$ @kbd{printf 'one\ntwo\n' | ./split-lines} +"two", "two" +@end example + + @c ================================================= Table of Symbols @node Table of Symbols