X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/3c5362b825a9d01eafe257943b7faad92ea43a05..dfe292c695e1b6dcc7bc7257b59b98be960faff8:/TODO

diff --git a/TODO b/TODO
index 7f4c5148..21ef4b91 100644
--- a/TODO
+++ b/TODO
@@ -1,31 +1,11 @@
--*- outline -*-
-
 * Short term
-** Use syntax_error from the scanner?
-This would provide a means to raise syntax error from function called
-from the scanner. Actually, there is no good solution to report a
-lexical error in general. Usually they are kept at the scanner level
-only, ignoring the guilty token. But that might not be the best bet,
-since we don't benefit from the syntactic error recovery.
-
-We still have the possibility to return an invalid token number, which
-does the trick. But then the error message from the parser is poor
-(something like "unexpected $undefined"). Since the scanner probably
-already reported the error, we should directly enter error-recovery,
-without reporting the error message (i.e., YYERROR's semantics).
-
-Back to lalr1.cc (whose name is now quite unfortunate, since it also
-covers lr and ielr), if we support exceptions from yylex, should we
-propose a lexical_error in addition to syntax_error? Should they have
-a common root, say parse_error? Should syntax_error be renamed
-syntactic_error for consistency with lexical_error?
-
 ** Variable names.
 What should we name `variant' and `lex_symbol'?
 
 ** Use b4_symbol in all the skeleton
-Then remove the older system, including the tables generated by
-output.c
+Move its definition in the more standard places and deploy it in other
+skeletons. Then remove the older system, including the tables
+generated by output.c
 
 ** Update the documentation on gnu.org
 
@@ -58,17 +38,10 @@ as lr0.cc, why upper case?
 ** bench several bisons.
 Enhance bench.pl with %b to run different bisons.
 
-** Use b4_symbol everywhere.
-Move its definition in the more standard places and deploy it in other
-skeletons.
-
 * Various
-** YYPRINT
-glr.c inherits its symbol_print function from c.m4, which supports
-YYPRINT. But to use YYPRINT yytoknum is needed, which not defined by
-glr.c.
-
-Anyway, IMHO YYPRINT is obsolete and should be restricted to yacc.c.
+** Warnings
+Warnings about type tags that are used in printer and dtors, but not
+for symbols?
 
 ** YYERRCODE
 Defined to 256, but not used, not documented. Probably the token
@@ -118,59 +91,15 @@ so both 256 and 257 are "mysterious".
 ** YYFAIL
 It is seems to be *really* obsolete now, shall we remove it?
 
-** YYBACKUP
-There is no test about it, no examples in the doc, and I'm not sure
-what it should look like. For instance what follows crashes.
-
-  %error-verbose
-  %debug
-  %pure-parser
-  %code {
-  # include
-  # include
-  # include
-
-    static void yyerror (const char *msg);
-    static int yylex (YYSTYPE *yylval);
-  }
-  %%
-  exp:
-    'a' { printf ("a: %d\n", $1); }
-  | 'b' { YYBACKUP('a', 123); }
-  ;
-  %%
-  static int
-  yylex (YYSTYPE *yylval)
-  {
-    static char const input[] = "b";
-    static size_t toknum;
-    assert (toknum < sizeof input);
-    *yylval = (toknum + 1) * 10;
-    return input[toknum++];
-  }
-
-  static void
-  yyerror (const char *msg)
-  {
-    fprintf (stderr, "%s\n", msg);
-  }
-
-  int
-  main (void)
-  {
-    yydebug = !!getenv("YYDEBUG");
-    return yyparse ();
-  }
-
 ** yychar == yyempty_
 The code in yyerrlab reads:
 
       if (yychar <= YYEOF)
-	{
-	  /* Return failure if at end of input. */
-	  if (yychar == YYEOF)
-	    YYABORT;
-	}
+        {
+          /* Return failure if at end of input. */
+          if (yychar == YYEOF)
+            YYABORT;
+        }
 
 There are only two yychar that can be <= YYEOF: YYEMPTY and YYEOF.
 But I can't produce the situation where yychar is YYEMPTY here, is it
@@ -196,10 +125,6 @@ we do the same in yacc.c.
 The code bw glr.c and yacc.c is really alike, we can certainly factor
 some parts.
 
-* Header guards
-
-From Franc,ois: should we keep the directory part in the CPP guard?
-
 
 * Yacc.c: CPP Macros
 
@@ -207,13 +132,6 @@ Do some people use YYPURE, YYLSP_NEEDED like we do in the test
 suite? They should not: it is not documented. But if they need to,
 let's find something clean (not like YYLSP_NEEDED...).
 
-
-* Installation
-
-* Documentation
-Before releasing, make sure the documentation ("Understanding your
-parser") refers to the current `output' format.
-
 * Report
 
 ** Figures
@@ -251,36 +169,11 @@ DeRemer and Penello: they already provide the algorithm.
 
 * Extensions
 
-** Labeling the symbols
-Have a look at the Lemon parser generator: instead of $1, $2 etc. they
-can name the values. This is much more pleasant. For instance:
-
-       exp (res): exp (a) '+' exp (b) { $res = $a + $b; };
-
-I love this. I have been bitten too often by the removal of the
-symbol, and forgetting to shift all the $n to $n-1. If you are
-unlucky, it compiles...
-
-But instead of using $a etc., we can use regular variables. And
-instead of using (), I propose to use `:' (again). Paul suggests
-supporting `->' in addition to `:' to separate LHS and RHS. In other
-words:
-
-       r:exp -> a:exp '+' b:exp { r = a + b; };
-
-That requires an significant improvement of the grammar parser. Using
-GLR would be nice. It also requires that Bison know the type of the
-symbols (which will be useful for %include anyway). So we have some
-time before...
-
-Note that there remains the problem of locations: `@r'?
-
-
 ** $-1
 We should find a means to provide an access to values deep in the
 stack. For instance, instead of
 
-	baz: qux { $$ = $-1 + $0 + $1; }
+        baz: qux { $$ = $-1 + $0 + $1; }
 
 we should be able to have:
 
@@ -313,13 +206,13 @@ XML output for GNU Bison
 * Unit rules
 Maybe we could expand unit rules, i.e., transform
 
-	exp: arith | bool;
-	arith: exp '+' exp;
-	bool: exp '&' exp;
+        exp: arith | bool;
+        arith: exp '+' exp;
+        bool: exp '&' exp;
 
 into
 
-	exp: exp '+' exp | exp '&' exp;
+        exp: exp '+' exp | exp '&' exp;
 
 when there are no actions. This can significantly speed up some
 grammars. I can't find the papers. In particular the book `LR
@@ -335,28 +228,22 @@ this issue. Does anybody have it?
 Some history of Bison and some bibliography would be most welcome.
 Are there any Texinfo standards for bibliography?
 
-** %printer
-Wow, %printer is not documented. Clearly mark YYPRINT as obsolete.
-
-* Java, Fortran, etc.
-
-
 * Coding system independence
 Paul notes:
 
-	Currently Bison assumes 8-bit bytes (i.e. that UCHAR_MAX is
-	255). It also assumes that the 8-bit character encoding is
-	the same for the invocation of 'bison' as it is for the
-	invocation of 'cc', but this is not necessarily true when
-	people run bison on an ASCII host and then use cc on an EBCDIC
-	host. I don't think these topics are worth our time
-	addressing (unless we find a gung-ho volunteer for EBCDIC or
-	PDP-10 ports :-) but they should probably be documented
-	somewhere.
+        Currently Bison assumes 8-bit bytes (i.e. that UCHAR_MAX is
+        255). It also assumes that the 8-bit character encoding is
+        the same for the invocation of 'bison' as it is for the
+        invocation of 'cc', but this is not necessarily true when
+        people run bison on an ASCII host and then use cc on an EBCDIC
+        host. I don't think these topics are worth our time
+        addressing (unless we find a gung-ho volunteer for EBCDIC or
+        PDP-10 ports :-) but they should probably be documented
+        somewhere.
 
-	More importantly, Bison does not currently allow NUL bytes in
-	tokens, either via escapes (e.g., "x\0y") or via a NUL byte in
-	the source code. This should get fixed.
+        More importantly, Bison does not currently allow NUL bytes in
+        tokens, either via escapes (e.g., "x\0y") or via a NUL byte in
+        the source code. This should get fixed.
 
 * --graph
 Show reductions.
@@ -366,29 +253,6 @@ Show reductions.
 ** Skeleton strategy
 Must we keep %token-table?
 
-* BTYacc
-See if we can integrate backtracking in Bison. Charles-Henri de
-Boysson has been working on this, but never gave
-the results.
-
-Vadim Maslow, the maintainer of BTYacc was once contacted. Adjusting
-the Bison grammar parser will be needed to support some extra BTYacc
-features. This is less urgent.
-
-** Keeping the conflicted actions
-First, analyze the differences between byacc and btyacc (I'm referring
-to the executables). Find where the conflicts are preserved.
-
-** Compare with the GLR tables
-See how isomorphic the way BTYacc and the way the GLR adjustments in
-Bison are compatible. *As much as possible* one should try to use the
-same implementation in the Bison executables. I insist: it should be
-very feasible to use the very same conflict tables.
-
-** Adjust the skeletons
-Import the skeletons for C and C++.
-
-
 * Precedence
 
 ** Partial order
@@ -472,10 +336,15 @@ Here's a proposal for how a new implementation might look:
 
   http://lists.gnu.org/archive/html/bison-patches/2009-09/msg00086.html
 
+
+Local Variables:
+mode: outline
+coding: utf-8
+End:
+
 -----
 
-Copyright (C) 2001, 2002, 2003, 2004, 2006, 2008-2009 Free Software
-Foundation, Inc.
+Copyright (C) 2001-2004, 2006, 2008-2012 Free Software Foundation, Inc.
 
 This file is part of Bison, the GNU Compiler Compiler.
 