X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/69991a58278c6af21590fcc52b58d6eb44d0552f..a46c84724424ad3b05b1ce587b3c65ae4f8945a9:/TODO

diff --git a/TODO b/TODO
index 0b12b008..4780351f 100644
--- a/TODO
+++ b/TODO
@@ -1,5 +1,130 @@
 -*- outline -*-
 
+* Several %unions
+I think this is a pleasant (but useless currently) feature, but in the
+future, I want a means to %include other bits of grammars, and _then_
+it will be important for the various bits to define their needs in
+%union.
+
+When implementing multiple-%union support, bare the following in mind:
+
+- when --yacc, this must be flagged as an error. Don't make it fatal
+  though.
+
+- The #line must now appear *inside* the definition of yystype.
+  Something like
+
+  {
+  #line 12 "foo.y"
+    int ival;
+  #line 23 "foo.y"
+    char *sval;
+  }
+
+* Experimental report features
+Decide whether they should be enabled, or optional. For instance, on:
+
+        input:
+          exp
+        | input exp
+        ;
+
+        exp:
+          token1 "1"
+        | token2 "2"
+        | token3 "3"
+        ;
+
+        token1: token;
+        token2: token;
+        token3: token;
+
+the traditional Bison reports:
+
+        state 0
+
+            $axiom  ->  . input $   (rule 0)
+
+            token       shift, and go to state 1
+
+            input       go to state 2
+            exp         go to state 3
+            token1      go to state 4
+            token2      go to state 5
+            token3      go to state 6
+
+        state 1
+
+            token1  ->  token .   (rule 6)
+            token2  ->  token .   (rule 7)
+            token3  ->  token .   (rule 8)
+
+            "2"         reduce using rule 7 (token2)
+            "3"         reduce using rule 8 (token3)
+            $default    reduce using rule 6 (token1)
+
+while with --trace, i.e., when enabling both the display of non-core
+item sets and the display of lookaheads, Bison now displays:
+
+        state 0
+
+            $axiom  ->  . input $   (rule 0)
+            input  ->  . exp   (rule 1)
+            input  ->  . input exp   (rule 2)
+            exp  ->  . token1 "1"   (rule 3)
+            exp  ->  . token2 "2"   (rule 4)
+            exp  ->  . token3 "3"   (rule 5)
+            token1  ->  . token   (rule 6)
+            token2  ->  . token   (rule 7)
+            token3  ->  . token   (rule 8)
+
+            token       shift, and go to state 1
+
+            input       go to state 2
+            exp         go to state 3
+            token1      go to state 4
+            token2      go to state 5
+            token3      go to state 6
+
+        state 1
+
+            token1  ->  token .  ["1"]   (rule 6)
+            token2  ->  token .  ["2"]   (rule 7)
+            token3  ->  token .  ["3"]   (rule 8)
+
+            "2"         reduce using rule 7 (token2)
+            "3"         reduce using rule 8 (token3)
+            $default    reduce using rule 6 (token1)
+
+so decide whether this should be an option, or always enabled. I'm in
+favor of making it the default, but maybe we should tune the output to
+distinguish core item sets from non core:
+
+        state 0
+            Core:
+            $axiom  ->  . input $   (rule 0)
+
+            Derived:
+            input  ->  . exp   (rule 1)
+            input  ->  . input exp   (rule 2)
+            exp  ->  . token1 "1"   (rule 3)
+            exp  ->  . token2 "2"   (rule 4)
+            exp  ->  . token3 "3"   (rule 5)
+            token1  ->  . token   (rule 6)
+            token2  ->  . token   (rule 7)
+            token3  ->  . token   (rule 8)
+
+            token       shift, and go to state 1
+
+            input       go to state 2
+            exp         go to state 3
+            token1      go to state 4
+            token2      go to state 5
+            token3      go to state 6
+
+
+Note that the same questions applies to --graph.
+
 * Coding system independence
 Paul notes:
 
@@ -13,36 +138,6 @@ Paul notes:
        PDP-10 ports :-) but they should probably be documented
        somewhere.
 
-* Using enums instead of int for tokens.
-Paul suggests:
-
-  #ifndef YYTOKENTYPE
-  # if defined (__STDC__) || defined (__cplusplus)
-     /* Put the tokens into the symbol table, so that GDB and other debuggers
-        know about them. */
-     enum yytokentype {
-       FOO = 256,
-       BAR,
-       ...
-     };
-     /* POSIX requires `int' for tokens in interfaces. */
-  # define YYTOKENTYPE int
-  # endif
-  #endif
-  #define FOO 256
-  #define BAR 257
-  ...
-
-> I'm in favor of
->
->    %token FOO 256
->    %token BAR 257
->
-> and Bison moves error into 258.
-
-Yes, I think that's a valid extension too, if the user doesn't define
-the token number for error.
-
 * Output directory
 Akim:
 
@@ -171,40 +266,6 @@ critical for user data: when aborting a parsing, when handling the
 error token etc., we often throw away yylval without giving a chance
 of cleaning it up to the user.
 
-* NEWS
-Sort from 1.31 NEWS.
-
-* Prologue
-The %union is declared after the user C declarations. It can be
-a problem if YYSTYPE is declared after the user part. []
-
-Actually, the real problem seems that the %union ought to be output
-where it was defined. For instance, in gettext/intl/plural.y, we
-have:
-
-        %{
-        ...
-        #include "gettextP.h"
-        ...
-        %}
-
-        %union {
-          unsigned long int num;
-          enum operator op;
-          struct expression *exp;
-        }
-
-        %{
-        ...
-        static int yylex PARAMS ((YYSTYPE *lval, const char **pexp));
-        ...
-        %}
-
-Where the first part defines struct expression, the second uses it to
-define YYSTYPE, and the last uses YYSTYPE. Only this order is valid.
-
-Note that we have the same problem with GCC.
-
 * --graph
 Show reductions. []
 
@@ -212,7 +273,6 @@ Show reductions. []
 ** %no-lines [ok]
 ** %no-parser []
 ** %pure-parser []
-** %semantic-parser []
 ** %token-table []
 ** Options which could use parse_dquoted_param ().
 Maybe transfered in lex.c.
@@ -313,40 +373,13 @@ It is unfortunate that there is a total order for precedence. It
 makes it impossible to have modular precedence information. We should
 move to partial orders.
 
-* Parsing grammars
-Rewrite the reader in Bison.
+This will be possible with a Bison parser for the grammar, as it will
+make it much easier to extend the grammar.
 
-* Problems with aliases
-From: "Baum, Nathan I"
-Subject: Token Alias Bug
-To: "'bug-bison@gnu.org'"
-
-I've noticed a bug in bison. Sadly, our eternally wise sysadmins won't let
-us use CVS, so I can't find out if it's been fixed already...
-
-Basically, I made a program (in flex) that went through a .y file looking
-for "..."-tokens, and then outputed a %token
-line for it. For single-character ""-tokens, I reasoned, I could just use
-[%token 'A' "A"]. However, this causes Bison to output a [#define 'A' 65],
-which cppp chokes on, not unreasonably. (And even if cppp didn't choke, I
-obviously wouldn't want (char)'A' to be replaced with (int)65 throughout my
-code.
-
-Bison normally forgoes outputing a #define for a character token. However,
-it always outputs an aliased token -- even if the token is an alias for a
-character token. We don't want that. The problem is in /output.c/, as I
-recall. When it outputs the token definitions, it checks for a character
-token, and then checks for an alias token. If the character token check is
-placed after the alias check, then it works correctly.
-
-Alias tokens seem to be something of a kludge. What about an [%alias "..."]
-command...
-
-        %alias T_IF "IF"
-
-Hmm. I can't help thinking... What about a --generate-lex option that
-creates an .l file for the alias tokens used... (Or an option to make a
-gperf file, etc...)
+* Parsing grammars
+Rewrite the reader in Flex/Bison. There will be delicate parts, in
+particular, expect the scanner to be hard to write. Many interesting
+features cannot be implemented without such a new reader.
 
 * Presentation of the report file
 From: "Baum, Nathan I"
@@ -362,7 +395,6 @@ everything, but the -v mode only tells you what you need for examining
 conflicts? (Or, perhaps, a "*** This state has N conflicts ***" marker above
 each state with conflicts.)
 
-
 * $undefined
 From Hans:
 - If the Bison generated parser experiences an undefined number in the
@@ -385,6 +417,21 @@ $$ = $1. I therefore think that one should implement a Bison option
 where every typed default rule is explicitly written out (same typed
 ruled can of course be grouped together).
 
+Note: Robert Anisko handles this. He knows how to do it.
+
+* Documenting C++ output
+Write a first documentation for C++ output.
+
+* Warnings
+It would be nice to have warning support. See how Autoconf handles
+them, it is fairly well described there. It would be very nice to
+implement this in such a way that other programs could use
+lib/warnings.[ch].
+
+Don't work on this without first announcing you do, as I already have
+thought about it, and know many of the components that can be used to
+implement it.
+
 * Pre and post actions.
 From: Florian Krohm
 Subject: YYACT_EPILOGUE
@@ -400,7 +447,7 @@ The way I solved this was to define a macro YYACT_EPILOGUE that would
 be invoked after the action. For reasons of symmetry I also added
 YYACT_PROLOGUE. Although I had no use for that I can envision how it
 might come in handy for debugging purposes.
-All is needed is to add 
+All is needed is to add
 
 #if YYLSP_NEEDED
     YYACT_EPILOGUE (yyval, (yyvsp - yylen), yylen, yyloc, (yylsp - yylen));