+* Short term
+** Graphviz display code thoughts
+The code for the --graph option is over two files: print_graph, and
+graphviz. I believe this is because Bison used to also produce VCG graphs,
+but since this is no longer true, maybe we could consider these files for
+merging.
+Little effort seems to have been given to factoring in these files,
+and their print-xml and print counterparts. We would very much like to re-use
+the pretty format of states from .output in the .dot file.
+Also, the underscore in print_graph.[ch] isn't very fitting considering
+the dashes in the other filenames.
+** push-parser
+Check it too when checking the different kinds of parsers. And be
+sure to check that the initial-action is performed once per parsing.
+** m4 names
+b4_shared_declarations is no longer what it is. Make it
+b4_parser_declaration for instance.
+** yychar in lalr1.cc
+There is a large difference between maint and master on the handling of
+yychar (which was removed in lalr1.cc). See what needs to be
+ /* User semantic actions sometimes alter yychar, and that requires
+ that yytoken be updated with the new translation. We take the
+ approach of translating immediately before every use of yytoken.
+ One alternative is translating here after every semantic action,
+ but that translation would be missed if the semantic action
+ invokes YYABORT, YYACCEPT, or YYERROR immediately after altering
+ yychar. In the case of YYABORT or YYACCEPT, an incorrect
+ destructor might then be invoked immediately. In the case of
+ YYERROR, subsequent parser actions might lead to an incorrect
+ destructor call or verbose syntax error message before the
+ lookahead is translated. */
+ /* Make sure we have latest lookahead translation. See comments at
+ user semantic actions for why this is necessary. */
+ yytoken = yytranslate_ (yychar);
+** stack.hh
+Get rid of it. The original idea is nice, but actually it makes
+the code harder to follow, and uselessly different from the other
+skeletons.
+** Variable names.
+What should we name `variant' and `lex_symbol'?
+** Get rid of fake #lines [Bison: ...]
+Possibly as simple as checking whether the column number is nonnegative.
+I have seen messages like the following from GCC.
+<built-in>:0: fatal error: opening dependency file .deps/libltdl/argz.Tpo: No such file or directory
+** Discuss %printer/%destroy in the case of C++.
+It would be very nice to provide the symbol classes with an operator<<
+and a destructor. Unfortunately the syntax we have chosen for
+%destroy and %printer makes them hard to reuse. For instance, the user
+is invited to write something like
+ %printer { debug_stream() << $$; } <my_type>;
+which is hard to reuse elsewhere since it wants to use
+"debug_stream()" to find the stream to use. The same applies to
+%destroy: we told the user she could use the members of the Parser
+class in the printers/destructors, which is not good for an operator<<
+since it is no longer bound to a particular parser: it's just a
+(standalone) symbol.
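+A minimal sketch of the difficulty, in C++. The `symbol' struct and the
+`to_string' helper are hypothetical, not Bison's classes: the point is
+only that a standalone operator<< receives its stream as an argument,
+so it has no parser whose debug_stream() it could call.
```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a generated symbol class; not Bison's API.
struct symbol
{
  std::string name;  // symbol name, e.g. "NUM"
  int value;         // semantic value, fixed to int for this sketch
};

// The stream is a parameter, so the printer does not depend on any
// particular parser instance (no debug_stream() lookup).
std::ostream&
operator<< (std::ostream& o, const symbol& s)
{
  return o << s.name << " (" << s.value << ')';
}

// Helper showing that any stream works, not only the parser's.
std::string
to_string (const symbol& s)
{
  std::ostringstream o;
  o << s;
  return o.str ();
}
```
+A %printer body written against debug_stream() cannot be pasted into
+such an operator as is, which is the reuse problem described above.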
+** Rename LR0.cc
+as lr0.cc, why upper case?
+** bench several bisons.
+Enhance bench.pl with %b to run different bisons.
+* Various
+** YYERRCODE
+Defined to 256, but not used, not documented. Probably the token
+number for the error token, which POSIX wants to be 256, but which
+Bison might renumber if the user used number 256. Keep, fix and document?
+Throw away?
+Also, why don't we output the token name of the error token in the
+output? It is explicitly skipped:
+ /* Skip error token and tokens without identifier. */
+ if (sym != errtoken && id)
+Of course there are issues with name spaces, but if we disable this
+skipping we have something which seems to be simpler and more consistent
+than the special case YYERRCODE.
+ enum yytokentype {
+ error = 256,
+ // ...
+ };
+We could (should?) also treat the case of the undef_token, which is
+numbered 257 for yylex, and 2 internally. Both appear for instance in
+ const unsigned short int
+ parser::yytoken_number_[] =
+ {
+ 0, 256, 257, 258, 259, 260, 261, 262, 263, 264,
+while here
+ enum yytokentype {
+ TOK_EOF = 0,
+ TOK_EQ = 258,
+so both 256 and 257 are "mysterious".
+ const char*
+ const parser::yytname_[] =
+ {
+ "\"end of command\"", "error", "$undefined", "\"=\"", "\"break\"",
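+To make the correspondence concrete, here is a small C++ sketch built
+from the first entries of the two tables quoted above. The helper
+`external_token_name' and the unsuffixed table names are hypothetical;
+the sketch only shows that 256 and 257 do have names internally even
+though yytokentype does not expose them.
```cpp
#include <cassert>
#include <cstddef>
#include <string>

// First entries of the two tables quoted above, truncated for the sketch.
const unsigned short yytoken_number[] = { 0, 256, 257, 258, 259 };
const char* const yytname[] =
  { "\"end of command\"", "error", "$undefined", "\"=\"", "\"break\"" };

// Hypothetical helper: name of the token whose external (yylex) number
// is `external', or "" if it is not in the truncated tables.
std::string
external_token_name (int external)
{
  const std::size_t n = sizeof yytoken_number / sizeof yytoken_number[0];
  for (std::size_t i = 0; i < n; ++i)
    if (yytoken_number[i] == external)
      return yytname[i];
  return "";
}
```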
+** yychar == yyempty_
+The code in yyerrlab reads:
+  if (yychar <= YYEOF)
+    {
+      /* Return failure if at end of input.  */
+      if (yychar == YYEOF)
+        YYABORT;
+    }
+There are only two yychar that can be <= YYEOF: YYEMPTY and YYEOF.
+But I can't produce the situation where yychar is YYEMPTY here. Is it
+really possible? The test suite does not exercise this case.
+This shows that it would be interesting to add skeleton
+coverage analysis to the test suite.
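+The claim that only two values can satisfy the test can be sketched as
+follows, in C++. This assumes the values used by yacc.c (YYEMPTY is -2,
+YYEOF is 0) and that the lookahead-reading step folds any yylex result
+<= YYEOF into YYEOF; the function names are hypothetical.
```cpp
#include <cassert>

// Values as in yacc.c (an assumption of this sketch).
enum { YYEMPTY = -2, YYEOF = 0 };

// Mimics the lookahead-reading step: any yylex result <= YYEOF is
// folded into YYEOF, so no other negative value survives in yychar.
int
normalize_lookahead (int yylex_result)
{
  return yylex_result <= YYEOF ? YYEOF : yylex_result;
}

// The test performed at yyerrlab.
bool
at_or_before_eof (int yychar)
{
  return yychar <= YYEOF;
}
```
+Under these assumptions, the only way to reach the test with a value
+other than YYEOF is to arrive with no lookahead read at all, i.e. with
+yychar still equal to YYEMPTY.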
+** Table definitions
+It should be very easy to factor the definition of the various tables,
+including the separation between declaration and definition. See for
+instance b4_table_define in lalr1.cc. This way, we could even factor
+C vs. C++ definitions.
+* From lalr1.cc to yacc.c
+** Single stack
+Merging the three stacks in lalr1.cc simplified the code, prompted
+other improvements and also made it faster (probably because memory
+management is performed once instead of three times). I suggest that
+we do the same in yacc.c.
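+A minimal sketch of the idea, written in C++ for brevity. The types
+`location', `stack_symbol' and `parser_stack' are hypothetical and are
+not the ones used in lalr1.cc; a yacc.c version would presumably use a
+C struct and a single growable array instead.
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not the ones in lalr1.cc.
struct location { int line; int column; };

// One slot groups what used to live on three parallel stacks.
struct stack_symbol
{
  int state;     // automaton state
  int value;     // semantic value (int for this sketch)
  location loc;  // source location
};

class parser_stack
{
public:
  // A single push replaces three pushes, hence a single possible
  // reallocation instead of three.
  void push (int state, int value, location loc)
  {
    slots_.push_back (stack_symbol{state, value, loc});
  }

  // Pop n symbols at once, as a reduction by a rule of length n would.
  void pop (std::size_t n = 1)
  {
    assert (n <= slots_.size ());
    slots_.resize (slots_.size () - n);
  }

  // Access the i-th symbol from the top (0 is the top).
  const stack_symbol& operator[] (std::size_t i) const
  {
    assert (i < slots_.size ());
    return slots_[slots_.size () - 1 - i];
  }

  std::size_t size () const { return slots_.size (); }

private:
  std::vector<stack_symbol> slots_;
};
```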
+** yysyntax_error
+The code in glr.c and yacc.c is really alike; we can certainly factor
+some parts.
+* Report
+** Figures
+Some statistics about the grammar and the parser would be useful,
+especially when asking the user to send some information about the
+grammars she is working on. We should probably also include some
+information about the variables (I'm not sure, for instance, that we even
+specify which LR variant was used).
+** GLR
+How would Paul like to display the conflicted actions? In particular,
+what should happen when two reductions are possible on a given lookahead token, but one is
+part of $default. Should we make the two reductions explicit, or just
+keep $default? See the following point.
+** Disabled Reductions
+See `tests/conflicts.at (Defaulted Conflicted Reduction)', and decide
+what we want to do.
+** Documentation
+Extend with error productions. The hard part will probably be finding
+the right rule so that a single state does not exhibit too many yet
+undocumented ``features''. Maybe an empty action ought to be
+presented too. Shall we try to make a single grammar with all these
+features, or should we have several very small grammars?
+** --report=conflict-path
+Provide better assistance for understanding the conflicts by providing
+a sample text exhibiting the (LALR) ambiguity. See the paper from
+DeRemer and Pennello: they already provide the algorithm.
+** Statically check for potential ambiguities in GLR grammars
+See <http://www.i3s.unice.fr/~schmitz/papers.html#expamb> for an approach.
+* Extensions
+** $-1
+We should find a means to provide access to values deep in the
+stack. For instance, instead of
+ baz: qux { $$ = $<foo>-1 + $<bar>0 + $1; }
+we should be able to have:
+ foo($foo) bar($bar) baz($bar): qux($qux) { $baz = $foo + $bar + $qux; }
+Or something like this.
+** %if and the like
+It should be possible to have %if/%else/%endif. The implementation is
+not clear: should it be lexical or syntactic? Vadim Maslow thinks it
+must be in the scanner: we must not parse what is in a switched off
+part of %if. Akim Demaille thinks it should be in the parser, so as
+to avoid falling into another CPP mistake.
+** XML Output
+There are a couple of available extensions of Bison targeting some XML
+output. Some day we should consider including them. One issue is
+that they seem to be quite orthogonal to the parsing technique, and
+seem to depend mostly on the possibility to have some code triggered
+for each reduction. As a matter of fact, such hooks could also be
+used to generate the yydebug traces. Some generic scheme probably
+exists in there.
+XML output for GNU Bison and gcc
+ http://www.cs.may.ie/~jpower/Research/bisonXML/
+XML output for GNU Bison
+ http://yaxx.sourceforge.net/