-*- outline -*-
-
-* URGENT: Prologue
-The %union is declared after the user C declarations. It can be
-a problem if YYSTYPE is declared after the user part.
-
-Actually, the real problem seems that the %union ought to be output
-where it was defined. For instance, in gettext/intl/plural.y, we
-have:
-
- %{
- ...
- #include "gettextP.h"
- ...
- %}
-
- %union {
- unsigned long int num;
- enum operator op;
- struct expression *exp;
- }
-
- %{
- ...
- static int yylex PARAMS ((YYSTYPE *lval, const char **pexp));
- ...
- %}
-
-Where the first part defines struct expression, the second uses it to
-define YYSTYPE, and the last uses YYSTYPE. Only this order is valid.
-
-Note that we have the same problem with GCC.
-
-I suggest splitting the prologue into pre-prologue and post-prologue.
-The reason is that:
-
-1. we keep language independance as it is the skeleton that joins the
-two prologues (there is no need for the engine to encode union yystype
-and to output it inside the prologue, which breaks the language
-independance of the generator)
-
-2. that makes it possible to have several %union in input. I think
-this is a pleasant (but useless currently) feature, but in the future,
-I want a means to %include other bits of grammars, and _then_ it will
-be important for the various bits to define their needs in %union.
+* Several %unions
+I think this is a pleasant (but currently useless) feature, but in the
+future I want a means to %include other bits of grammars, and _then_
+it will be important for the various bits to define their needs in
+%union.
When implementing multiple-%union support, bear the following in mind:
char *sval;
}
-* Language independent actions
+* Experimental report features
+Decide whether they should be enabled, or optional. For instance, on:
+
+ input:
+ exp
+ | input exp
+ ;
+
+ exp:
+ token1 "1"
+ | token2 "2"
+ | token3 "3"
+ ;
+
+ token1: token;
+ token2: token;
+ token3: token;
+
+the traditional Bison reports:
+
+ state 0
+
+ $axiom -> . input $ (rule 0)
+
+ token shift, and go to state 1
+
+ input go to state 2
+ exp go to state 3
+ token1 go to state 4
+ token2 go to state 5
+ token3 go to state 6
+
+ state 1
+
+ token1 -> token . (rule 6)
+ token2 -> token . (rule 7)
+ token3 -> token . (rule 8)
+
+ "2" reduce using rule 7 (token2)
+ "3" reduce using rule 8 (token3)
+ $default reduce using rule 6 (token1)
+
+while with --trace, i.e., when enabling both the display of non-core
+item sets and the display of lookaheads, Bison now displays:
+
+ state 0
+
+ $axiom -> . input $ (rule 0)
+ input -> . exp (rule 1)
+ input -> . input exp (rule 2)
+ exp -> . token1 "1" (rule 3)
+ exp -> . token2 "2" (rule 4)
+ exp -> . token3 "3" (rule 5)
+ token1 -> . token (rule 6)
+ token2 -> . token (rule 7)
+ token3 -> . token (rule 8)
-Currently bison, the generator, transforms $1, $$ and so forth into
-direct C code, manipulating the stacks. This is problematic, because
-(i) it means that if we want more languages, we need to update the
-generator, and (ii), it forces names everywhere (e.g., the C++
-skeleton would be happy to use other naming schemes, and actually,
-even other accessing schemes).
+ token shift, and go to state 1
-Therefore we want
+ input go to state 2
+ exp go to state 3
+ token1 go to state 4
+ token2 go to state 5
+ token3 go to state 6
-1. the generator to replace $1, etc. by M4 macro invocations
- (b4_dollar(1), b4_at(3), b4_dollar_dollar) etc.
+ state 1
-2. the skeletons to define these macros.
+ token1 -> token . ["1"] (rule 6)
+ token2 -> token . ["2"] (rule 7)
+ token3 -> token . ["3"] (rule 8)
-But currently the actions are double-quoted, to protect them from M4
-evaluation. So we need to:
+ "2" reduce using rule 7 (token2)
+ "3" reduce using rule 8 (token3)
+ $default reduce using rule 6 (token1)
-3. stop quoting them
+so decide whether this should be an option, or always enabled. I'm in
+favor of making it the default, but maybe we should tune the output to
+distinguish core item sets from non-core:
-4. change the [ and ] in the actions into @<:@ and @:>@
+ state 0
+ Core:
+ $axiom -> . input $ (rule 0)
-5. extend the postprocessor to maps these back onto [ and ].
+ Derived:
+ input -> . exp (rule 1)
+ input -> . input exp (rule 2)
+ exp -> . token1 "1" (rule 3)
+ exp -> . token2 "2" (rule 4)
+ exp -> . token3 "3" (rule 5)
+ token1 -> . token (rule 6)
+ token2 -> . token (rule 7)
+ token3 -> . token (rule 8)
+
+ token shift, and go to state 1
+
+ input go to state 2
+ exp go to state 3
+ token1 go to state 4
+ token2 go to state 5
+ token3 go to state 6
+
+
+Note that the same question applies to --graph.
* Coding system independence
Paul notes:
PDP-10 ports :-) but they should probably be documented
somewhere.
-* Using enums instead of int for tokens.
-Paul suggests:
-
- #ifndef YYTOKENTYPE
- # if defined (__STDC__) || defined (__cplusplus)
- /* Put the tokens into the symbol table, so that GDB and other debuggers
- know about them. */
- enum yytokentype {
- FOO = 256,
- BAR,
- ...
- };
- /* POSIX requires `int' for tokens in interfaces. */
- # define YYTOKENTYPE int
- # endif
- #endif
- #define FOO 256
- #define BAR 257
- ...
-
-> I'm in favor of
->
-> %token FOO 256
-> %token BAR 257
->
-> and Bison moves error into 258.
-
-Yes, I think that's a valid extension too, if the user doesn't define
-the token number for error.
-
* Output directory
Akim:
** %no-lines [ok]
** %no-parser []
** %pure-parser []
-** %semantic-parser []
** %token-table []
** Options which could use parse_dquoted_param ().
Maybe transferred in lex.c.
makes it impossible to have modular precedence information. We should
move to partial orders.
-* Parsing grammars
-Rewrite the reader in Bison.
-
-* Problems with aliases
-From: "Baum, Nathan I" <s0009525@chelt.ac.uk>
-Subject: Token Alias Bug
-To: "'bug-bison@gnu.org'" <bug-bison@gnu.org>
-
-I've noticed a bug in bison. Sadly, our eternally wise sysadmins won't let
-us use CVS, so I can't find out if it's been fixed already...
-
-Basically, I made a program (in flex) that went through a .y file looking
-for "..."-tokens, and then outputed a %token
-line for it. For single-character ""-tokens, I reasoned, I could just use
-[%token 'A' "A"]. However, this causes Bison to output a [#define 'A' 65],
-which cppp chokes on, not unreasonably. (And even if cppp didn't choke, I
-obviously wouldn't want (char)'A' to be replaced with (int)65 throughout my
-code.
-
-Bison normally forgoes outputing a #define for a character token. However,
-it always outputs an aliased token -- even if the token is an alias for a
-character token. We don't want that. The problem is in /output.c/, as I
-recall. When it outputs the token definitions, it checks for a character
-token, and then checks for an alias token. If the character token check is
-placed after the alias check, then it works correctly.
-
-Alias tokens seem to be something of a kludge. What about an [%alias "..."]
-command...
-
- %alias T_IF "IF"
+This will be possible with a Bison parser for the grammar, as it will
+make it much easier to extend the grammar.
-Hmm. I can't help thinking... What about a --generate-lex option that
-creates an .l file for the alias tokens used... (Or an option to make a
-gperf file, etc...)
+* Parsing grammars
+Rewrite the reader in Flex/Bison. There will be delicate parts; in
+particular, expect the scanner to be hard to write. Many interesting
+features cannot be implemented without such a new reader.
* Presentation of the report file
From: "Baum, Nathan I" <s0009525@chelt.ac.uk>
conflicts? (Or, perhaps, a "*** This state has N conflicts ***" marker above
each state with conflicts.)
-
* $undefined
From Hans:
- If the Bison generated parser experiences an undefined number in the
a Bison option where every typed default rule is explicitly written out
(same typed ruled can of course be grouped together).
+Note: Robert Anisko handles this. He knows how to do it.
+
+* Documenting C++ output
+Write initial documentation for the C++ output.
+
+* Warnings
+It would be nice to have warning support. See how Autoconf handles
+warnings; it is fairly well described there. It would be very nice to
+implement this in such a way that other programs could use
+lib/warnings.[ch].
+
+Don't work on this without first announcing it, as I have already
+thought about it, and know many of the components that can be used to
+implement it.
+
* Pre and post actions.
From: Florian Krohm <florian@edamail.fishkill.ibm.com>
Subject: YYACT_EPILOGUE