-*- outline -*-


* URGENT: Prologue
The %union is declared after the user C declarations. It can be
a problem if YYSTYPE is declared after the user part.

Actually, the real problem seems to be that the %union ought to be output
where it was defined.  For instance, in gettext/intl/plural.y, we
have:

	%{
	...
	#include "gettextP.h"
	...
	%}

	%union {
	  unsigned long int num;
	  enum operator op;
	  struct expression *exp;
	}

	%{
	...
	static int yylex PARAMS ((YYSTYPE *lval, const char **pexp));
	...
	%}

Where the first part defines struct expression, the second uses it to
define YYSTYPE, and the last uses YYSTYPE.  Only this order is valid.
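
As a rough sketch of that ordering constraint, here is what the output
would have to look like in C.  The contents of struct expression and
enum operator, the toy yylex body, and the token code 'n' are all
made up for illustration; only the order of the three parts comes from
plural.y.

```c
/* Part 1 (first %{ ... %}): user declarations the union depends on.
   Stand-ins for what gettextP.h would provide.  */
struct expression { int nargs; };
enum operator { op_plus, op_minus };

/* Part 2 (%union): must come after part 1 and before part 3.  */
typedef union
{
  unsigned long int num;
  enum operator op;
  struct expression *exp;
} yystype;
#define YYSTYPE yystype

/* Part 3 (second %{ ... %}): user declarations that need YYSTYPE.  */
static int yylex (YYSTYPE *lval, const char **pexp);

/* Toy scanner, only here so that the sketch links: one digit is a
   number (hypothetical token code 'n'), anything else is returned
   as itself, and the end of the string yields 0.  */
static int
yylex (YYSTYPE *lval, const char **pexp)
{
  const char *p = *pexp;

  if (*p == '\0')
    return 0;
  *pexp = p + 1;
  if (*p >= '0' && *p <= '9')
    {
      lval->num = (unsigned long int) (*p - '0');
      return 'n';
    }
  return *p;
}
```

Moving part 2 above part 1 leaves struct expression undeclared at the
point of use, and moving it below part 3 leaves YYSTYPE undefined in
the yylex prototype.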

Note that we have the same problem with GCC.

I suggest splitting the prologue into pre-prologue and post-prologue.
The reason is that:

1. we keep language independence as it is the skeleton that joins the
two prologues (there is no need for the engine to encode union yystype
and to output it inside the prologue, which breaks the language
independence of the generator)

2. that makes it possible to have several %union in input.  I think
this is a pleasant (but useless currently) feature, but in the future,
I want a means to %include other bits of grammars, and _then_ it will
be important for the various bits to define their needs in %union.

When implementing multiple-%union support, bear the following in mind:

- when --yacc, this must be flagged as an error.  Don't make it fatal
  though.

- The #line must now appear *inside* the definition of yystype.
  Something like

	{
	#line 12 "foo.y"
	  int ival;
	#line 23 "foo.y"
	  char *sval;
	}

* Language independent actions

Currently bison, the generator, transforms $1, $$ and so forth into
direct C code, manipulating the stacks.  This is problematic, because
(i) it means that if we want more languages, we need to update the
generator, and (ii), it forces names everywhere (e.g., the C++
skeleton would be happy to use other naming schemes, and actually,
even other accessing schemes).

Therefore we want

1. the generator to replace $1, etc. by M4 macro invocations
   (b4_dollar(1), b4_at(3), b4_dollar_dollar) etc.

2. the skeletons to define these macros.

But currently the actions are double-quoted, to protect them from M4
evaluation.  So we need to:

3. stop quoting them

4. change the [ and ] in the actions into @<:@ and @:>@

5. extend the postprocessor to map these back onto [ and ].
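
A minimal sketch of steps 1, 4 and 5, assuming a plain string-to-string
translation.  The function names translate_action and restore_brackets
are hypothetical, not existing Bison code, and the sketch deliberately
ignores typed references such as $<type>1, as well as $ or @ occurring
inside C strings and comments.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Rewrite ACTION into OUT (capacity OUTSIZE): $$ becomes
   b4_dollar_dollar, $N becomes b4_dollar(N), @N becomes b4_at(N),
   and [ and ] become the quadrigraphs @<:@ and @:>@.
   Return 0 on success, -1 if OUT is too small.  */
static int
translate_action (const char *action, char *out, size_t outsize)
{
  size_t used = 0;
  const char *p = action;

  if (outsize == 0)
    return -1;
  out[0] = '\0';

  while (*p)
    {
      char piece[32];
      size_t len;

      if (p[0] == '$' && p[1] == '$')
        {
          strcpy (piece, "b4_dollar_dollar");
          p += 2;
        }
      else if ((p[0] == '$' || p[0] == '@') && isdigit ((unsigned char) p[1]))
        {
          const char *macro = p[0] == '$' ? "b4_dollar" : "b4_at";
          int n = 0;
          /* Small rule lengths only: no overflow check on N.  */
          for (++p; isdigit ((unsigned char) *p); ++p)
            n = n * 10 + (*p - '0');
          snprintf (piece, sizeof piece, "%s(%d)", macro, n);
        }
      else if (*p == '[')
        {
          strcpy (piece, "@<:@");
          ++p;
        }
      else if (*p == ']')
        {
          strcpy (piece, "@:>@");
          ++p;
        }
      else
        {
          piece[0] = *p++;
          piece[1] = '\0';
        }

      len = strlen (piece);
      if (used + len + 1 > outsize)
        return -1;
      memcpy (out + used, piece, len + 1);
      used += len;
    }
  return 0;
}

/* Step 5, in place: map the quadrigraphs back onto [ and ].  */
static void
restore_brackets (char *s)
{
  char *dst = s;

  while (*s)
    if (strncmp (s, "@<:@", 4) == 0)
      {
        *dst++ = '[';
        s += 4;
      }
    else if (strncmp (s, "@:>@", 4) == 0)
      {
        *dst++ = ']';
        s += 4;
      }
    else
      *dst++ = *s++;
  *dst = '\0';
}
```

In between the two functions, M4 would expand the b4_ macros as
defined by the skeleton; that part is not shown.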

* Coding system independence
Paul notes:

	Currently Bison assumes 8-bit bytes (i.e. that UCHAR_MAX is
	255).  It also assumes that the 8-bit character encoding is
	the same for the invocation of 'bison' as it is for the
	invocation of 'cc', but this is not necessarily true when
	people run bison on an ASCII host and then use cc on an EBCDIC
	host.  I don't think these topics are worth our time
	addressing (unless we find a gung-ho volunteer for EBCDIC or
	PDP-10 ports :-) but they should probably be documented
	somewhere.
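
One cheap way to at least document these assumptions would be a check
on the compiling host.  This is only a sketch: the function name is
made up, and the three characters tested are a spot check, not a proof
that the execution character set is ASCII.

```c
#include <limits.h>

/* Return nonzero when the host compiling this file has 8-bit bytes
   and agrees with ASCII on a few sample characters.  */
static int
host_matches_bison_assumptions (void)
{
  return (UCHAR_MAX == 255
          && 'A' == 65 && 'a' == 97 && '0' == 48);
}
```

On an EBCDIC host 'A' is 193, so the check would fail there.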

* Using enums instead of int for tokens.
Paul suggests:

   #ifndef YYTOKENTYPE
   # if defined (__STDC__) || defined (__cplusplus)
      /* Put the tokens into the symbol table, so that GDB and other debuggers
         know about them.  */
      enum yytokentype {
        FOO = 256,
        BAR,
        ...
      };
      /* POSIX requires `int' for tokens in interfaces.  */
   #  define YYTOKENTYPE int
   # endif
   #endif
   #define FOO 256
   #define BAR 257
   ...

> I'm in favor of
>
> %token FOO 256
> %token BAR 257
>
> and Bison moves error into 258.

Yes, I think that's a valid extension too, if the user doesn't define
the token number for error.

* Output directory
Akim:

| I consider this to be a bug in bison:
|
| /tmp % mkdir src
| /tmp % cp ~/src/bison/tests/calc.y src
| /tmp % mkdir build && cd build
| /tmp/build % bison ../src/calc.y
| /tmp/build % cd ..
| /tmp % ls -l build src
| build:
| total 0
|
| src:
| total 32
| -rw-r--r--    1 akim     lrde        27553 oct  2 16:31 calc.tab.c
| -rw-r--r--    1 akim     lrde         3335 oct  2 16:31 calc.y
|
|
| Would it be safe to change this behavior to something more reasonable?
| Do you think some people depend upon this?

Jim:

Is that behavior documented?
If so, then it's probably not reasonable to change it.
I've Cc'd the automake list, because some of automake's
rules use bison through $(YACC) -- though I'll bet they
all use it in yacc-compatible mode.

Pavel:

Hello, Jim and others!

> Is that behavior documented?
> If so, then it's probably not reasonable to change it.
> I've Cc'd the automake list, because some of automake's
> rules use bison through $(YACC) -- though I'll bet they
> all use it in yacc-compatible mode.

Yes, Automake currently uses bison in yacc-compatible mode, but it
would be fair for Automake to switch to the native mode as long as the
processed files are distributed and "missing" emulates bison.

In any case, the makefiles should specify the output file explicitly
instead of relying on weird defaults.

> | src:
> | total 32
> | -rw-r--r--    1 akim     lrde        27553 oct  2 16:31 calc.tab.c
> | -rw-r--r--    1 akim     lrde         3335 oct  2 16:31 calc.y

This is not _that_ ugly as it seems - with Automake you want to put
sources where they belong - to the source directory.

> | This is not _that_ ugly as it seems - with Automake you want to put
> | sources where they belong - to the source directory.
>
> The source/build distinction you are referring to is based on Automake
> concepts.  It makes no sense at all for tools such as bison or gcc
> etc.  They have input and output.  I do not want them to try to grasp
> source/build.  I want them to behave uniformly: output *here*.

I realize that.

It's unfortunate that the native mode of Bison behaves in a less uniform
way than the yacc mode. I agree with your point. Bison maintainers may
want to fix it along with the documentation.
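
A sketch of the "output *here*" behavior, assuming the default name is
derived from the grammar's base name only.  The helper
default_output_name is hypothetical, and it only understands `/' as a
directory separator.

```c
#include <stdio.h>
#include <string.h>

/* Derive the default parser file name for GRAMMAR, dropping any
   leading directory so that the output lands in the current
   directory.  Return 0 on success, -1 if OUT is too small.  */
static int
default_output_name (const char *grammar, char *out, size_t outsize)
{
  const char *base = strrchr (grammar, '/');
  const char *dot;
  size_t stem_len;
  int n;

  base = base ? base + 1 : grammar;     /* drop the directory part */
  dot = strrchr (base, '.');            /* drop the extension */
  stem_len = dot ? (size_t) (dot - base) : strlen (base);
  n = snprintf (out, outsize, "%.*s.tab.c", (int) stem_len, base);
  return (n < 0 || (size_t) n >= outsize) ? -1 : 0;
}
```

With this, `bison ../src/calc.y' run from /tmp/build would write
calc.tab.c into /tmp/build instead of /tmp/src.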


* Unit rules
Maybe we could expand unit rules, i.e., transform

	exp: arith | bool;
	arith: exp '+' exp;
	bool: exp '&' exp;

into

	exp: exp '+' exp | exp '&' exp;

when there are no actions.  This can significantly speed up some
grammars.
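
The transformation can be sketched on a toy grammar representation.
Everything here (struct rule, the symbol numbering, the fixed array
sizes) is invented for the example and is not how Bison stores
grammars; the sketch does a single pass and assumes the rules involved
carry no actions.

```c
enum { MAX_RHS = 4, MAX_RULES = 16 };

/* Symbols below 256 are character tokens, the others nonterminals.  */
enum { N_EXP = 256, N_ARITH, N_BOOL };

struct rule
{
  int lhs;
  int rhs[MAX_RHS];
  int len;
};

static int
is_nonterminal (int sym)
{
  return sym >= 256;
}

/* Copy the N rules of IN to OUT (capacity MAX_RULES), replacing each
   unit rule `A: B' by one copy of every rule of B with A as its
   left-hand side.  Return the number of rules written, or -1 if OUT
   overflows.  Chains such as `A: B' and `B: C' need repeated passes.  */
static int
expand_unit_rules (const struct rule *in, int n, struct rule *out)
{
  int i, j, count = 0;

  for (i = 0; i < n; ++i)
    {
      int unit = (in[i].len == 1
                  && is_nonterminal (in[i].rhs[0])
                  && in[i].rhs[0] != in[i].lhs);

      if (!unit)
        {
          if (count == MAX_RULES)
            return -1;
          out[count++] = in[i];
          continue;
        }

      for (j = 0; j < n; ++j)
        if (in[j].lhs == in[i].rhs[0])
          {
            if (count == MAX_RULES)
              return -1;
            out[count] = in[j];
            out[count].lhs = in[i].lhs;
            ++count;
          }
    }
  return count;
}
```

On the example above, the two exp rules become `exp: exp '+' exp' and
`exp: exp '&' exp'.  The arith and bool rules are copied unchanged and
are left unreachable, so a useless-symbol pass would still have to
remove them.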

* Stupid error messages
An example shows it easily:

src/bison/tests % ./testsuite -k calc,location,error-verbose -l
GNU Bison 1.49a test suite test groups:

 NUM: FILENAME:LINE      TEST-GROUP-NAME
      KEYWORDS

  51: calc.at:440        Calculator --locations --yyerror-verbose
  52: calc.at:442        Calculator --defines --locations --name-prefix=calc --verbose --yacc --yyerror-verbose
  54: calc.at:445        Calculator --debug --defines --locations --name-prefix=calc --verbose --yacc --yyerror-verbose
src/bison/tests % ./testsuite 51 -d
## --------------------------- ##
## GNU Bison 1.49a test suite. ##
## --------------------------- ##
 51: calc.at:440       ok
## ---------------------------- ##
## All 1 tests were successful. ##
## ---------------------------- ##
src/bison/tests % cd ./testsuite.dir/51
tests/testsuite.dir/51 % echo "()" | ./calc
1.2-1.3: parse error, unexpected ')', expecting error or "number" or '-' or '('

* yyerror, yyprint interface
It should be improved, in particular when using Bison features such as
locations, and YYPARSE_PARAM.  For the time being, it is recommended
to #define yyerror and yyprint to steal internal variables...

* read_pipe.c
This is not portable to DOS for instance.  Implement a more portable
scheme.  Sources of inspiration include GNU diff, and Free Recode.

* Memory leaks in the generator
A round of memory leak cleanups would be most welcome.  Dmalloc,
Checker GCC, Electric Fence, or Valgrind: choose your tool.

* Memory leaks in the parser
The same applies to the generated parsers.  In particular, this is
critical for user data: when aborting a parse, when handling the
error token etc., we often throw away yylval without giving the user
a chance to clean it up.
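
One possible shape for such a hook, sketched on a toy value stack.
The names user_destroy and pop_and_destroy, the symbol kinds, and the
YYSTYPE members are all hypothetical; the parser does not provide
anything like this today.

```c
#include <stdlib.h>

typedef union
{
  int ival;
  char *sval;
} YYSTYPE;                              /* illustrative only */

enum { SYM_NUMBER, SYM_STRING };        /* hypothetical symbol kinds */

static int freed_count;                 /* for demonstration only */

/* Hypothetical user hook: release whatever VALUE owns.  */
static void
user_destroy (int kind, YYSTYPE *value)
{
  if (kind == SYM_STRING)
    {
      free (value->sval);
      value->sval = NULL;
      ++freed_count;
    }
}

/* Pop up to N entries off a stack of height TOP, the way error
   recovery or an abort would, calling the hook on each discarded
   value.  Return the new height.  */
static int
pop_and_destroy (const int *kinds, YYSTYPE *values, int top, int n)
{
  while (n-- > 0 && top > 0)
    {
      --top;
      user_destroy (kinds[top], &values[top]);
    }
  return top;
}
```

The point is only that the parser knows the kind of each symbol it
discards, so it can hand both the kind and the value to user code.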

* --graph
Show reductions.	[]

* Broken options?
** %no-lines		[ok]
** %no-parser		[]
** %pure-parser		[]
** %semantic-parser	[]
** %token-table		[]
** Options which could use parse_dquoted_param ().
Maybe transferred to lex.c.
*** %skeleton		[ok]
*** %output		[]
*** %file-prefix	[]
*** %name-prefix	[]

** Skeleton strategy.	[]
Must we keep %no-parser?
	     %token-table?
*** New skeletons.	[]

* src/print_graph.c
Find the best graph parameters.	[]

* doc/bison.texinfo
** Update
information about ERROR_VERBOSE.	[]
** Add explanations about
skeleton muscles.	[]
%skeleton.		[]

* testsuite
** tests/pure-parser.at	[]
New tests.

* Debugging parsers

From Greg McGary:

akim demaille <akim.demaille@epita.fr> writes:

> With great pleasure!  Nonetheless, things which are debatable
> (or not, but just `big') should be discussed in `public': something
> like help- or bug-bison@gnu.org is just fine.  Jesse and I are there,
> but there is also Jim and some other people.

I have no idea whether it qualifies as big or controversial, so I'll
just summarize for you.  I proposed this change years ago and was
surprised that it was met with utter indifference!

This debug feature is for the programs/grammars one develops with
bison, not for debugging bison itself.  I find that the YYDEBUG
output comes in a very inconvenient format for my purposes.
When debugging gcc, for instance, what I want is to see a trace of
the sequence of reductions and the line#s for the semantic actions
so I can follow what's happening.  Single-step in gdb doesn't cut it
because to move from one semantic action to the next takes you through
lots of internal machinery of the parser, which is uninteresting.

The change I made was to the format of the debug output, so that it
comes out in the format of C error messages, digestible by emacs
compile mode, like so:

grammar.y:1234: foo: bar(0x123456) baz(0x345678)

where "foo: bar baz" is the reduction rule, whose semantic action
appears on line 1234 of the bison grammar file grammar.y.  The hex
numbers on the rhs tokens are the parse-stack values associated with
those tokens.  Of course, YYSTYPE might be something totally
incompatible with that representation, but for the most part, YYSTYPE
values are single words (scalars or pointers).  In the case of gcc,
they're most often pointers to tree nodes.  Come to think of it, the
right thing to do is to make the printing of stack values be
user-definable.  It would also be useful to include the filename &
line# of the file being parsed, but the main filename & line# should
continue to be that of grammar.y.
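
The trace format described above could be produced along these lines.
This is not Greg's patch, only a sketch: format_reduction is a made-up
helper, and it assumes every stack value fits in an unsigned long,
which, as noted, is not true of every YYSTYPE.

```c
#include <stdio.h>

/* Format one reduction as `FILE:LINE: LHS: SYM(0xVALUE) ...' into OUT
   (capacity OUTSIZE), so that Emacs compile mode can jump to the
   semantic action.  Return 0 on success, -1 on truncation.  */
static int
format_reduction (char *out, size_t outsize,
                  const char *file, int line, const char *lhs,
                  const char **rhs, const unsigned long *values, int nrhs)
{
  int i;
  int used = snprintf (out, outsize, "%s:%d: %s:", file, line, lhs);

  if (used < 0 || (size_t) used >= outsize)
    return -1;
  for (i = 0; i < nrhs; ++i)
    {
      size_t room = outsize - (size_t) used;
      int n = snprintf (out + used, room, " %s(0x%lx)", rhs[i], values[i]);

      if (n < 0 || (size_t) n >= room)
        return -1;
      used += n;
    }
  return 0;
}
```

A user-definable printer, as suggested, would replace the fixed
0x%lx conversion with a callback taking the symbol and its value.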

Anyway, this feature has saved my life on numerous occasions.  The way
I customarily use it is to first run bison with the traces on, isolate
the sequence of reductions that interests me, put those traces in a
buffer and force it into compile-mode, then visit each of those lines
in the grammar and set breakpoints with C-x SPACE.  Then, I can run
again under the control of gdb and stop at each semantic action.
With the hex addresses of tree nodes, I can inspect the values
associated with any rhs token.

You like?

* input synclines
Some users create their foo.y files, and equip them with #line.  Bison
should recognize these, and preserve them.
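
Recognizing such a line is simple enough; a sketch follows.  The
helper parse_syncline is hypothetical, accepts only the
`#line N "FILE"' spelling (not the `# N "FILE"' form emitted by some
preprocessors), and limits file names to 255 characters.

```c
#include <stdio.h>
#include <string.h>

/* If TEXT is a `#line N "FILE"' directive, store N in *LINE, copy
   FILE into FILE_OUT (capacity FILE_SIZE) and return 1.  Otherwise
   return 0 and leave the outputs untouched.  */
static int
parse_syncline (const char *text, int *line, char *file_out, size_t file_size)
{
  char name[256];
  int n;

  if (sscanf (text, " # line %d \"%255[^\"]\"", &n, name) != 2)
    return 0;
  if (n < 0 || strlen (name) + 1 > file_size)
    return 0;
  *line = n;
  strcpy (file_out, name);
  return 1;
}
```

Preserving the directive then means using the parsed file name and
line number, rather than those of foo.y, in the #line directives of
the generated parser.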

* BTYacc
See if we can integrate backtracking in Bison.  Contact the BTYacc
maintainers.

* Automaton report
Display more clearly the lookaheads for each item.

* RR conflicts
See if we can use precedence between rules to solve RR conflicts.  See
what POSIX says.

* Precedence
It is unfortunate that there is a total order for precedence.  It
makes it impossible to have modular precedence information.  We should
move to partial orders.

* Parsing grammars
Rewrite the reader in Bison.

* Problems with aliases
From: "Baum, Nathan I" <s0009525@chelt.ac.uk>
Subject: Token Alias Bug
To: "'bug-bison@gnu.org'" <bug-bison@gnu.org>

I've noticed a bug in bison. Sadly, our eternally wise sysadmins won't let
us use CVS, so I can't find out if it's been fixed already...

Basically, I made a program (in flex) that went through a .y file looking
for "..."-tokens, and then output a %token line for each one. For
single-character ""-tokens, I reasoned, I could just use
[%token 'A' "A"]. However, this causes Bison to output a [#define 'A' 65],
which cpp chokes on, not unreasonably. (And even if cpp didn't choke, I
obviously wouldn't want (char)'A' to be replaced with (int)65 throughout my
code.)

Bison normally forgoes outputting a #define for a character token. However,
it always outputs an aliased token -- even if the token is an alias for a
character token. We don't want that. The problem is in /output.c/, as I
recall. When it outputs the token definitions, it checks for a character
token, and then checks for an alias token. If the character token check is
placed after the alias check, then it works correctly.

Alias tokens seem to be something of a kludge. What about an [%alias "..."]
command...

	%alias T_IF "IF"

Hmm. I can't help thinking... What about a --generate-lex option that
creates an .l file for the alias tokens used... (Or an option to make a
gperf file, etc...)

* Presentation of the report file
From: "Baum, Nathan I" <s0009525@chelt.ac.uk>
Subject: Token Alias Bug
To: "'bug-bison@gnu.org'" <bug-bison@gnu.org>

I've also noticed something that, whilst not *wrong*, is inconvenient: I
use the verbose mode to help find the causes of unresolved shift/reduce
conflicts. However, this mode insists on starting the .output file with a
list of *resolved* conflicts, something I find quite useless. Might it be
possible to define a -v mode, and a -vv mode -- Where the -vv mode shows
everything, but the -v mode only tells you what you need for examining
conflicts? (Or, perhaps, a "*** This state has N conflicts ***" marker above
each state with conflicts.)


* $undefined
From Hans:
- If the Bison generated parser encounters an undefined number in the
character range, that character is written out in diagnostic messages, in
addition to the $undefined value.

Suggest: Change the name $undefined to undefined; looks better in outputs.

* Default Action
From Hans:
- For use with my C++ parser, I transported the "switch (yyn)" statement
that Bison writes to the bison.simple skeleton file. This way, I can remove
the current default rule $$ = $1 implementation, which causes a double
assignment to $$ which may not be OK under C++, replacing it with a
"default:" part within the switch statement.

Note that the default rule $$ = $1, when typed, is perfectly OK under C,
but in the C++ implementation I made, this rule is different from
$<type_name>$ = $<type_name>1. I therefore think that one should implement
a Bison option where every typed default rule is explicitly written out
(same typed rules can of course be grouped together).

* Pre and post actions.
From: Florian Krohm <florian@edamail.fishkill.ibm.com>
Subject: YYACT_EPILOGUE
To: bug-bison@gnu.org
X-Sent: 1 week, 4 days, 14 hours, 38 minutes, 11 seconds ago

The other day I had the need for explicitly building the parse tree. I
used %locations for that and defined YYLLOC_DEFAULT to call a function
that returns the tree node for the production. Easy. But I also needed
to assign the S-attribute to the tree node. That cannot be done in
YYLLOC_DEFAULT, because it is invoked before the action is executed.
The way I solved this was to define a macro YYACT_EPILOGUE that would
be invoked after the action. For reasons of symmetry I also added
YYACT_PROLOGUE. Although I had no use for that I can envision how it
might come in handy for debugging purposes.
All that is needed is to add

#if YYLSP_NEEDED
    YYACT_EPILOGUE (yyval, (yyvsp - yylen), yylen, yyloc, (yylsp - yylen));
#else
    YYACT_EPILOGUE (yyval, (yyvsp - yylen), yylen);
#endif

at the proper place in bison.simple. Ditto for YYACT_PROLOGUE.

I was wondering what you think about adding YYACT_PROLOGUE/EPILOGUE
to bison. If you're interested, I'll work on a patch.

-----

Copyright (C) 2001, 2002 Free Software Foundation, Inc.

This file is part of GNU Bison.

GNU Bison is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.

GNU Bison is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with Bison; see the file COPYING.  If not, write to
the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.