Akim Demaille [Sat, 16 Aug 2008 13:29:30 +0000 (15:29 +0200)]
Avoid trailing spaces.
* data/c.m4: b4_comment(TEXT): Don't indent empty lines.
* data/lalr1.cc: Don't indent before rule and symbol actions, as
they can be empty, and anyway this incorrectly indents the first
action.
Akim Demaille [Wed, 13 Aug 2008 11:18:22 +0000 (13:18 +0200)]
Adjust the verbose messages for use under Emacs.
* etc/bench.pl.in: Inform compilation-mode when we change the
directory.
(generate_grammar_list): Recognize %define "variant" in addition
to %define variant.
Akim Demaille [Tue, 12 Aug 2008 19:48:53 +0000 (21:48 +0200)]
Change the handling of the symbols in the skeletons.
Before we were using tables whose lines were the symbols and whose
columns were things like number, tag, type-name etc. It was
difficult to extend: each time a column was added, all the numbers had
to be updated (you asked for column $2, not for "tag"). Also, it was
hard to filter these tables when only a subset of the symbols (say the
tokens, or the nterms, or the tokens that have an external number
*and* a type-name) was of interest.
Now instead of monolithic tables, we define one macro per cell. For
instance "b4_symbol(0, tag)" is a macro name whose content is
self-descriptive. The macro "b4_symbol" provides easier access to
these cells.
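As a rough analogy only (the skeletons themselves are written in m4, not
C++), the move from positional columns to named cells can be sketched as a
lookup keyed by symbol number and field name. The class symbol_cells and
its members are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical C++ analogy of "one macro per cell".
// Each cell is stored under the key (symbol number, field name), so
// adding a new field never renumbers the existing ones.
class symbol_cells
{
public:
  void define(int symbol, const std::string& field, const std::string& value)
  {
    cells_[std::make_pair(symbol, field)] = value;
  }

  // Loose counterpart of b4_symbol(NUM, FIELD).
  // Returns an empty string when the cell is not defined.
  std::string get(int symbol, const std::string& field) const
  {
    auto it = cells_.find(std::make_pair(symbol, field));
    return it == cells_.end() ? std::string() : it->second;
  }

private:
  std::map<std::pair<int, std::string>, std::string> cells_;
};
```

A caller asks for a cell by name, for instance cells.get(0, "tag"), rather
than by column position.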
Akim Demaille [Thu, 7 Aug 2008 21:15:34 +0000 (23:15 +0200)]
Add %precedence support.
Unfortunately it is not possible to reuse the %prec directive. This
is because, to please POSIX, we do not require rules to end with a
semicolon. As a result,
foo: bar %prec baz
is ambiguous: either a rule whose precedence is that of baz, or a rule,
and then a declaration of the precedence of the token baz.
* doc/bison.texinfo: Document %precedence.
(Precedence Only): New.
* src/assoc.h, src/assoc.c (precedence_assoc): New.
* src/conflicts.c (resolve_sr_conflict): Support it.
* src/scan-gram.l, src/parse-gram.y (%precedence): New token.
Parse it.
* tests/calc.at: Use %precedence for NEG.
* tests/conflicts.at (%precedence does not suffice)
(%precedence suffices): New tests.
Akim Demaille [Sat, 2 Aug 2008 20:06:49 +0000 (22:06 +0200)]
bench.pl -d, --directive.
* etc/bench.pl.in (@directive): New.
(&bench_grammar): Use it.
(&bench_list_grammar): New, to provide access to the "variant"
grammar.
Use it.
(getopts): Support -d, --directive.
Akim Demaille [Sat, 2 Aug 2008 19:42:48 +0000 (21:42 +0200)]
Introduce a hierarchy for symbols.
* data/lalr1.cc (symbol_base_type, symbol_type): New.
(data_type): Rename as...
(stack_symbol_type): this.
Derive from symbol_base_type.
(yy_symbol_value_print_): Merge into...
(yy_symbol_print_): this.
Rename as...
(yy_print_): this.
(yydestruct_): Rename as...
(yy_destroy_): this.
(b4_symbols_actions, YY_SYMBOL_PRINT): Adjust.
(parser::parse): yyla is now of symbol_type.
Use its type member instead of yytoken.
Akim Demaille [Sat, 2 Aug 2008 12:18:48 +0000 (14:18 +0200)]
Handle semantic value and location together.
* data/lalr1.cc (b4_symbol_actions): Bounce $$ and @$ to
yydata.value and yydata.location.
(yy_symbol_value_print_, yy_symbol_print_, yydestruct_)
(YY_SYMBOL_PRINT): Now take semantic value and location as a
single arg.
Adjust all callers.
(yydestruct_): New overload for a stack symbol.
Rely on the state stack to display reduction traces.
To display rhs symbols before a reduction, we used information about the rule
reduced, which required the tables yyrhs and yyprhs. Now rely only on the
state stack to get the same information.
* data/lalr1.cc (b4_rhs_data, b4_rhs_state): New.
Use them.
(parser::yyrhs_, parser::yyprhs_): Remove.
(parser::yy_reduce_print_): Use the state stack.
* data/lalr1.cc (b4_lhs_value, b4_lhs_location): Adjust to using
yylhs.
(parse): Replace yyval and yyloc with yylhs.value and
yylhs.location.
After a user action, compute yylhs.state earlier.
(yyerrlab1): Do not play tricks with yylhs.location, rather, use a
fresh error_token.
Joel E. Denny [Fri, 7 Nov 2008 22:20:44 +0000 (17:20 -0500)]
Clean up %skeleton and %language priority implementation.
* src/getargs.c (skeleton_prio): Use default_prio rather than 2, and
remove static qualifier because others will soon need to see it.
(language_prio): Likewise.
(getargs): Use command_line_prio rather than 0.
* src/getargs.h (command_line_prio, grammar_prio, default_prio): New
enum fields.
(skeleton_prio): Extern it.
(language_prio): Extern it.
* src/parse-gram.y: Use grammar_prio rather than 1.
Pass command line location to skeleton_arg and language_argmatch.
* src/getargs.h, src/getargs.c (skeleton_arg, language_argmatch):
The location argument is now mandatory.
Adjust all dependencies.
(getargs): Use command_line_location.
Initialize the muscle table before parsing the command line.
* src/getargs.c (quotearg.h, muscle_tab.h): Include.
(getargs): Define file_name.
* src/main.c (main): Initialize muscle_tab before calling
getargs.
* src/muscle_tab.c (muscle_init): No longer define file_name, as
its value is not available yet.
* build-aux/cross-options.pl: The argument ends at the first
space, not the first non-symbol character.
Use @var for each word appearing in the argument description.
Destroy the variants that remain on the stack in case of error.
* data/lalr1-fusion.cc (yydestruct_): Invoke the variant's
destructor.
Display the value only if yymsg is nonnull.
(yyreduce): Invoke yydestruct_ when popping lhs symbols.
This is used to help the user catch cases where some value gets
overwritten by a new one. This should not happen, as this will
probably leak.
Unfortunately this uncovered a bug in the C++ parser itself: the
lookahead value was not destroyed between two calls to yylex. For
instance if the previous lookahead was a std::string, and then an int,
then the value of the std::string was correctly taken (i.e., the
lookahead was now an empty string), but the std::string structure itself
was not reclaimed.
This is now done in variant::build(other&) (which is used to take the
value of the lookahead): other not only has its value stolen, it
is also destroyed. This incurs a new performance penalty of a few
percent, and union becomes faster again.
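The "take the value, then destroy the source" behaviour described above can
be sketched as follows. This is a minimal illustration under stated
assumptions, not the skeleton's actual code: the names slot, build_from,
and tracked are hypothetical, and tracked merely counts live objects so
that a leak would be visible.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical value type that counts its live instances.
struct tracked
{
  static int live;
  int payload;
  explicit tracked(int p = 0) : payload(p) { ++live; }
  tracked(const tracked& o) : payload(o.payload) { ++live; }
  ~tracked() { --live; }
};
int tracked::live = 0;

// Hypothetical raw storage for one value; the caller knows the type held.
template <std::size_t Size>
class slot
{
public:
  // Default-construct a T in the buffer.
  template <typename T>
  T& build() { return *new (raw_) T(); }

  // Access the stored value as a T.
  template <typename T>
  T& as() { return *reinterpret_cast<T*>(raw_); }

  // Run the destructor of the stored T.
  template <typename T>
  void destroy() { as<T>().~T(); }

  // Take the value of other, then destroy other so nothing is left behind.
  template <typename T>
  void build_from(slot& other)
  {
    build<T>();
    std::swap(as<T>(), other.template as<T>());
    other.template destroy<T>();
  }

private:
  alignas(std::max_align_t) unsigned char raw_[Size];
};
```

Without the final destroy in build_from, the emptied source object would
still be alive after its value had been taken, which is the kind of leak
the entry describes.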
* data/lalr1-fusion.cc (variant::build(other&)): Destroy other.
(b4_variant_if): New.
(variant::built): New.
Use it wherever the status of the variant changes.
* etc/bench.pl.in: Check the penalty of %define assert.
Joel E. Denny [Tue, 4 Nov 2008 20:03:00 +0000 (15:03 -0500)]
Fix user actions without a trailing semicolon.
Reported by Sergei Steshenko at
<http://lists.gnu.org/archive/html/bug-bison/2008-11/msg00001.html>.
* THANKS (Sergei Steshenko): Add.
* src/scan-code.l (SC_RULE_ACTION): Fix it.
* tests/regression.at (Fix user actions without a trailing semicolon):
New test case.
* etc/bench.pl.in (&run, &generate_grammar): New.
Rename the grammar generating functions for consistency.
Change the interface so that the list of benches to run is passed
as (optionless) arguments.
(&compile): Use &run.
* data/lalr1-fusion.cc (b4_symbol_variant): Adjust additional
arguments.
(variant::build): New overload for
copy-construction-that-destroys.
(variant::swap): New.
(parser::yypush_): Use it in variant mode.
* data/bison.m4 (b4_copyright): Fix the indentation of the
copyright year paragraph.
Use b4_copyright_years when no years are given.
* data/lalr1.cc, data/lalr1-fusion.cc, data/location.cc
(b4_copyright_years): New.
Use it.
* doc/bison.texinfo (calc++.cc): Propagate failures to the exit
status.
* examples/calc++/test ($me, $number, $exit, run): New.
Use them to propagate errors to the exit status.
* etc/bench.pl.in (variant_grammar): Fix the computation of
$variant.
Generate a grammar file that can work with or without %debug.
Do use the @directive.
(bench_variant_parser): Check impact of %debug.
(@directives): Rename all the occurrences to...
(@directive): this, for consistency.
* etc/bench.pl.in: More doc.
Some fixes in the documentation.
($cflags, $iterations, &help, &getopt): New.
Use them.
(&variant_grammar): Let the number of stages be 10 times what is
specified.
* etc/bench.pl.in ($cxx, &variant_grammar, &bench_variant_parser):
New.
(&compile): Be ready to compile C++ parsers.
(&bench_push_parser): Move debug information to the outermost
level.
* THANKS: Add Michiel De Wilde.
bench.pl: Pass directives as a list instead of as a string.
* etc/bench.pl.in (&directives): New.
(&triangular_grammar, &calc_grammar): Use it to format the Bison
directives.
(&triangular_grammar): Do use the directives (were ignored).
(&bench_grammar, &bench_push_parser): Adjust to pass lists of
directives.
Akim Demaille [Wed, 22 Oct 2008 10:25:11 +0000 (05:25 -0500)]
Fuse the three stacks into a single one.
In order to make it easy to perform benchmarks to ensure that there is no
performance loss, lalr1.cc is forked into lalr1-fusion.cc. Eventually,
lalr1-fusion.cc will replace lalr1.cc.
Meanwhile, to make sure that lalr1-fusion.cc is correctly exercised by the
test suite, the user must install a symbolic link from lalr1.cc to it.
Instead of having three stacks (state, value, location), use a stack
of triples. This considerably simplifies the code (and it will be
easier not to require locations as the C++ parser currently does),
and also gives a 10% speedup according to etc/bench (probably mainly since
memory allocation is done once instead of three times).
Another motivation is to make it easier to properly destruct
semantic values: now that they are bound to their state (hence
symbol type) it will be easier to call the appropriate destructor.
These changes should probably benefit the C parser too.
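A minimal sketch of a stack of triples, assuming stand-in types: state_type,
semantic_type, location_type, stack_symbol, and fused_stack are hypothetical
names for this illustration and are not taken from the skeleton.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the parser's real types.
typedef int state_type;
typedef std::string semantic_type;
struct location_type { int begin; int end; };

// One entry holds what used to live on three separate stacks.
struct stack_symbol
{
  state_type state;
  semantic_type value;
  location_type location;
};

class fused_stack
{
public:
  // Push state, value, and location in a single operation.
  void push(state_type s, const semantic_type& v, const location_type& l)
  {
    stack_symbol sym = { s, v, l };
    symbols_.push_back(sym);
  }

  // Pop n entries, as reducing a rule with n rhs symbols would.
  void pop(std::size_t n = 1)
  {
    assert(n <= symbols_.size());
    symbols_.erase(symbols_.end() - n, symbols_.end());
  }

  // Entry i counted from the top (0 is the most recently pushed).
  const stack_symbol& operator[](std::size_t i) const
  {
    return symbols_[symbols_.size() - 1 - i];
  }

  std::size_t size() const { return symbols_.size(); }

private:
  std::vector<stack_symbol> symbols_;
};
```

Because each value travels with its state, code that pops an entry has the
information it needs to pick the matching destructor.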
* data/lalr1.cc: Copy as... * data/lalr1-fusion.cc: this new
file.
(b4_rhs_value, b4_rhs_location): New definitions overriding those
from c++.m4.
(state_stack_type, semantic_stack_type, location_stack_type)
(yystate_stack_, yysemantic_stack_, yylocation_stack_): Remove.
(data_type, stack_type, yystack_): New.
(YYLLOC_DEFAULT, yypush_): Adjust.
(yyerror_range): Now based on data_type, not location_type.
Akim Demaille [Wed, 22 Oct 2008 10:17:07 +0000 (05:17 -0500)]
Push the state, value, and location at the same time.
This is needed to prepare a forthcoming patch that fuses the three
stacks into one.
* data/lalr1.cc (parser::yypush_): New.
(parser::yynewstate): Change the semantics: instead of arriving at
this label when value and location have been pushed, but yystate
is to be pushed on the state stack, now the three of them must
have been pushed before. yystate still must be the new state.
This allows using yypush_ everywhere instead of individual
handling of the stacks.
Akim Demaille [Wed, 22 Oct 2008 09:16:34 +0000 (04:16 -0500)]
Prefer references to pointers.
* data/lalr1.cc (b4_symbol_actions): New, overrides the default C
definition to use references instead of pointers.
(yy_symbol_value_print_, yy_symbol_print_, yydestruct_):
Take the value and location as references.
Adjust callers.
Akim Demaille [Tue, 21 Oct 2008 23:00:29 +0000 (18:00 -0500)]
Use variants to support objects as semantic values.
This patch was inspired by work by Michiel De Wilde. But he used Boost
variants, which (i) require Boost on the user side, (ii) are slow, and
(iii) have useless overhead (the parser knows the type of the semantic value,
so there is no reason to duplicate this information as Boost.Variants do).
This implementation reserves a buffer large enough to store the largest
objects. yy::variant implements this buffer. It was implemented with
Quentin Hocquet.
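The buffer idea can be sketched as below. This is an illustration under
stated assumptions rather than the code of yy::variant: the class name
buffer_variant and the member names build, as, and destroy are chosen for
the example. The buffer records no type tag, since the caller (the parser)
is assumed to know which type is stored.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical untagged variant: raw storage sized for the largest type.
template <std::size_t Size>
class buffer_variant
{
public:
  // Copy-construct a T in the buffer with placement new.
  template <typename T>
  T& build(const T& t)
  {
    static_assert(sizeof(T) <= Size, "buffer too small for this type");
    return *new (raw_) T(t);
  }

  // Access the stored value; the caller must name the right type.
  template <typename T>
  T& as() { return *reinterpret_cast<T*>(raw_); }

  // Run the destructor of the stored T; the buffer itself is reused.
  template <typename T>
  void destroy() { as<T>().~T(); }

private:
  alignas(std::max_align_t) unsigned char raw_[Size];
};
```

With a buffer sized as sizeof(std::string), the same object can hold an int,
be destroyed, and then hold a std::string, provided every build is paired
with a destroy of the same type.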
* src/output.c (type_names_output): New.
(output_skeleton): Invoke it.
* data/c++.m4 (b4_variant_if): New.
(b4_symbol_value): If needed, provide a definition for variants.
* data/lalr1.cc (b4_symbol_value, b4_symbol_action_)
(b4_symbol_variant, _b4_char_sizeof_counter, _b4_char_sizeof_dummy)
(b4_char_sizeof, yy::variant): New.
(parser::parse): If variants are requested, define
parser::union_type, parser::variant, change the definition of
semantic_type, construct $$ before running the user action instead
of performing a default $$ = $1.
* examples/variant.yy: New.
Based on an example by Michiel De Wilde.
Joel E. Denny [Sun, 2 Nov 2008 21:55:14 +0000 (16:55 -0500)]
Prepare for next release.
* NEWS: Briefly mention changes since 2.3b.
* README: Say GNU m4 1.4.6, which we've been requiring in release
announcements already, not 1.4.3, which breaks the build.
Joel E. Denny [Sun, 2 Nov 2008 21:54:45 +0000 (16:54 -0500)]
Say %language is experimental.
We're thinking of extending its effect on output file naming. See the
thread at
<http://lists.gnu.org/archive/html/bison-patches/2008-10/msg00003.html>.
* NEWS: Say it's experimental.
* doc/bison.texinfo (Decl Summary): Say it's experimental, and so don't
recommend it over %skeleton for now.
(Bison Options): Likewise.
(C++ Bison Interface): Use %skeleton not %language.
(Calc++ Parser): Use %skeleton not %language.
* src/getargs.c (usage): Say it's experimental.