Akim Demaille [Fri, 1 Feb 2013 16:52:01 +0000 (17:52 +0100)]
style: use a for loop instead of a while loop, and scope reduction
* src/reader.c (packgram): Improve readability.
The parser calls grammar_current_rule_end at the end of every rhs,
which adds a NULL to separate the rules. So there is no need to
check whether "p" is non-null before proceeding.
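The idea can be sketched as follows. This is only an illustration, not the code of packgram: the item type, the flat vector, and count_rules are hypothetical names, and the one assumption carried over from the entry is that each rule's right-hand side is followed by a null separator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a grammar item: each rule's symbols are
// followed by a null pointer that marks the end of that rule.
struct item { const char* name; };

// Count the rules in a flat, null-separated list of items.
inline std::size_t count_rules(const std::vector<const item*>& items)
{
  std::size_t rules = 0;
  for (std::size_t i = 0; i < items.size(); ++i)
    {
      // Walk the right-hand side up to its terminating null.
      for (; i < items.size() && items[i]; ++i)
        continue;
      // The outer loop's ++i then steps over the separator.
      ++rules;
    }
  return rules;
}
```

Both loops declare or advance their index in the for header, which keeps the scope of the iteration variable as small as possible.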
* tests/c++.at (C++ Variant-based Symbols, Variants): Here. Rename the
generated input files to use .y instead of .yy, as a requirement for using
AT_FULL_COMPILE instead of a combination of AT_BISON_CHECK and
AT_BISON_COMPILE_CXX.
This is based on what is recommended by both Scott Meyers, in 'Effective
C++', and Andrei Alexandrescu and Herb Sutter in 'C++ Coding Standards'.
Use a static_cast on void* rather than directly using a reinterpret_cast,
which can have nefarious effects on objects. However, even though following
this guideline is good practice in general, I am not quite sure how relevant
it is when applied to conversions from POD to objects. Actually, it might
very well be the opposite: isn't this exactly what reinterpret_cast is for?
What we really want *is* to transmit the memory map as a series of bytes,
which, if I am correct, falls into the kind of "low level" hack for which
this cast is meant.
In any case, this silences the warning, which will be greatly appreciated by
anyone using variants with a compiler supporting -fstrict-aliasing.
* data/variant.hh (as): Here.
* tests/c++.at (Exception safety, C++ Variant-based Symbols, Variants):
Don't use NO_STRICT_ALIAS_CXXFLAGS (revert commit ddb9db15), as type punning
is no longer an issue.
* tests/atlocal.in, configure.ac (NO_STRICT_ALIAS_CXXFLAGS): Remove
definition.
* examples/local.mk (NO_STRICT_ALIAS_CXXFLAGS): Remove from AM_CXXFLAGS.
* doc/bison.texi: Don't mention type punning issues.
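A minimal sketch of the cast discussed above, under stated assumptions: mini_variant, build, as, and destroy are illustrative names modelled loosely on a buffer-based variant, not the actual contents of data/variant.hh. The point is only that going through void* lets as() use static_cast.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical miniature of a variant: raw, aligned storage of S bytes.
template <std::size_t S>
struct mini_variant
{
  // Construct a T in the buffer by copy.
  template <typename T>
  T& build(const T& t)
  {
    return *new (as_void()) T(t);
  }

  // Access the stored T. Going through void* needs only static_cast.
  template <typename T>
  T& as()
  {
    return *static_cast<T*>(as_void());
  }

  // Destroy the stored T; the caller must know the stored type.
  template <typename T>
  void destroy()
  {
    as<T>().~T();
  }

private:
  void* as_void() { return buffer.raw; }

  union
  {
    long double align_me;  // forces a generous alignment
    char raw[S];
  } buffer;
};
```

The variant itself does not remember what it stores, so each of build, as, and destroy must be instantiated with the same type by the caller.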
Reformulate and give more details on my thoughts concerning the graphical
visualization, and add an entry about a bug in the options processing for
warnings as errors.
Akim Demaille [Fri, 1 Feb 2013 13:24:48 +0000 (14:24 +0100)]
location: pass the location first
* src/location.h, src/location.c (location_print): For consistency
with other data structures and other location_* routines, pass the
location argument first.
* src/complain.c: Adjust.
(location_caret): Likewise.
* src/parse-gram.y: Adjust.
Valentin Tolmer [Wed, 30 Jan 2013 10:30:15 +0000 (11:30 +0100)]
warnings: introduce -Wprecedence
The new warning category "precedence" flags useless precedence and
associativity. -Wprecedence can now be used; it is disabled by default.
The warnings about precedence and associativity are grouped into one, and
the testsuite was corrected accordingly.
* src/complain.h (warnings): Introduce "precedence".
* src/complain.c (warnings_print_categories): Adjust.
* src/getargs.c (warnings_args, warning_types): Likewise.
* src/symtab.h, src/symtab.c (print_associativity_warnings): Remove.
* src/symtab.h (register_assoc): Correct arguments.
* src/symtab.c (print_precedence_warnings): Print both warnings together.
* doc/bison.texi (Bison options): Document the warnings and provide an
example.
* tests/conflicts.at, tests/existing.at, tests/local.at,
* tests/regression.at: Adapt the testsuite for the new category
(-Wprecedence instead of -Wother where appropriate).
Akim Demaille [Wed, 30 Jan 2013 14:52:34 +0000 (15:52 +0100)]
build: avoid clang's colored diagnostics in the test suite
The syncline tests, which try to recognize compiler diagnostics,
are confused by escapes for colors.
* configure.ac (warn_tests): New, to factor the warnings for both
C and C++ tests.
Add -fno-color-diagnostics to it.
* tests/local.at (AT_TEST_TABLES_AND_PARSE): Do not glue together
compiler flags.
Akim Demaille [Wed, 30 Jan 2013 14:28:08 +0000 (15:28 +0100)]
build: please Clang++ 3.2+ on Flex scanners
Clang++, with -Wall, rejects code generated by Flex (for C scanners):
CXX examples/calc++/examples_calc___calc__-calc++-scanner.o
In file included from examples/calc++/calc++-scanner.cc:1:
error: implicit conversion of NULL constant to 'bool' [-Werror,-Wnull-conversion]
if ( ! ( (yy_buffer_stack) ? (yy_buffer_stack)[(yy_buffer_stack_top)] : __null) ) {
~ ^~~~~~
false
* configure.ac (WARN_NO_NULL_CONVERSION_CXXFLAGS): Compute it.
* examples/calc++/local.mk (examples_calc___calc___CXXFLAGS): Use it.
Valentin Tolmer [Tue, 29 Jan 2013 15:27:04 +0000 (16:27 +0100)]
grammar: record used associativity and print useless ones
Record which symbol associativity is used, and display useless ones.
* src/symtab.h, src/symtab.c (register_assoc, print_assoc_warnings): New.
* src/symtab.c (init_assoc, is_assoc_used): New.
* src/main.c: Use print_assoc_warnings.
* src/conflicts.c: Use register_assoc.
* tests/conflicts.at (Useless associativity warning): New.
Due to the new warning, many tests had to be updated.
* tests/conflicts.at tests/existing.at tests/regression.at:
Add the associativity warning in the expected results.
* tests/java.at: Fix the java calculator's grammar to remove a useless
associativity.
* doc/bison.texi (mfcalc example): Fix associativity to remove
warning.
Valentin Tolmer [Tue, 29 Jan 2013 13:55:53 +0000 (14:55 +0100)]
grammar: warn about unused precedence for symbols
Symbols with precedence but no associativity, and whose precedence is
never used, can be declared with %token instead. The used precedence
relationships are recorded and a warning about useless ones is issued.
* src/conflicts.c (resolve_sr_conflict): Record precedence relation.
* src/symtab.c, src/symtab.h (prec_nodes, init_prec_nodes)
(symgraphlink_new, register_precedence_second_symbol)
(print_precedence_warnings): New.
Record relationships in a graph and warn about useless ones.
* src/main.c (main): Print precedence warnings.
* tests/conflicts.at: New.
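The bookkeeping described in this entry can be sketched as below. This is a simplified illustration, not the code in src/symtab.c: precedence_tracker and its members are hypothetical, and symbols are represented by plain strings.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tracker: remembers which symbols had their precedence
// compared while resolving shift/reduce conflicts.
class precedence_tracker
{
public:
  // Declare that SYM was given a precedence level.
  void declare(const std::string& sym) { declared_.push_back(sym); }

  // Record that the precedence of HIGHER won over that of LOWER.
  void record(const std::string& higher, const std::string& lower)
  {
    used_.insert(higher);
    used_.insert(lower);
    edges_.insert(std::make_pair(higher, lower));
  }

  // Whether the relation HIGHER > LOWER was recorded.
  bool related(const std::string& higher, const std::string& lower) const
  {
    return edges_.count(std::make_pair(higher, lower)) != 0;
  }

  // Symbols whose precedence was declared but never used, in
  // declaration order.
  std::vector<std::string> useless() const
  {
    std::vector<std::string> res;
    for (const std::string& s : declared_)
      if (!used_.count(s))
        res.push_back(s);
    return res;
  }

private:
  std::vector<std::string> declared_;
  std::set<std::string> used_;
  std::set<std::pair<std::string, std::string> > edges_;
};
```

A symbol reported by useless() is a candidate for a plain %token declaration.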
When using %define parse.assert, the variants come with additional variables
that are useful for development purposes. One is a Boolean indicating if the
variant is built (to make sure we don't read a non-built variant), and the
other is a string describing the stored type. There is no need to have both of
these; the string is enough.
The constructor for symbol_type doesn't take an ::std::string& as
argument, but a constant variant. However, because there is a variant
constructor which takes an ::std::string&, this caused the implicit
construction of a built variant. Considering that the variant argument
for the symbol_type constructor was cv-qualified, this temporary variant
was never destroyed.
As a temporary solution, the symbol was built in two stages:
symbol_type res (token::TOK_TEXT);
res.value.build< ::std::string&> (v);
return res;
However, the solution introduced in this patch contributes to letting
the symbols handle themselves, by supplying them with constructors that
take a non-variant value and build the symbol's own variant with that
value.
* data/variant.hh (b4_symbol_constructor_define_): Use the new
constructors rather than building in a temporary symbol.
(b4_basic_symbol_constructor_declare,
b4_basic_symbol_constructor_define): New macros generating the
constructors.
* data/c++.m4 (basic_symbol): Invoke the macros here.
Akim Demaille [Tue, 29 Jan 2013 07:52:57 +0000 (08:52 +0100)]
c++: please G++ 4.8 with -O3: array bounds
* data/c++.m4, data/lalr1.cc (by_state, by_type): Do not use -1 to
denote the absence of value, as GCC then fears that this -1 might
be used to dereference arrays (such as yytname).
Use 0, which corresponds to $accept, which is valueless (the needed
property: the symbol destructor must not try to reclaim the memory
associated with the symbol).
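A small sketch of the convention, with made-up names: the names table and kind_holder are hypothetical, and only the choice of 0 (the slot playing the role of $accept) as the "no value" marker comes from the entry above.

```cpp
#include <cassert>
#include <string>

// Hypothetical symbol-name table; index 0 plays the role of "$accept".
static const char* const names[] = { "$accept", "NUM", "'+'" };

// A kind holder that uses 0, not -1, to mean "no value": the "empty"
// marker is itself a valid index into names.
struct kind_holder
{
  kind_holder() : kind(0) {}
  explicit kind_holder(unsigned k) : kind(k) {}
  bool has_value() const { return kind != 0; }
  std::string name() const { return names[kind]; }
  unsigned kind;
};
```

With -1 as the marker, a compiler that propagates constants may conclude that names[-1] is reachable and warn about it.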
Akim Demaille [Tue, 29 Jan 2013 07:16:15 +0000 (08:16 +0100)]
c++: use more explicit types than int
* data/c++.m4 (b4_public_types_declare): Declare token_number_type soon.
Introduce symbol_number_type (wider than token_number_type).
Clarify the requirement that kind_type from by_state and by_type
denote the _input_ type (required by the constructor), not the stored type.
Use symbol_number_type and token_number_type where appropriate, instead
of int.
* data/lalr1.cc: Adjust to these changes.
Propagate "symbol_number_type".
Invoke "type_get ()" instead of reading "type" directly.
Akim Demaille [Mon, 28 Jan 2013 17:27:15 +0000 (18:27 +0100)]
doxygen: upgrade Doxyfile, and complete it
* doc/Doxyfile.in: Let doxygen upgrade it.
(INCLUDE_PATH): Point to lib too.
(PROJECT_BRIEF): New.
(EXCLUDE): Update to reflect the current file hierarchy.
Akim Demaille [Mon, 28 Jan 2013 16:17:12 +0000 (17:17 +0100)]
maint: fix syntax-check issues
* cfg.mk: Ignore strcmp in local.at.
* tests/conflicts.at: Use AT_PARSER_CHECK.
* tests/regression.at: Preserve the exit status of the generated parsers.
* tests/local.mk ($(TESTSUITE)): Map @tb@ to a tabulation.
* tests/c++.at, tests/input.at, tests/regression.at: Use @tb@.
* cfg.mk: (space-tab): There are no longer exceptions.
Many 'inline' keywords were in the declarations. They rather belong in
definitions, so move them.
* data/c++.m4 (basic_symbol, by_type): Many inlines here.
* data/lalr1.cc (yytranslate_, yy_destroy_, by_state, yypush_, yypop_): Inline
these as well.
(move): Move the definition outside the struct, where it belongs.
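The placement of the keyword can be illustrated as follows. The counter class is a hypothetical example, not taken from the skeletons.

```cpp
#include <cassert>

// Declarations: no 'inline' here.
struct counter
{
  explicit counter(int start);
  int next();
private:
  int value_;
};

// Definitions outside the class carry the 'inline' keyword, so that
// the header can be included in several translation units.
inline counter::counter(int start)
  : value_(start)
{}

inline int counter::next()
{
  return value_++;
}
```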
Akim Demaille [Mon, 28 Jan 2013 15:05:09 +0000 (16:05 +0100)]
tests: check that using variants is exception safe
* tests/local.at: (Slightly) improve the regexp by escaping '.'
when it denotes a point.
(AT_VARIANT_IF): New.
* tests/c++.at (Exception Safety): Run it for variants too.
Akim Demaille [Mon, 28 Jan 2013 13:56:16 +0000 (14:56 +0100)]
c++: remove now-useless operators
Now that symbols behave properly, we can eliminate special routines
that are no longer needed.
* data/c++.m4, data/glr.cc, data/lalr1.cc, data/variant.hh:
Remove useless assignment operators and copy constructors.
As a consequence, remove useless includes for "abort".
Akim Demaille [Mon, 28 Jan 2013 13:29:43 +0000 (14:29 +0100)]
c++: revamp the support for variants
The current approach was too ad hoc: the symbols were not sufficiently
self-contained, in particular wrt memory management. The "new"
guideline is the one that should have been followed from the start:
let the symbols handle themselves, instead of leaving their users to
it. It was justified by the will to avoid gratuitous moves and
copies, but the current approach does not seem to be slower, yet it
will probably be simpler to adjust to support move semantics from
C++11.
The documentation says that the %parse-param are available from the
%destructor. In retrospect, that was a silly design decision, which
we can break for variants, as it's a new feature. It should be phased
out for non-variants too.
* data/variant.hh: A variant never knows if it stores something or
not, it is up to its users to store this information.
Yet, in parse.assert mode, make sure the empty/filled variants
are properly used.
(b4_symbol_constructor_define_): Don't call directly the symbol
constructor, to save a useless temporary.
* data/stack.hh (push): Steal the pushed value instead of duplicating
it.
This will simplify the callers of push, who handled this "move"
approach themselves.
* data/c++.m4 (basic_symbol): Let -1, as kind, denote the fact that
a symbol is empty.
This is needed for instance when shifting the lookahead: yyla
is given as argument to "push", and its value is then moved on
the stack. But then yyla must be declared "empty" so that its
destructor won't be called.
(basic_symbol::move): New.
Move the responsibility of calling the destructor from yy_destroy
to ~basic_symbol in the case of variants.
* data/lalr1.cc (stack_symbol_type): Now a derived class from its
previous value, so that we can add a constructor from a symbol_type.
(by_state): State -1 means empty.
(yypush_): Factor, by calling one overload from the other one, and
using the new semantics of stack::push.
No longer reclaim by hand the memory from rhs symbols, since now
that we store objects with proper destructors, they will be reclaimed
automatically.
Conversely, be sure to delete yylhs.
* tests/c++.at (C++ Variant-based Symbols): New "unit" test for
symbols.
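The "steal on push" behavior and the use of -1 as the empty kind can be sketched like this. The symbol and symbol_stack types are hypothetical simplifications (a std::string stands in for the semantic value), not the generated classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical symbol: a kind plus a value; kind -1 means "empty".
struct symbol
{
  symbol() : kind(-1) {}
  symbol(int k, const std::string& v) : kind(k), value(v) {}

  bool empty() const { return kind == -1; }

  // Steal the contents of THAT, leaving it empty.
  void move(symbol& that)
  {
    kind = that.kind;
    value.swap(that.value);
    that.kind = -1;
    that.value.clear();
  }

  int kind;
  std::string value;
};

// Hypothetical stack whose push steals the pushed symbol instead of
// duplicating it.
struct symbol_stack
{
  void push(symbol& s)
  {
    seq.push_back(symbol());
    seq.back().move(s);
  }
  std::vector<symbol> seq;
};
```

After a push, the caller's symbol is empty, which is the situation described above for yyla once the lookahead has been shifted.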
Valentin Tolmer [Fri, 25 Jan 2013 10:12:47 +0000 (11:12 +0100)]
grammar: preserve token declaration order
In a declaration %token A B, the token A is declared before B, but in %left
A B (or with %precedence or %nonassoc or %right), the token B was declared
before A (tokens were declared in reverse order).
* src/symlist.h, src/symlist.c (symbol_list_append): New.
* src/parse-gram.y: Use it instead of symbol_list_prepend.
* tests/input.at: Adjust expectations.
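Why prepending reversed the declaration order can be seen on a toy list. The sym_node type and both functions are hypothetical, not the symbol_list API of src/symlist.c.

```cpp
#include <cassert>
#include <string>

// Hypothetical singly linked list of symbol names.
struct sym_node
{
  std::string name;
  sym_node* next;
};

// Prepending returns the new head; successive calls reverse the order.
inline sym_node* list_prepend(sym_node* list, sym_node* node)
{
  node->next = list;
  return node;
}

// Appending walks to the end, so successive calls keep the order.
inline sym_node* list_append(sym_node* list, sym_node* node)
{
  node->next = nullptr;
  if (!list)
    return node;
  sym_node* last = list;
  while (last->next)
    last = last->next;
  last->next = node;
  return list;
}
```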
Akim Demaille [Fri, 25 Jan 2013 10:06:32 +0000 (11:06 +0100)]
tests: improve test group titles
* tests/local.at (AT_SETUP_STRIP): AT_SETUP does not behave properly
with new-lines in its argument.
Remove them.
Fix the handling of %define with quotes.
Akim Demaille [Fri, 25 Jan 2013 12:51:33 +0000 (13:51 +0100)]
c: no longer require stdio.h when locations are enabled
Recent changes (in 2.7) introduced a dependency on both FILE and
fprintf, which are "available" only in %debug mode. This was to
define yy_location_print_, which is used only in %debug mode by the
parser, but massively used by the test suite to output the locations
in yyerror.
Break this dependency: the test suite should define its own routines
to display the locations. Eventually Bison will provide the user with
a means to display locations, but not yet.
* data/c.m4 (b4_yy_location_print_define): Use YYFPRINTF instead of
fprintf directly.
* data/yacc.c (b4_yy_location_print_define): Invoke it only in %debug
mode, so that stdio.h is included (needed for FILE*), and YYFPRINTF
is defined.
* tests/local.at (AT_YYERROR_DECLARE, AT_YYERROR_DEFINE): Declare
and define location_print and LOCATION_PRINT.
* tests/actions.at, tests/existing.at, tests/glr-regression.at,
* tests/input.at, tests/named-refs.at, tests/regression.at: Adjust
to use them.
Fix the expected line numbers (as the prologue's length has changed).
* data/location.cc (operator<<): Display location exactly as is
done in C skeletons.
* tests/local.at (AT_LOC_PUSHDEF, AT_LOC_POPDEF): Also define
AT_FIRST_LINE, AT_LAST_LINE, AT_FIRST_COLUMN, AT_LAST_COLUMN.
* tests/actions.at (Location Print): Also check C++ skeletons.
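A test suite that defines its own location printer might use something like the sketch below. The location struct, the function name, and the "line.column-line.column" output format are assumptions made for illustration; they are not guaranteed to match the skeletons' exact output.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical location: a range of line/column positions. The last
// column is taken to be one past the end of the range.
struct location
{
  int first_line, first_column, last_line, last_column;
};

// Render LOC as "L.C", "L.C-C2" or "L.C-L2.C2" (assumed format).
inline std::string location_to_string(const location& loc)
{
  std::ostringstream o;
  int end_col = loc.last_column != 0 ? loc.last_column - 1 : 0;
  o << loc.first_line << '.' << loc.first_column;
  if (loc.first_line < loc.last_line)
    o << '-' << loc.last_line << '.' << end_col;
  else if (loc.first_column < end_col)
    o << '-' << end_col;
  return o.str();
}
```

Because it only builds a string, such a routine has no dependency on FILE or fprintf.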
Akim Demaille [Mon, 21 Jan 2013 15:01:53 +0000 (16:01 +0100)]
tests: generalize default main for api.namespace
* tests/local.at (AT_NAME_PREFIX): Also match api.namespace.
(AT_MAIN_DEFINE): Take it into account.
* tests/c++.at, tests/headers.at: Use AT_NAME_PREFIX.
(AT_CHECK_NAMESPACE): Rename as...
(AT_TEST): this.
Akim Demaille [Mon, 21 Jan 2013 14:38:49 +0000 (15:38 +0100)]
tests: improve factoring of the main function
* tests/local.at (AT_MAIN_DEFINE): If %debug is used, check if
-d/--debug is passed to the generated parser, and enable the traces.
Return exactly the result of yyparse, so that we can check exit code
2 too.
* tests/actions.at, tests/glr-regression.at, tests/regression.at:
Use AT_MAIN_DEFINE, helping AT_BISON_OPTION_PUSHDEFS where needed,
preferably to option -t.
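The behavior expected from such a main can be sketched as follows. This is not the expansion of AT_MAIN_DEFINE: driver_main, parse_status, and the stubbed yyparse are hypothetical stand-ins for what a generated parser would provide.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for what a generated parser would provide.
static int yydebug = 0;
static int parse_status = 0;
static int yyparse() { return parse_status; }  // stub

// What a test driver's main could do: enable traces on -d/--debug and
// return exactly what yyparse returned.
inline int driver_main(int argc, const char* const argv[])
{
  for (int i = 1; i < argc; ++i)
    if (!std::strcmp(argv[i], "-d") || !std::strcmp(argv[i], "--debug"))
      yydebug = 1;
  return yyparse();
}
```

Returning the result unchanged is what lets a test distinguish exit code 2 from exit code 1.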
There used to be a bug in some skeletons, which caused the expansion of
'yylval' and 'yylloc', generating these errors:
input.cc:547:16: error: expected ',' or '...' before '(' token
#define yylval (yystackp->yyval)
^
input.yy:29:39: note: in expansion of macro 'yylval'
int yylex (yy::parser::semantic_type *yylval)
^
This bug is fixed by 'skel: better aliasing of identifiers', but a workaround
is useful when benchmarking against older versions of Bison, which are still
affected by the bug.
* etc/bench.pl.in: Rename yylval to yylvalp and yylloc to yyllocp in base
grammar 'list'.
* data/c++.m4 (basic_symbol): Keep 'inline' in the prototypes, but don't
duplicate it in the implementation.
* data/variant.hh (variant): 'inline' is not needed when the implementation is
provided in the class definition.
* src/getargs.c (feature_flag): Here.
* tests/local.at (AT_BISON_CHECK_, AT_BISON_CHECK_NO_XML): Deactivate carets
for the testsuite, by default.
* tests/input.at: Adjust the locations for command line definitions.
* data/variant.hh (variant, operator=): Make private.
* data/c++.m4 (operator=): New, to avoid needing a definition of that operator
for each class member (such as a possible variant).
* data/glr.cc, data/lalr1.cc: Add the necessary include for the abort.
A "symbol" groups together the symbol type (INT, PLUS, etc.), its
possible semantic value, and its optional location. The type is
needed to access the value, as it is stored as a variant/union.
There are two kinds of symbols. "symbol_type" are "external symbols":
they have type, value and location, and are returned by yylex.
"stack_symbol_type" are "internal symbols", they group state number,
value and location, and are stored in the parser stack. The type of
the symbol is computed from the state number.
The class template symbol_base_type<Exact> factors the code common to
stack_symbol_type and symbol_type. It uses the Curiously Recurring
Template pattern so that we can always (static_) downcast to the exact
type. symbol_base_type features value and location, and delegates the
handling of the type to its parameter.
When trying to generalize the support for variant, a significant issue
was revealed: because stack_symbol_type and symbol_type _derive_ from
symbol_base_type, the type/state member is defined _after_ the value
and location. Since in C++ the order of the definition of the members
defines the order in which they are initialized, things go backward:
the value is initialized _before_ the type. This is wrong, since the
type is needed to access the value.
Therefore, we need another means to factor the common code, one that
ensures the order of the members.
The idea is simple: define two (base) classes that code the symbol
type ("by_type" codes it by its type, and "by_state" by the state
number). Define basic_symbol<Base> as the class template that
provides value and location support. Make it _derive_ from its
parameter, by_type or by_state. Then define stack_symbol_type and
symbol_type as basic_symbol<by_state>, basic_symbol<by_type>. The
name basic_symbol was chosen by similarity with basic_string and
basic_ostream.
* data/c++.m4 (symbol_base_type<Exact>): Remove, replace by...
(basic_symbol<Base>): which derives from its parameter, one of...
(by_state, by_type): which provide means to retrieve the actual type of
symbol.
(symbol_type): Is now basic_symbol<by_type>.
(stack_symbol_type): Is now basic_symbol<by_state>.
* data/lalr1.cc: Many adjustments.
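The layout described above can be sketched in a few lines. The class names follow the entry, but the members are simplified assumptions: an int stands in for the location, a std::string for the value, and the state-to-kind mapping in by_state is a made-up placeholder.

```cpp
#include <cassert>
#include <string>

// Codes the symbol kind directly.
struct by_type
{
  explicit by_type(int t) : type(t) {}
  int type_get() const { return type; }
  int type;
};

// Codes the symbol kind through a parser state. The state-to-kind
// mapping below is a made-up placeholder.
struct by_state
{
  explicit by_state(int s) : state(s) {}
  int type_get() const { return state % 3; }
  int state;
};

// Value and location support. Because Base is a base class, it is
// initialized before the 'value' and 'location' members.
template <typename Base>
struct basic_symbol : Base
{
  basic_symbol(int k, const std::string& v, int loc)
    : Base(k), value(v), location(loc)
  {}
  std::string value;
  int location;
};

typedef basic_symbol<by_type> symbol_type;
typedef basic_symbol<by_state> stack_symbol_type;
```

Since base classes are always initialized before non-static data members, the kind is available by the time the value is constructed, whichever base is chosen.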
Akim Demaille [Mon, 31 Dec 2012 15:21:34 +0000 (16:21 +0100)]
doc: use deffn to declare the list of %define variables
* doc/bison.texi (%define Summary): Use @deffn instead of @table, it
spares a lot of width, especially in PDF, and looks nicer in the other
formats too.
It is also more consistent with the rest of the document.
Akim Demaille [Fri, 28 Dec 2012 10:25:02 +0000 (11:25 +0100)]
syncline: one line is enough
So far we were issuing two lines for each syncline change:
/* Line 356 of yacc.c */
#line 1 "src/parse-gram.y"
This is a lot of clutter, especially when reading diffs, as these
lines change often. Fuse them into a single, shorter, line:
#line 1 "src/parse-gram.y" /* yacc.c:356 */
* data/bison.m4 (b4_syncline): Issue a single line.
Comment improvements.
(b4_sync_start, b4_sync_end): Issue a shorter comment.
* data/c++.m4 (b4_semantic_type_declare): b4_user_code must be
on its own line as it might start with a "#line" directive.
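The fused format can be reproduced with a small helper. The function below is a hypothetical illustration in C++, not the m4 code of b4_syncline.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Build a one-line syncline: the #line directive followed by a comment
// naming the place in the skeleton that emitted it.
inline std::string syncline(int line, const std::string& file,
                            const std::string& skel, int skel_line)
{
  std::ostringstream o;
  o << "#line " << line << " \"" << file << "\" /* "
    << skel << ':' << skel_line << " */";
  return o.str();
}
```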
Akim Demaille [Fri, 28 Dec 2012 09:04:49 +0000 (10:04 +0100)]
graph: minor simplification
* src/gram.c (print_lhs): Use %*s to indent.
* src/print_graph.c (print_lhs): Use obstack_printf.
Became simple enough to be inlined in...
(print_core): here.
Use a "rule*" instead of an index in "rules[]".
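The "%*s" idiom takes the field width from an argument; printing an empty string with it yields that many spaces. A minimal sketch, where indent is a hypothetical helper and not the code of print_lhs:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Return TEXT preceded by WIDTH spaces, using the "%*s" conversion
// with an empty string to produce the padding. Output longer than the
// local buffer is truncated.
inline std::string indent(int width, const std::string& text)
{
  char buf[256];
  std::snprintf(buf, sizeof buf, "%*s%s", width, "", text.c_str());
  return buf;
}
```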
carets: properly display when no line feed is present
* src/location.c (location_caret): Finish the line with a line feed whether
or not one is present in the input. Rewrite the code without getline.
(cleanup_caret): Reset the caret_info global.
* bootstrap.conf: No longer require getline.
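Reading the quoted line one character at a time makes it easy to always end it with a line feed. The sketch below uses a C++ stream and a hypothetical function name; it is not the code of location_caret.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Copy one line from IN to the result, character by character, and
// always finish it with a line feed, even if the input ends without one.
inline std::string read_line_with_newline(std::istream& in)
{
  std::string res;
  char c;
  while (in.get(c) && c != '\n')
    res += c;
  res += '\n';
  return res;
}
```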
Unput was no longer used since a POSIX-compatibility issue with Flex 2.5.31,
which has been addressed in newer versions of Flex. See this discussion:
<http://lists.gnu.org/archive/html/bug-bison/2003-04/msg00029.html>