X-Git-Url: https://git.saurik.com/bison.git/blobdiff_plain/12545799f9baa195153eea94f4b48bab1a072072..f7ab6a5010b6cac6eaa6b6d9e54168764bffed7a:/doc/bison.texinfo diff --git a/doc/bison.texinfo b/doc/bison.texinfo index 189799a8..a8c60bc4 100644 --- a/doc/bison.texinfo +++ b/doc/bison.texinfo @@ -243,6 +243,8 @@ Parser C-Language Interface which reads tokens. * Error Reporting:: You must supply a function @code{yyerror}. * Action Features:: Special features for use in actions. +* Internationalization:: How to let the parser speak in the user's + native language. The Lexical Analyzer Function @code{yylex} @@ -1187,14 +1189,7 @@ start with a function called @code{main}; you have to provide this, and arrange for it to call @code{yyparse} or the parser will never run. @xref{Interface, ,Parser C-Language Interface}. -If your code defines a C preprocessor macro @code{_} (a single -underscore), Bison assumes that it can be used to translate -English-language strings to the user's preferred language using a -function-like syntax, e.g., @code{_("syntax error")}. Otherwise, -Bison defines a no-op macro by that name that merely returns its -argument, so strings are not translated. - -Aside from @code{_} and the token type names and the symbols in the actions you +Aside from the token type names and the symbols in the actions you write, all symbols defined in the Bison parser file itself begin with @samp{yy} or @samp{YY}. This includes interface functions such as the lexical analyzer function @code{yylex}, the error reporting @@ -4250,6 +4245,8 @@ in the grammar file, you are likely to run into trouble. which reads tokens. * Error Reporting:: You must supply a function @code{yyerror}. * Action Features:: Special features for use in actions. +* Internationalization:: How to let the parser speak in the user's + native language. @end menu @node Parser Function @@ -4812,6 +4809,84 @@ of the @var{n}th component of the current rule. @xref{Locations, , Tracking Locations}. @end deffn +@node Internationalization +@section Parser Internationalization +@cindex internationalization +@cindex i18n +@cindex NLS +@cindex gettext +@cindex bison-po + +A Bison-generated parser can print diagnostics, including error and +tracing messages. By default, they appear in English. However, Bison +also supports outputting diagnostics in the user's native language. +To make this work, the user should set the usual environment +variables. @xref{Using gettextized software, , User influence on +@code{gettext}, libc, The GNU C Library Reference Manual}. For +example, the shell command @samp{export LC_ALL=fr_CA.UTF-8} might set +the user's locale to French Canadian using the @acronym{UTF}-8 +encoding. The exact set of available locales depends on the user's +installation. + +The maintainer of a package that uses a Bison-generated parser enables +the internationalization of the parser's output through the following +steps. Here we assume a package that uses @acronym{GNU} Autoconf and +@acronym{GNU} Automake. + +@enumerate +@item +Into the directory containing the @acronym{GNU} Autoconf macros used +by the package---often called @file{m4}---copy the +@file{bison-i18n.m4} file installed by Bison under +@samp{share/aclocal/bison-i18n.m4} in Bison's installation directory. +For example: + +@example +cp /usr/local/share/aclocal/bison-i18n.m4 m4/bison-i18n.m4 +@end example + +@item +In the top-level @file{configure.ac}, after the @code{AM_GNU_GETTEXT} +invocation, add an invocation of @code{BISON_I18N}. This macro is +defined in the file @file{bison-i18n.m4} that you copied earlier. It +causes @samp{configure} to find the value of the +@code{BISON_LOCALEDIR} variable. + +@item +In the @code{main} function of your program, designate the directory +containing Bison's runtime message catalog, through a call to +@samp{bindtextdomain} with domain name @samp{bison-runtime}. +For example: + +@example +bindtextdomain ("bison-runtime", BISON_LOCALEDIR); +@end example + +Typically this appears after any other call @code{bindtextdomain +(PACKAGE, LOCALEDIR)} that your package already has. Here we rely on +@samp{BISON_LOCALEDIR} to be defined as a string through the +@file{Makefile}. + +@item +In the @file{Makefile.am} that controls the compilation of the @code{main} +function, make @samp{BISON_LOCALEDIR} available as a C preprocessor macro, +either in @samp{DEFS} or in @samp{AM_CPPFLAGS}. For example: + +@example +DEFS = @@DEFS@@ -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"' +@end example + +or: + +@example +AM_CPPFLAGS = -DBISON_LOCALEDIR='"$(BISON_LOCALEDIR)"' +@end example + +@item +Finally, invoke the command @command{autoreconf} to generate the build +infrastructure. +@end enumerate + @node Algorithm @chapter The Bison Parser Algorithm @@ -5496,6 +5571,13 @@ return_spec: ; @end example +For a more detailed exposition of @acronym{LALR}(1) parsers and parser +generators, please see: +Frank DeRemer and Thomas Pennello, Efficient Computation of +@acronym{LALR}(1) Look-Ahead Sets, @cite{@acronym{ACM} Transactions on +Programming Languages and Systems}, Vol.@: 4, No.@: 4 (October 1982), +pp.@: 615--649 @uref{http://doi.acm.org/10.1145/69622.357187}. + @node Generalized LR Parsing @section Generalized @acronym{LR} (@acronym{GLR}) Parsing @cindex @acronym{GLR} parsing @@ -6568,6 +6650,9 @@ Print a summary of the command-line options to Bison and exit. @itemx --version Print the version number of Bison and exit. +@item --print-localedir +Print the name of the directory containing locale-dependent data. + @need 1750 @item -y @itemx --yacc @@ -6707,6 +6792,7 @@ the corresponding short option. \line{ --no-lines \leaderfill -l} \line{ --no-parser \leaderfill -n} \line{ --output \leaderfill -o} +\line{ --print-localedir} \line{ --token-table \leaderfill -k} \line{ --verbose \leaderfill -v} \line{ --version \leaderfill -V} @@ -6725,6 +6811,7 @@ the corresponding short option. --no-lines -l --no-parser -n --output=@var{outfile} -o @var{outfile} +--print-localedir --token-table -k --verbose -v --version -V @@ -6785,7 +6872,7 @@ int yyparse (void); @c - Always pure @c - initial action -The C++ parser LALR(1) skeleton is named @file{lalr1.cc}. To select +The C++ parser @acronym{LALR}(1) skeleton is named @file{lalr1.cc}. To select it, you may either pass the option @option{--skeleton=lalr1.cc} to Bison, or include the directive @samp{%skeleton "lalr1.cc"} in the grammar preamble. When run, @command{bison} will create several @@ -6928,11 +7015,10 @@ this class is detailled below. It can be extended using the it describes an additional member of the parser class, and an additional argument for its constructor. -@deftypemethod {parser} {semantic_value_type} -@deftypemethodx {parser} {location_value_type} +@defcv {Type} {parser} {semantic_value_type} +@defcvx {Type} {parser} {location_value_type} The types for semantics value and locations. -@c FIXME: deftypemethod pour des types ??? -@end deftypemethod +@end defcv @deftypemethod {parser} {} parser (@var{type1} @var{arg1}, ...) Build a new parser object. There are no arguments by default, unless @@ -7032,6 +7118,7 @@ The declaration of this driver class, @file{calc++-driver.hh}, is as follows. The first part includes the CPP guard and imports the required standard library components. +@comment file: calc++-driver.hh @example #ifndef CALCXX_DRIVER_HH # define CALCXX_DRIVER_HH @@ -7046,10 +7133,15 @@ do. Because the driver's declaration is the one that will be imported by the rest of the project, it is saner to forward declare the parser's information here. +@comment file: calc++-driver.hh @example // Forward declarations. union YYSTYPE; -namespace yy @{ class calcxx_parser; @} +namespace yy +@{ + class location; + class calcxx_parser; +@} class calcxx_driver; @end example @@ -7058,9 +7150,11 @@ Then comes the declaration of the scanning function. Flex expects the signature of @code{yylex} to be defined in the macro @code{YY_DECL}, and the C++ parser expects it to be declared. We can factor both as follows. + +@comment file: calc++-driver.hh @example // Announce to Flex the prototype we want for lexing function, ... -# define YY_DECL \ +# define YY_DECL \ int yylex (YYSTYPE* yylval, yy::location* yylloc, calcxx_driver& driver) // ... and declare it for the parser's sake. YY_DECL; @@ -7070,6 +7164,7 @@ YY_DECL; The @code{calcxx_driver} class is then declared with its most obvious members. +@comment file: calc++-driver.hh @example // Conducting the whole scanning and parsing of Calc++. class calcxx_driver @@ -7088,6 +7183,7 @@ To encapsulate the coordination with the Flex scanner, it is useful to have two members function to open and close the scanning phase. members. +@comment file: calc++-driver.hh @example // Handling the scanner. void scan_begin (); @@ -7098,6 +7194,7 @@ members. @noindent Similarly for the parser itself. +@comment file: calc++-driver.hh @example // Handling the parser. void parse (const std::string& f); @@ -7111,6 +7208,7 @@ dumping them on the standard error output, we will pass them to the compiler driver using the following two member functions. Finally, we close the class declaration and CPP guard. +@comment file: calc++-driver.hh @example // Error handling. void error (const yy::location& l, const std::string& m); @@ -7124,6 +7222,7 @@ member function deserves some attention. The @code{error} functions are simple stubs, they should actually register the located error messages and set error state. +@comment file: calc++-driver.cc @example #include "calc++-driver.hh" #include "calc++-parser.hh" @@ -7170,6 +7269,8 @@ The parser definition file @file{calc++-parser.yy} starts by asking for the C++ skeleton, the creation of the parser header file, and specifies the name of the parser class. It then includes the required headers. + +@comment file: calc++-parser.yy @example %skeleton "lalr1.cc" /* -*- C++ -*- */ %define "parser_class_name" "calcxx_parser" @@ -7185,6 +7286,7 @@ The driver is passed by reference to the parser and to the scanner. This provides a simple but effective pure interface, not relying on global variables. +@comment file: calc++-parser.yy @example // The parsing context. %parse-param @{ calcxx_driver& driver @} @@ -7197,6 +7299,7 @@ first location's file name. Afterwards new locations are computed relatively to the previous locations: the file name will be automatically propagated. +@comment file: calc++-parser.yy @example %locations %initial-action @@ -7210,6 +7313,7 @@ automatically propagated. Use the two following directives to enable parser tracing and verbose error messages. +@comment file: calc++-parser.yy @example %debug %error-verbose @@ -7219,6 +7323,7 @@ error messages. Semantic values cannot use ``real'' objects, but only pointers to them. +@comment file: calc++-parser.yy @example // Symbols. %union @@ -7235,6 +7340,7 @@ of ``$end''. Similarly user friendly named are provided for each symbol. Note that the tokens names are prefixed by @code{TOKEN_} to avoid name clashes. +@comment file: calc++-parser.yy @example %token YYEOF 0 "end of file" %token TOKEN_ASSIGN ":=" @@ -7247,6 +7353,7 @@ avoid name clashes. To enable memory deallocation during error recovery, use @code{%destructor}. +@comment file: calc++-parser.yy @example %printer @{ debug_stream () << *$$; @} "identifier" %destructor @{ delete $$; @} "identifier" @@ -7257,6 +7364,7 @@ To enable memory deallocation during error recovery, use @noindent The grammar itself is straightforward. +@comment file: calc++-parser.yy @example %% %start unit; @@ -7282,9 +7390,11 @@ exp: exp '+' exp @{ $$ = $1 + $3; @} Finally the @code{error} member function registers the errors to the driver. +@comment file: calc++-parser.yy @example void -yy::calcxx_parser::error (const location_type& l, const std::string& m) +yy::calcxx_parser::error (const yy::calcxx_parser::location_type& l, + const std::string& m) @{ driver.error (l, m); @} @@ -7296,6 +7406,7 @@ yy::calcxx_parser::error (const location_type& l, const std::string& m) The Flex scanner first includes the driver declaration, then the parser's to get the set of defined tokens. +@comment file: calc++-scanner.ll @example %@{ /* -*- C++ -*- */ # include @@ -7310,6 +7421,7 @@ Because there is no @code{#include}-like feature we don't need actual file, this is not an interactive session with the user. Finally we enable the scanner tracing features. +@comment file: calc++-scanner.ll @example %option noyywrap nounput batch debug @end example @@ -7317,6 +7429,7 @@ Finally we enable the scanner tracing features. @noindent Abbreviations allow for more readable rules. +@comment file: calc++-scanner.ll @example id [a-zA-Z][a-zA-Z_0-9]* int [0-9]+ @@ -7332,11 +7445,14 @@ cursor is adjusted, and each time blanks are matched, the begin cursor is moved onto the end cursor to effectively ignore the blanks preceding tokens. Comments would be treated equally. +@comment file: calc++-scanner.ll @example +%@{ +# define YY_USER_ACTION yylloc->columns (yyleng); +%@} %% %@{ yylloc->step (); -# define YY_USER_ACTION yylloc->columns (yyleng); %@} @{blank@}+ yylloc->step (); [\n]+ yylloc->lines (yyleng); yylloc->step (); @@ -7346,6 +7462,7 @@ preceding tokens. Comments would be treated equally. The rules are simple, just note the use of the driver to report errors. +@comment file: calc++-scanner.ll @example [-+*/] return yytext[0]; ":=" return TOKEN_ASSIGN; @@ -7359,6 +7476,7 @@ errors. Finally, because the scanner related driver's member function depend on the scanner's data, it is simpler to implement them in this file. +@comment file: calc++-scanner.ll @example void calcxx_driver::scan_begin () @@ -7380,6 +7498,7 @@ calcxx_driver::scan_end () The top level file, @file{calc++.cc}, poses no problem. +@comment file: calc++.cc @example #include #include "calc++-driver.hh"