removed more unneeded files, see patch 890642

author Václav Slavík <vslavik@fastmail.fm>

Mon, 8 Mar 2004 09:41:41 +0000 (09:41 +0000)

committer Václav Slavík <vslavik@fastmail.fm>

Mon, 8 Mar 2004 09:41:41 +0000 (09:41 +0000)
author Václav Slavík <vslavik@fastmail.fm>
Mon, 8 Mar 2004 09:41:41 +0000 (09:41 +0000)
committer Václav Slavík <vslavik@fastmail.fm>
Mon, 8 Mar 2004 09:41:41 +0000 (09:41 +0000)
diff --git a/src/regex/Makefile b/src/regex/Makefile

deleted file mode 100644 (file)

index cb76d37..0000000
--- a/src/regex/Makefile
+++ /dev/null
@@ -1,28 +0,0 @@
-#-------------------------------------------------------------------------
-#
-# Makefile--
-#    Makefile for backend/regex
-#
-# IDENTIFICATION
-#    $Header: /projects/cvsroot/pgsql-server/src/backend/regex/Makefile,v 1.20 2003/02/05 17:41:32 tgl Exp $
-#
-#-------------------------------------------------------------------------
-
-subdir = src/backend/regex
-top_builddir = ../../..
-include $(top_builddir)/src/Makefile.global
-
-OBJS = regcomp.o regerror.o regexec.o regfree.o
-
-all: SUBSYS.o
-
-SUBSYS.o: $(OBJS)
-       $(LD) $(LDREL) $(LDOUT) SUBSYS.o $(OBJS)
-
-# mark inclusion dependencies between .c files explicitly
-regcomp.o: regcomp.c regc_lex.c regc_color.c regc_nfa.c regc_cvec.c regc_locale.c
-
-regexec.o: regexec.c rege_dfa.c
-
-clean: 
-       rm -f SUBSYS.o $(OBJS)
diff --git a/src/regex/WHATSNEW b/src/regex/WHATSNEW

deleted file mode 100644 (file)

index 1295343..0000000
--- a/src/regex/WHATSNEW
+++ /dev/null
@@ -1,108 +0,0 @@
-New in alpha3.8:  Bug fix for signed/unsigned mixup, found and fixed
-by the FreeBSD folks.
-
-New in alpha3.7:  A bit of cleanup aimed at maximizing portability,
-possibly at slight cost in efficiency.  "ul" suffixes and "unsigned long"
-no longer appear, in particular.
-
-New in alpha3.6:  A couple more portability glitches fixed.
-
-New in alpha3.5:  Active development of this code has been stopped --
-I'm working on a complete reimplementation -- but folks have found some
-minor portability glitches and the like, hence this release to fix them.
-One penalty:  slightly reduced compatibility with old compilers, because
-the ANSI C `unsigned long' type and `ul' constant suffix are used in a
-few places (I could avoid this but it would be considerably more work).
-
-New in alpha3.4:  The complex bug alluded to below has been fixed (in a
-slightly kludgey temporary way that may hurt efficiency a bit; this is
-another "get it out the door for 4.4" release).  The tests at the end of
-the tests file have accordingly been uncommented.  The primary sign of
-the bug was that something like a?b matching ab matched b rather than ab.
-(The bug was essentially specific to this exact situation, else it would
-have shown up earlier.)
-
-New in alpha3.3:  The definition of word boundaries has been altered
-slightly, to more closely match the usual programming notion that "_"
-is an alphabetic.  Stuff used for pre-ANSI systems is now in a subdir,
-and the makefile no longer alludes to it in mysterious ways.  The
-makefile has generally been cleaned up some.  Fixes have been made
-(again!) so that the regression test will run without -DREDEBUG, at
-the cost of weaker checking.  A workaround for a bug in some folks'
-<assert.h> has been added.  And some more things have been added to
-tests, including a couple right at the end which are commented out
-because the code currently flunks them (complex bug; fix coming).
-Plus the usual minor cleanup.
-
-New in alpha3.2:  Assorted bits of cleanup and portability improvement
-(the development base is now a BSDI system using GCC instead of an ancient
-Sun system, and the newer compiler exposed some glitches).  Fix for a
-serious bug that affected REs using many [] (including REG_ICASE REs
-because of the way they are implemented), *sometimes*, depending on
-memory-allocation patterns.  The header-file prototypes no longer name
-the parameters, avoiding possible name conflicts.  The possibility that
-some clot has defined CHAR_MIN as (say) `-128' instead of `(-128)' is
-now handled gracefully.  "uchar" is no longer used as an internal type
-name (too many people have the same idea).  Still the same old lousy
-performance, alas.
-
-New in alpha3.1:  Basically nothing, this release is just a bookkeeping
-convenience.  Stay tuned.
-
-New in alpha3.0:  Performance is no better, alas, but some fixes have been
-made and some functionality has been added.  (This is basically the "get
-it out the door in time for 4.4" release.)  One bug fix:  regfree() didn't
-free the main internal structure (how embarrassing).  It is now possible
-to put NULs in either the RE or the target string, using (resp.) a new
-REG_PEND flag and the old REG_STARTEND flag.  The REG_NOSPEC flag to
-regcomp() makes all characters ordinary, so you can match a literal
-string easily (this will become more useful when performance improves!).
-There are now primitives to match beginnings and ends of words, although
-the syntax is disgusting and so is the implementation.  The REG_ATOI
-debugging interface has changed a bit.  And there has been considerable
-internal cleanup of various kinds.
-
-New in alpha2.3:  Split change list out of README, and moved flags notes
-into Makefile.  Macro-ized the name of regex(7) in regex(3), since it has
-to change for 4.4BSD.  Cleanup work in engine.c, and some new regression
-tests to catch tricky cases thereof.
-
-New in alpha2.2:  Out-of-date manpages updated.  Regerror() acquires two
-small extensions -- REG_ITOA and REG_ATOI -- which avoid debugging kludges
-in my own test program and might be useful to others for similar purposes.
-The regression test will now compile (and run) without REDEBUG.  The
-BRE \$ bug is fixed.  Most uses of "uchar" are gone; it's all chars now.
-Char/uchar parameters are now written int/unsigned, to avoid possible
-portability problems with unpromoted parameters.  Some unsigned casts have
-been introduced to minimize portability problems with shifting into sign
-bits.
-
-New in alpha2.1:  Lots of little stuff, cleanup and fixes.  The one big
-thing is that regex.h is now generated, using mkh, rather than being
-supplied in the distribution; due to circularities in dependencies,
-you have to build regex.h explicitly by "make h".  The two known bugs
-have been fixed (and the regression test now checks for them), as has a
-problem with assertions not being suppressed in the absence of REDEBUG.
-No performance work yet.
-
-New in alpha2:  Backslash-anything is an ordinary character, not an
-error (except, of course, for the handful of backslashed metacharacters
-in BREs), which should reduce script breakage.  The regression test
-checks *where* null strings are supposed to match, and has generally
-been tightened up somewhat.  Small bug fixes in parameter passing (not
-harmful, but technically errors) and some other areas.  Debugging
-invoked by defining REDEBUG rather than not defining NDEBUG.
-
-New in alpha+3:  full prototyping for internal routines, using a little
-helper program, mkh, which extracts prototypes given in stylized comments.
-More minor cleanup.  Buglet fix:  it's CHAR_BIT, not CHAR_BITS.  Simple
-pre-screening of input when a literal string is known to be part of the
-RE; this does wonders for performance.
-
-New in alpha+2:  minor bits of cleanup.  Notably, the number "32" for the
-word width isn't hardwired into regexec.c any more, the public header
-file prototypes the functions if __STDC__ is defined, and some small typos
-in the manpages have been fixed.
-
-New in alpha+1:  improvements to the manual pages, and an important
-extension, the REG_STARTEND option to regexec().
diff --git a/src/regex/engine.ih b/src/regex/engine.ih

deleted file mode 100644 (file)

index cc98334..0000000
--- a/src/regex/engine.ih
+++ /dev/null
@@ -1,35 +0,0 @@
-/* ========= begin header generated by ./mkh ========= */
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-/* === engine.c === */
-static int matcher(register struct re_guts *g, char *string, size_t nmatch, regmatch_t pmatch[], int eflags);
-static char *dissect(register struct match *m, char *start, char *stop, sopno startst, sopno stopst);
-static char *backref(register struct match *m, char *start, char *stop, sopno startst, sopno stopst, sopno lev);
-static char *fast(register struct match *m, char *start, char *stop, sopno startst, sopno stopst);
-static char *slow(register struct match *m, char *start, char *stop, sopno startst, sopno stopst);
-static states step(register struct re_guts *g, sopno start, sopno stop, register states bef, int ch, register states aft);
-#define        BOL     (OUT+1)
-#define        EOL     (BOL+1)
-#define        BOLEOL  (BOL+2)
-#define        NOTHING (BOL+3)
-#define        BOW     (BOL+4)
-#define        EOW     (BOL+5)
-#define        CODEMAX (BOL+5)         /* highest code used */
-#define        NONCHAR(c)      ((c) > CHAR_MAX)
-#define        NNONCHAR        (CODEMAX-CHAR_MAX)
-#ifdef REDEBUG
-static void print(struct match *m, char *caption, states st, int ch, FILE *d);
-#endif
-#ifdef REDEBUG
-static void at(struct match *m, char *title, char *start, char *stop, sopno startst, sopno stopst);
-#endif
-#ifdef REDEBUG
-static char *pchar(int ch);
-#endif
-
-#ifdef __cplusplus
-}
-#endif
-/* ========= end header generated by ./mkh ========= */
diff --git a/src/regex/mkh b/src/regex/mkh

deleted file mode 100644 (file)

index 252b246..0000000
--- a/src/regex/mkh
+++ /dev/null
@@ -1,76 +0,0 @@
-#! /bin/sh
-# mkh - pull headers out of C source
-PATH=/bin:/usr/bin ; export PATH
-
-# egrep pattern to pick out marked lines
-egrep='^ =([   ]|$)'
-
-# Sed program to process marked lines into lines for the header file.
-# The markers have already been removed.  Two things are done here:  removal
-# of backslashed newlines, and some fudging of comments.  The first is done
-# because -o needs to have prototypes on one line to strip them down.
-# Getting comments into the output is tricky; we turn C++-style // comments
-# into /* */ comments, after altering any existing */'s to avoid trouble.
-peel=' /\\$/N
-       /\\\n[  ]*/s///g
-       /\/\//s;\*/;* /;g
-       /\/\//s;//\(.*\);/*\1 */;'
-
-for a
-do
-       case "$a" in
-       -o)     # old (pre-function-prototype) compiler
-               # add code to comment out argument lists
-               peel="$peel
-                       "'/^\([^#\/][^\/]*[a-zA-Z0-9_)]\)(\(.*\))/s;;\1(/*\2*/);'
-               shift
-               ;;
-       -b)     # funny Berkeley __P macro
-               peel="$peel
-                       "'/^\([^#\/][^\/]*[a-zA-Z0-9_)]\)(\(.*\))/s;;\1 __P((\2));'
-               shift
-               ;;
-       -s)     # compiler doesn't like `static foo();'
-               # add code to get rid of the `static'
-               peel="$peel
-                       "'/^static[     ][^\/]*[a-zA-Z0-9_)](.*)/s;static.;;'
-               shift
-               ;;
-       -p)     # private declarations
-               egrep='^ ==([   ]|$)'
-               shift
-               ;;
-       -i)     # wrap in #ifndef, argument is name
-               ifndef="$2"
-               shift ; shift
-               ;;
-       *)      break
-               ;;
-       esac
-done
-
-if test " $ifndef" != " "
-then
-       echo "#ifndef $ifndef"
-       echo "#define   $ifndef /* never again */"
-fi
-echo "/* ========= begin header generated by $0 ========= */"
-echo '#ifdef __cplusplus'
-echo 'extern "C" {'
-echo '#endif'
-for f
-do
-       echo
-       echo "/* === $f === */"
-       egrep "$egrep" $f | sed 's/^ ==*[       ]//;s/^ ==*$//' | sed "$peel"
-       echo
-done
-echo '#ifdef __cplusplus'
-echo '}'
-echo '#endif'
-echo "/* ========= end header generated by $0 ========= */"
-if test " $ifndef" != " "
-then
-       echo "#endif"
-fi
-exit 0
diff --git a/src/regex/regcomp.ih b/src/regex/regcomp.ih

deleted file mode 100644 (file)

index 6efafeb..0000000
--- a/src/regex/regcomp.ih
+++ /dev/null
@@ -1,53 +0,0 @@
-/* ========= begin header generated by ./mkh ========= */
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-/* === regcomp.c === */
-static void p_ere(register struct parse *p, int stop);
-static void p_ere_exp(register struct parse *p);
-static void p_str(register struct parse *p);
-static void p_bre(register struct parse *p, register int end1, register int end2);
-static int p_simp_re(register struct parse *p, int starordinary);
-static int p_count(register struct parse *p);
-static void p_bracket(register struct parse *p);
-static void p_b_term(register struct parse *p, register cset *cs);
-static void p_b_cclass(register struct parse *p, register cset *cs);
-static void p_b_eclass(register struct parse *p, register cset *cs);
-static char p_b_symbol(register struct parse *p);
-static char p_b_coll_elem(register struct parse *p, int endc);
-static char othercase(int ch);
-static void bothcases(register struct parse *p, int ch);
-static void ordinary(register struct parse *p, register int ch);
-static void nonnewline(register struct parse *p);
-static void repeat(register struct parse *p, sopno start, int from, int to);
-static int seterr(register struct parse *p, int e);
-static cset *allocset(register struct parse *p);
-static void freeset(register struct parse *p, register cset *cs);
-static int freezeset(register struct parse *p, register cset *cs);
-static int firstch(register struct parse *p, register cset *cs);
-static int nch(register struct parse *p, register cset *cs);
-static void mcadd(register struct parse *p, register cset *cs, register char *cp);
-#if 0
-static void mcsub(register cset *cs, register char *cp);
-static int mcin(register cset *cs, register char *cp);
-static char *mcfind(register cset *cs, register char *cp);
-#endif
-static void mcinvert(register struct parse *p, register cset *cs);
-static void mccase(register struct parse *p, register cset *cs);
-static int isinsets(register struct re_guts *g, int c);
-static int samesets(register struct re_guts *g, int c1, int c2);
-static void categorize(struct parse *p, register struct re_guts *g);
-static sopno dupl(register struct parse *p, sopno start, sopno finish);
-static void doemit(register struct parse *p, sop op, size_t opnd);
-static void doinsert(register struct parse *p, sop op, size_t opnd, sopno pos);
-static void dofwd(register struct parse *p, sopno pos, sop value);
-static void enlarge(register struct parse *p, sopno size);
-static void stripsnug(register struct parse *p, register struct re_guts *g);
-static void findmust(register struct parse *p, register struct re_guts *g);
-static sopno pluscount(register struct parse *p, register struct re_guts *g);
-
-#ifdef __cplusplus
-}
-#endif
-/* ========= end header generated by ./mkh ========= */
diff --git a/src/regex/regerror.ih b/src/regex/regerror.ih

deleted file mode 100644 (file)

index 2cb668c..0000000
--- a/src/regex/regerror.ih
+++ /dev/null
@@ -1,12 +0,0 @@
-/* ========= begin header generated by ./mkh ========= */
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-/* === regerror.c === */
-static char *regatoi(const regex_t *preg, char *localbuf);
-
-#ifdef __cplusplus
-}
-#endif
-/* ========= end header generated by ./mkh ========= */
diff --git a/src/regex/regex.3 b/src/regex/regex.3

deleted file mode 100644 (file)

index bc74709..0000000
--- a/src/regex/regex.3
+++ /dev/null
@@ -1,509 +0,0 @@
-.TH REGEX 3 "25 Sept 1997"
-.BY "Henry Spencer"
-.de ZR
-.\" one other place knows this name:  the SEE ALSO section
-.IR regex (7) \\$1
-..
-.SH NAME
-regcomp, regexec, regerror, regfree \- regular-expression library
-.SH SYNOPSIS
-.ft B
-.\".na
-#include <sys/types.h>
-.br
-#include <regex.h>
-.HP 10
-int regcomp(regex_t\ *preg, const\ char\ *pattern, int\ cflags);
-.HP
-int\ regexec(const\ regex_t\ *preg, const\ char\ *string,
-size_t\ nmatch, regmatch_t\ pmatch[], int\ eflags);
-.HP
-size_t\ regerror(int\ errcode, const\ regex_t\ *preg,
-char\ *errbuf, size_t\ errbuf_size);
-.HP
-void\ regfree(regex_t\ *preg);
-.\".ad
-.ft
-.SH DESCRIPTION
-These routines implement POSIX 1003.2 regular expressions (``RE''s);
-see
-.ZR .
-.I Regcomp
-compiles an RE written as a string into an internal form,
-.I regexec
-matches that internal form against a string and reports results,
-.I regerror
-transforms error codes from either into human-readable messages,
-and
-.I regfree
-frees any dynamically-allocated storage used by the internal form
-of an RE.
-.PP
-The header
-.I <regex.h>
-declares two structure types,
-.I regex_t
-and
-.IR regmatch_t ,
-the former for compiled internal forms and the latter for match reporting.
-It also declares the four functions,
-a type
-.IR regoff_t ,
-and a number of constants with names starting with ``REG_''.
-.PP
-.I Regcomp
-compiles the regular expression contained in the
-.I pattern
-string,
-subject to the flags in
-.IR cflags ,
-and places the results in the
-.I regex_t
-structure pointed to by
-.IR preg .
-.I Cflags
-is the bitwise OR of zero or more of the following flags:
-.IP REG_EXTENDED \w'REG_EXTENDED'u+2n
-Compile modern (``extended'') REs,
-rather than the obsolete (``basic'') REs that
-are the default.
-.IP REG_BASIC
-This is a synonym for 0,
-provided as a counterpart to REG_EXTENDED to improve readability.
-This is an extension,
-compatible with but not specified by POSIX 1003.2,
-and should be used with
-caution in software intended to be portable to other systems.
-.IP REG_NOSPEC
-Compile with recognition of all special characters turned off.
-All characters are thus considered ordinary,
-so the ``RE'' is a literal string.
-This is an extension,
-compatible with but not specified by POSIX 1003.2,
-and should be used with
-caution in software intended to be portable to other systems.
-REG_EXTENDED and REG_NOSPEC may not be used
-in the same call to
-.IR regcomp .
-.IP REG_ICASE
-Compile for matching that ignores upper/lower case distinctions.
-See
-.ZR .
-.IP REG_NOSUB
-Compile for matching that need only report success or failure,
-not what was matched.
-.IP REG_NEWLINE
-Compile for newline-sensitive matching.
-By default, newline is a completely ordinary character with no special
-meaning in either REs or strings.
-With this flag,
-`[^' bracket expressions and `.' never match newline,
-a `^' anchor matches the null string after any newline in the string
-in addition to its normal function,
-and the `$' anchor matches the null string before any newline in the
-string in addition to its normal function.
-.IP REG_PEND
-The regular expression ends,
-not at the first NUL,
-but just before the character pointed to by the
-.I re_endp
-member of the structure pointed to by
-.IR preg .
-The
-.I re_endp
-member is of type
-.IR const\ char\ * .
-This flag permits inclusion of NULs in the RE;
-they are considered ordinary characters.
-This is an extension,
-compatible with but not specified by POSIX 1003.2,
-and should be used with
-caution in software intended to be portable to other systems.
-.PP
-When successful,
-.I regcomp
-returns 0 and fills in the structure pointed to by
-.IR preg .
-One member of that structure
-(other than
-.IR re_endp )
-is publicized:
-.IR re_nsub ,
-of type
-.IR size_t ,
-contains the number of parenthesized subexpressions within the RE
-(except that the value of this member is undefined if the
-REG_NOSUB flag was used).
-If
-.I regcomp
-fails, it returns a non-zero error code;
-see DIAGNOSTICS.
-.PP
-.I Regexec
-matches the compiled RE pointed to by
-.I preg
-against the
-.IR string ,
-subject to the flags in
-.IR eflags ,
-and reports results using
-.IR nmatch ,
-.IR pmatch ,
-and the returned value.
-The RE must have been compiled by a previous invocation of
-.IR regcomp .
-The compiled form is not altered during execution of
-.IR regexec ,
-so a single compiled RE can be used simultaneously by multiple threads.
-.PP
-By default,
-the NUL-terminated string pointed to by
-.I string
-is considered to be the text of an entire line,
-with the NUL indicating the end of the line.
-(That is,
-any other end-of-line marker is considered to have been removed
-and replaced by the NUL.)
-The
-.I eflags
-argument is the bitwise OR of zero or more of the following flags:
-.IP REG_NOTBOL \w'REG_STARTEND'u+2n
-The first character of
-the string
-is not the beginning of a line, so the `^' anchor should not match before it.
-This does not affect the behavior of newlines under REG_NEWLINE.
-.IP REG_NOTEOL
-The NUL terminating
-the string
-does not end a line, so the `$' anchor should not match before it.
-This does not affect the behavior of newlines under REG_NEWLINE.
-.IP REG_STARTEND
-The string is considered to start at
-\fIstring\fR\ + \fIpmatch\fR[0].\fIrm_so\fR
-and to have a terminating NUL located at
-\fIstring\fR\ + \fIpmatch\fR[0].\fIrm_eo\fR
-(there need not actually be a NUL at that location),
-regardless of the value of
-.IR nmatch .
-See below for the definition of
-.IR pmatch
-and
-.IR nmatch .
-This is an extension,
-compatible with but not specified by POSIX 1003.2,
-and should be used with
-caution in software intended to be portable to other systems.
-Note that a non-zero \fIrm_so\fR does not imply REG_NOTBOL;
-REG_STARTEND affects only the location of the string,
-not how it is matched.
-.PP
-See
-.ZR
-for a discussion of what is matched in situations where an RE or a
-portion thereof could match any of several substrings of
-.IR string .
-.PP
-Normally,
-.I regexec
-returns 0 for success and the non-zero code REG_NOMATCH for failure.
-Other non-zero error codes may be returned in exceptional situations;
-see DIAGNOSTICS.
-.PP
-If REG_NOSUB was specified in the compilation of the RE,
-or if
-.I nmatch
-is 0,
-.I regexec
-ignores the
-.I pmatch
-argument (but see below for the case where REG_STARTEND is specified).
-Otherwise,
-.I pmatch
-points to an array of
-.I nmatch
-structures of type
-.IR regmatch_t .
-Such a structure has at least the members
-.I rm_so
-and
-.IR rm_eo ,
-both of type
-.I regoff_t
-(a signed arithmetic type at least as large as an
-.I off_t
-and a
-.IR ssize_t ),
-containing respectively the offset of the first character of a substring
-and the offset of the first character after the end of the substring.
-Offsets are measured from the beginning of the
-.I string
-argument given to
-.IR regexec .
-An empty substring is denoted by equal offsets,
-both indicating the character following the empty substring.
-.PP
-The 0th member of the
-.I pmatch
-array is filled in to indicate what substring of
-.I string
-was matched by the entire RE.
-Remaining members report what substring was matched by parenthesized
-subexpressions within the RE;
-member
-.I i
-reports subexpression
-.IR i ,
-with subexpressions counted (starting at 1) by the order of their opening
-parentheses in the RE, left to right.
-Unused entries in the array\(emcorresponding either to subexpressions that
-did not participate in the match at all, or to subexpressions that do not
-exist in the RE (that is, \fIi\fR\ > \fIpreg\fR\->\fIre_nsub\fR)\(emhave both
-.I rm_so
-and
-.I rm_eo
-set to \-1.
-If a subexpression participated in the match several times,
-the reported substring is the last one it matched.
-(Note, as an example in particular, that when the RE `(b*)+' matches `bbb',
-the parenthesized subexpression matches the three `b's and then
-an infinite number of empty strings following the last `b',
-so the reported substring is one of the empties.)
-.PP
-If REG_STARTEND is specified,
-.I pmatch
-must point to at least one
-.I regmatch_t
-(even if
-.I nmatch
-is 0 or REG_NOSUB was specified),
-to hold the input offsets for REG_STARTEND.
-Use for output is still entirely controlled by
-.IR nmatch ;
-if
-.I nmatch
-is 0 or REG_NOSUB was specified,
-the value of
-.IR pmatch [0]
-will not be changed by a successful
-.IR regexec .
-.PP
-.I Regerror
-maps a non-zero
-.I errcode
-from either
-.I regcomp
-or
-.I regexec
-to a human-readable, printable message.
-If
-.I preg
-is non-NULL,
-the error code should have arisen from use of
-the
-.I regex_t
-pointed to by
-.IR preg ,
-and if the error code came from
-.IR regcomp ,
-it should have been the result from the most recent
-.I regcomp
-using that
-.IR regex_t .
-.RI ( Regerror
-may be able to supply a more detailed message using information
-from the
-.IR regex_t .)
-.I Regerror
-places the NUL-terminated message into the buffer pointed to by
-.IR errbuf ,
-limiting the length (including the NUL) to at most
-.I errbuf_size
-bytes.
-If the whole message won't fit,
-as much of it as will fit before the terminating NUL is supplied.
-In any case,
-the returned value is the size of buffer needed to hold the whole
-message (including terminating NUL).
-If
-.I errbuf_size
-is 0,
-.I errbuf
-is ignored but the return value is still correct.
-.PP
-If the
-.I errcode
-given to
-.I regerror
-is first ORed with REG_ITOA,
-the ``message'' that results is the printable name of the error code,
-e.g. ``REG_NOMATCH'',
-rather than an explanation thereof.
-If
-.I errcode
-is REG_ATOI,
-then
-.I preg
-shall be non-NULL and the
-.I re_endp
-member of the structure it points to
-must point to the printable name of an error code;
-in this case, the result in
-.I errbuf
-is the decimal digits of
-the numeric value of the error code
-(0 if the name is not recognized).
-REG_ITOA and REG_ATOI are intended primarily as debugging facilities;
-they are extensions,
-compatible with but not specified by POSIX 1003.2,
-and should be used with
-caution in software intended to be portable to other systems.
-Be warned also that they are considered experimental and changes are possible.
-.PP
-.I Regfree
-frees any dynamically-allocated storage associated with the compiled RE
-pointed to by
-.IR preg .
-The remaining
-.I regex_t
-is no longer a valid compiled RE
-and the effect of supplying it to
-.I regexec
-or
-.I regerror
-is undefined.
-.PP
-None of these functions references global variables except for tables
-of constants;
-all are safe for use from multiple threads if the arguments are safe.
-.SH IMPLEMENTATION CHOICES
-There are a number of decisions that 1003.2 leaves up to the implementor,
-either by explicitly saying ``undefined'' or by virtue of them being
-forbidden by the RE grammar.
-This implementation treats them as follows.
-.PP
-See
-.ZR
-for a discussion of the definition of case-independent matching.
-.PP
-There is no particular limit on the length of REs,
-except insofar as memory is limited.
-Memory usage is approximately linear in RE size, and largely insensitive
-to RE complexity, except for bounded repetitions.
-See BUGS for one short RE using them
-that will run almost any system out of memory.
-.PP
-A backslashed character other than one specifically given a magic meaning
-by 1003.2 (such magic meanings occur only in obsolete [``basic''] REs)
-is taken as an ordinary character.
-.PP
-Any unmatched [ is a REG_EBRACK error.
-.PP
-Equivalence classes cannot begin or end bracket-expression ranges.
-The endpoint of one range cannot begin another.
-.PP
-RE_DUP_MAX, the limit on repetition counts in bounded repetitions, is 255.
-.PP
-A repetition operator (?, *, +, or bounds) cannot follow another
-repetition operator.
-A repetition operator cannot begin an expression or subexpression
-or follow `^' or `|'.
-.PP
-`|' cannot appear first or last in a (sub)expression or after another `|',
-i.e. an operand of `|' cannot be an empty subexpression.
-An empty parenthesized subexpression, `()', is legal and matches an
-empty (sub)string.
-An empty string is not a legal RE.
-.PP
-A `{' followed by a digit is considered the beginning of bounds for a
-bounded repetition, which must then follow the syntax for bounds.
-A `{' \fInot\fR followed by a digit is considered an ordinary character.
-.PP
-`^' and `$' beginning and ending subexpressions in obsolete (``basic'')
-REs are anchors, not ordinary characters.
-.SH SEE ALSO
-grep(1), regex(7)
-.PP
-POSIX 1003.2, sections 2.8 (Regular Expression Notation)
-and
-B.5 (C Binding for Regular Expression Matching).
-.SH DIAGNOSTICS
-Non-zero error codes from
-.I regcomp
-and
-.I regexec
-include the following:
-.PP
-.nf
-.ta \w'REG_ECOLLATE'u+3n
-REG_NOMATCH    regexec() failed to match
-REG_BADPAT     invalid regular expression
-REG_ECOLLATE   invalid collating element
-REG_ECTYPE     invalid character class
-REG_EESCAPE    \e applied to unescapable character
-REG_ESUBREG    invalid backreference number
-REG_EBRACK     brackets [ ] not balanced
-REG_EPAREN     parentheses ( ) not balanced
-REG_EBRACE     braces { } not balanced
-REG_BADBR      invalid repetition count(s) in { }
-REG_ERANGE     invalid character range in [ ]
-REG_ESPACE     ran out of memory
-REG_BADRPT     ?, *, or + operand invalid
-REG_EMPTY      empty (sub)expression
-REG_ASSERT     ``can't happen''\(emyou found a bug
-REG_INVARG     invalid argument, e.g. negative-length string
-.fi
-.SH HISTORY
-Written by Henry Spencer,
-henry@zoo.toronto.edu.
-.SH BUGS
-This is an alpha release with known defects.
-Please report problems.
-.PP
-There is one known functionality bug.
-The implementation of internationalization is incomplete:
-the locale is always assumed to be the default one of 1003.2,
-and only the collating elements etc. of that locale are available.
-.PP
-The back-reference code is subtle and doubts linger about its correctness
-in complex cases.
-.PP
-.I Regexec
-performance is poor.
-This will improve with later releases.
-.I Nmatch
-exceeding 0 is expensive;
-.I nmatch
-exceeding 1 is worse.
-.I Regexec
-is largely insensitive to RE complexity \fIexcept\fR that back
-references are massively expensive.
-RE length does matter; in particular, there is a strong speed bonus
-for keeping RE length under about 30 characters,
-with most special characters counting roughly double.
-.PP
-.I Regcomp
-implements bounded repetitions by macro expansion,
-which is costly in time and space if counts are large
-or bounded repetitions are nested.
-An RE like, say,
-`((((a{1,100}){1,100}){1,100}){1,100}){1,100}'
-will (eventually) run almost any existing machine out of swap space.
-.PP
-There are suspected problems with response to obscure error conditions.
-Notably,
-certain kinds of internal overflow,
-produced only by truly enormous REs or by multiply nested bounded repetitions,
-are probably not handled well.
-.PP
-Due to a mistake in 1003.2, things like `a)b' are legal REs because `)' is
-a special character only in the presence of a previous unmatched `('.
-This can't be fixed until the spec is fixed.
-.PP
-The standard's definition of back references is vague.
-For example, does
-`a\e(\e(b\e)*\e2\e)*d' match `abbbd'?
-Until the standard is clarified,
-behavior in such cases should not be relied on.
-.PP
-The implementation of word-boundary matching is a bit of a kludge,
-and bugs may lurk in combinations of word-boundary matching and anchoring.
author	Václav Slavík <vslavik@fastmail.fm>
	Mon, 8 Mar 2004 09:41:41 +0000 (09:41 +0000)
committer	Václav Slavík <vslavik@fastmail.fm>
	Mon, 8 Mar 2004 09:41:41 +0000 (09:41 +0000)
src/regex/Makefile	[deleted file]	patch \| blob \| blame \| history
src/regex/WHATSNEW	[deleted file]	patch \| blob \| blame \| history
src/regex/engine.ih	[deleted file]	patch \| blob \| blame \| history
src/regex/mkh	[deleted file]	patch \| blob \| blame \| history
src/regex/regcomp.ih	[deleted file]	patch \| blob \| blame \| history
src/regex/regerror.ih	[deleted file]	patch \| blob \| blame \| history
src/regex/regex.3	[deleted file]	patch \| blob \| blame \| history