.\" SUCH DAMAGE.
.\"
.\" @(#)regex.3 8.4 (Berkeley) 3/20/94
-.\" $FreeBSD: src/lib/libc/regex/regex.3,v 1.9 2001/10/01 16:08:58 ru Exp $
+.\" $FreeBSD: src/lib/libc/regex/regex.3,v 1.17 2004/07/12 11:03:42 tjr Exp $
.\"
-.Dd March 20, 1994
+.Dd July 12, 2004
.Dt REGEX 3
.Os
.Sh NAME
.Nm regcomp ,
-.Nm regexec ,
.Nm regerror ,
+.Nm regexec ,
.Nm regfree
.Nd regular-expression library
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
-.In sys/types.h
.In regex.h
.Ft int
-.Fn regcomp "regex_t *preg" "const char *pattern" "int cflags"
-.Ft int
-.Fo regexec
-.Fa "const regex_t *preg" "const char *string"
-.Fa "size_t nmatch" "regmatch_t pmatch[]" "int eflags"
+.Fo regcomp
+.Fa "regex_t *restrict preg"
+.Fa "const char *restrict pattern"
+.Fa "int cflags"
.Fc
.Ft size_t
.Fo regerror
-.Fa "int errcode" "const regex_t *preg"
-.Fa "char *errbuf" "size_t errbuf_size"
+.Fa "int errcode"
+.Fa "const regex_t *restrict preg"
+.Fa "char *restrict errbuf"
+.Fa "size_t errbuf_size"
+.Fc
+.Ft int
+.Fo regexec
+.Fa "const regex_t *restrict preg"
+.Fa "const char *restrict string"
+.Fa "size_t nmatch"
+.Fa "regmatch_t pmatch[restrict]"
+.Fa "int eflags"
.Fc
.Ft void
-.Fn regfree "regex_t *preg"
+.Fo regfree
+.Fa "regex_t *preg"
+.Fc
.Sh DESCRIPTION
These routines implement
.St -p1003.2
.Pq Do RE Dc Ns s ;
see
.Xr re_format 7 .
-.Fn Regcomp
-compiles an RE written as a string into an internal form,
+The
+.Fn regcomp
+function
+compiles an RE, written as a string, into an internal form.
.Fn regexec
-matches that internal form against a string and reports results,
+matches that internal form against a string and reports results.
.Fn regerror
-transforms error codes from either into human-readable messages,
-and
+transforms error codes from either into human-readable messages.
.Fn regfree
frees any dynamically-allocated storage used by the internal form
of an RE.
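The division of labor among the four routines described above can be sketched as follows. This is a minimal illustration, not code from the library or its documentation: the helper name `re_matches` and its return convention are assumptions made for the example, and `REG_NOSUB` is used only because no submatch offsets are needed.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the regex library).
 * Returns 1 if text matches pattern, 0 if it does not, and -1 on error,
 * in which case errbuf holds the message produced by regerror().
 */
int
re_matches(const char *pattern, const char *text,
    char *errbuf, size_t errbuf_size)
{
	regex_t re;
	int rc;

	/* Compile the RE string into its internal form. */
	rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
	if (rc != 0) {
		/* Translate the regcomp() error code into a message. */
		regerror(rc, &re, errbuf, errbuf_size);
		return (-1);
	}

	/* Match the internal form against the string. */
	rc = regexec(&re, text, 0, NULL, 0);
	if (rc != 0 && rc != REG_NOMATCH)
		regerror(rc, &re, errbuf, errbuf_size);

	/* Release storage held by the internal form. */
	regfree(&re);

	if (rc == 0)
		return (1);
	return (rc == REG_NOMATCH ? 0 : -1);
}
```

A caller would pass a buffer for the message, for example `re_matches("^a+b$", "aaab", buf, sizeof(buf))`, and inspect `buf` only when the result is -1.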
.Pp
The header
-.Aq Pa regex.h
+.In regex.h
declares two structure types,
.Ft regex_t
and
and a number of constants with names starting with
.Dq Dv REG_ .
.Pp
-.Fn Regcomp
+The
+.Fn regcomp
+function
compiles the regular expression contained in the
.Fa pattern
string,
.Ft regex_t
structure pointed to by
.Fa preg .
-.Fa Cflags
+The
+.Fa cflags
+argument
is the bitwise OR of zero or more of the following flags:
.Bl -tag -width REG_EXTENDED
.It Dv REG_EXTENDED
see
.Sx DIAGNOSTICS .
.Pp
-.Fn Regexec
+The
+.Fn regexec
+function
matches the compiled RE pointed to by
.Fa preg
against the
will not be changed by a successful
.Fn regexec .
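As a rough illustration of the `nmatch` and `pmatch` arguments from the synopsis, the sketch below copies the text of the first parenthesized subexpression out of a match. The helper name `copy_group1` and its truncation behavior are assumptions for the example; the `rm_so` and `rm_eo` members are the standard POSIX fields of `regmatch_t`, which this excerpt does not itself describe.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the regex library).
 * Copies the text matched by the first parenthesized subexpression
 * into out, truncating if needed. Returns the number of bytes copied,
 * or -1 if the RE is invalid, does not match, or has no such group.
 */
int
copy_group1(const char *pattern, const char *text,
    char *out, size_t out_size)
{
	regex_t re;
	regmatch_t pm[2];	/* pm[0]: whole match, pm[1]: first group */
	int len = -1;

	if (out_size == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
		return (-1);

	/* An unused pmatch entry has both offsets set to -1. */
	if (regexec(&re, text, 2, pm, 0) == 0 && pm[1].rm_so != -1) {
		size_t n = (size_t)(pm[1].rm_eo - pm[1].rm_so);

		if (n >= out_size)
			n = out_size - 1;
		memcpy(out, text + pm[1].rm_so, n);
		out[n] = '\0';
		len = (int)n;
	}

	regfree(&re);
	return (len);
}
```

The offsets in `pmatch` are relative to the start of the string passed to `regexec`, which is why the copy starts at `text + pm[1].rm_so`.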
.Pp
-.Fn Regerror
+The
+.Fn regerror
+function
maps a non-zero
.Fa errcode
from either
.Fn regcomp
using that
.Ft regex_t .
-.No ( Fn Regerror
+.No ( The
+.Fn regerror
+function
may be able to supply a more detailed message using information
from the
.Ft regex_t . )
-.Fn Regerror
+The
+.Fn regerror
+function
places the NUL-terminated message into the buffer pointed to by
.Fa errbuf ,
limiting the length (including the NUL) to at most
caution in software intended to be portable to other systems.
Be warned also that they are considered experimental and changes are possible.
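Setting the experimental extensions aside, the portable part of the `regerror` interface can be used to size the message buffer exactly. The sketch below relies on the POSIX rule that a call with `errbuf_size` of 0 ignores `errbuf` and returns the size needed, including the terminating NUL. The helper name `re_error_message` is an assumption for the example.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the regex library).
 * Returns a malloc()ed copy of the message for errcode, or NULL if
 * memory cannot be allocated. The caller frees the result.
 */
char *
re_error_message(int errcode, const regex_t *preg)
{
	size_t need;
	char *msg;

	/* First call: size 0 means "only report the space required". */
	need = regerror(errcode, preg, NULL, 0);
	if (need == 0)
		return (NULL);

	msg = malloc(need);
	if (msg != NULL) {
		/* Second call: the buffer is large enough for the message. */
		regerror(errcode, preg, msg, need);
	}
	return (msg);
}
```

A fixed-size buffer is usually sufficient in practice; the two-call form is shown only for cases where truncated messages are unacceptable.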
.Pp
-.Fn Regfree
+The
+.Fn regfree
+function
frees any dynamically-allocated storage associated with the compiled RE
pointed to by
.Fa preg .
.Ql |\&
cannot appear first or last in a (sub)expression or after another
.Ql |\& ,
-i.e. an operand of
+i.e., an operand of
.Ql |\&
cannot be an empty subexpression.
An empty parenthesized subexpression,
.Pp
.Bl -tag -width REG_ECOLLATE -compact
.It Dv REG_NOMATCH
+The
.Fn regexec
+function
failed to match
.It Dv REG_BADPAT
invalid regular expression
.It Dv REG_ASSERT
can't happen - you found a bug
.It Dv REG_INVARG
-invalid argument, e.g. negative-length string
+invalid argument, e.g.\& negative-length string
+.It Dv REG_ILLSEQ
+illegal byte sequence (bad multibyte character)
.El
.Sh HISTORY
Originally written by
The back-reference code is subtle and doubts linger about its correctness
in complex cases.
.Pp
-.Fn Regexec
-performance is poor.
+The performance of the
+.Fn regexec
+function is poor.
This will improve with later releases.
-.Fa Nmatch
+The
+.Fa nmatch
+argument
exceeding 0 is expensive;
.Fa nmatch
exceeding 1 is worse.
-.Fn Regexec
+The
+.Fn regexec
+function
is largely insensitive to RE complexity
.Em except
that back
for keeping RE length under about 30 characters,
with most special characters counting roughly double.
.Pp
-.Fn Regcomp
+The
+.Fn regcomp
+function
implements bounded repetitions by macro expansion,
which is costly in time and space if counts are large
or bounded repetitions are nested.