        @(#)README      8.1 (Berkeley) 6/9/93
$FreeBSD: src/usr.bin/compress/doc/README,v 1.3 2002/12/30 21:18:11 schweikh Exp $
 
Compress version 4.0 improvements over 3.0:
        o compress() speedup (10-50%) by changing division hash to xor
        o decompress() speedup (5-10%)
        o Memory requirements reduced (3-30%)
        o Stack requirements reduced to less than 4kb
        o Removed 'Big+Fast' compress code (FBITS) because of compress speedup
        o Portability mods for Z8000 and PC/XT (but not zeus 3.2)
        o Default to 'quiet' mode
        o Unification of 'force' flags
        o Manual page overhaul
        o Portability enhancement for M_XENIX
        o Removed text on #else and #endif
        o Added "-V" switch to print version and options
        o Added #defines for SIGNED_COMPARE_SLOW
        o Added Makefile and "usermem" program
        o Removed all floating point computations
        o New programs: [deleted]
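
The first item above swaps a division-based hash for an xor-based one.  The
following is a minimal sketch of that idea only, not code taken from
compress.c.  The names HSIZE, HSHIFT, hash_div and hash_xor, the table size
and the 12-bit code width are all assumptions chosen for illustration.

```c
/* Hypothetical parameters, chosen only for this illustration. */
#define HSIZE  5003u  /* assumed hash table size (a prime) */
#define HSHIFT 4      /* assumed shift; see hash_xor() below */

/* Division-style hash: build a key from the character and the prefix
 * code (assumed to be at most 12 bits), then reduce it with a modulo. */
static unsigned hash_div(unsigned c, unsigned prefix)
{
    unsigned long key = ((unsigned long)c << 12) + prefix;

    return (unsigned)(key % HSIZE);
}

/* Xor-style hash: one shift and one xor, no division.  For c in 0..255
 * and prefix in 0..4095 the result is below 4096, so it fits in HSIZE. */
static unsigned hash_xor(unsigned c, unsigned prefix)
{
    return (c << HSHIFT) ^ prefix;
}
```

Both functions map a (character, prefix) pair to a table slot.  The xor form
avoids the division, which is the likely source of the speedup on machines
where division is slow.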
 
The "usermem" script attempts to determine the maximum process size.  Some
editing of the script may be necessary (see the comments).  [It should work
fine on 4.3 BSD.]  If you can't get it to work at all, just create a file
named "USERMEM" containing the maximum process size in decimal.
 
The following preprocessor symbols control the compilation of "compress.c":

        o USERMEM               Maximum process memory on the system
        o SACREDMEM             Amount to reserve for other processes
        o SIGNED_COMPARE_SLOW   Unsigned compare instructions are faster
        o NO_UCHAR              Don't use "unsigned char" types
        o BITS                  Overrules default set by USERMEM-SACREDMEM
        o vax                   Generate inline assembler
        o interdata             Defines SIGNED_COMPARE_SLOW
        o M_XENIX               Makes arrays < 65536 bytes each
        o pdp11                 BITS=12, NO_UCHAR
        o BSD4_2                Allow long filenames (> 14 characters) &
                                Call setlinebuf(stderr)
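
As an illustration of how a symbol such as NO_UCHAR can steer compilation,
here is a hedged sketch.  The names char_type, BYTE_MASK and byte_value are
assumptions made for this example and are not quoted from compress.c.

```c
/* Build with -DNO_UCHAR on a compiler that lacks "unsigned char". */
#ifdef NO_UCHAR
typedef char char_type;             /* plain char, possibly signed */
#define BYTE_MASK(x) ((x) & 0xff)   /* masking undoes sign extension */
#else
typedef unsigned char char_type;    /* values are already 0..255 */
#define BYTE_MASK(x) (x)            /* no masking needed */
#endif

/* Return the value of a stored byte as an int in the range 0..255. */
static int byte_value(char_type b)
{
    return BYTE_MASK((int)b);
}
```

Either way byte_value() yields 0..255.  Only the NO_UCHAR build pays for the
extra mask.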
 
The difference "usermem-sacredmem" determines the maximum BITS that can be
specified with the "-b" flag.

The default is BITS=16.

The maximum bits can be overruled by specifying "-DBITS=bits" at
 
WARNING: files compressed on a large machine with more bits than allowed by
a version of compress on a smaller machine cannot be decompressed!  Use the
"-b12" flag to generate a file on a large machine that can be uncompressed
 
The output of compress 4.0 is fully compatible with that of compress 3.0.
In other words, the output of compress 4.0 may be fed into uncompress 3.0 or
the output of compress 3.0 may be fed into uncompress 4.0.

The output of compress 4.0 is not compatible with that of
compress 2.0.  However, compress 4.0 still accepts the output of
compress 2.0.  To generate output that is compatible with compress
2.0, use the undocumented "-C" flag.
 
        -from mod.sources, submitted by vax135!petsd!joe (Joe Orost), 8/1/85
--------------------------------

Enclosed is compress version 3.0 with the following changes:
 
1.      "Block" compression is performed.  After the BITS run out, the
        compression ratio is checked every so often.  If it is decreasing,
        the table is cleared and a new set of substrings is generated.

        This makes the output of compress 3.0 not compatible with that of
        compress 2.0.  However, compress 3.0 still accepts the output of
        compress 2.0.  To generate output that is compatible with compress
        2.0, use the undocumented "-C" flag.

2.      A quiet "-q" flag has been added for use by the news system.

3.      The character chaining has been deleted and the program now uses
        hashing.  This improves the speed of the program, especially
        during decompression.  Other speed improvements have been made,
        such as using putc() instead of fwrite().

4.      A large table is used on large machines when a relatively small
        number of bits is specified.  This saves much time when compressing
        for a 16-bit machine on a 32-bit virtual machine.  Note that the
        speed improvement only occurs when the input file is > 30000
        characters, and the -b BITS is less than or equal to the cutoff
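
The ratio check described in item 1 can be sketched as follows.  This is an
illustration under stated assumptions, not the code from compress.c: the
names ratio_state, CHECK_GAP and should_clear_table, the check interval and
the scaling by 256 are all hypothetical.  The caller is assumed to invoke the
check only once the code table is full.

```c
#define CHECK_GAP 10000L  /* assumed: input bytes between ratio checks */

/* Hypothetical bookkeeping for the periodic check. */
struct ratio_state {
    long checkpoint;   /* input count at which the next check is due */
    long best_ratio;   /* best ratio seen so far, scaled by 256 */
};

/* Return 1 if the string table should be cleared, 0 otherwise.
 * in_count and bytes_out are running totals.  The sketch assumes that
 * in_count * 256 fits in a long. */
static int should_clear_table(struct ratio_state *s,
                              long in_count, long bytes_out)
{
    long ratio;

    if (in_count < s->checkpoint || bytes_out <= 0)
        return 0;                       /* no check due yet */
    s->checkpoint = in_count + CHECK_GAP;

    ratio = (in_count * 256) / bytes_out;   /* integer arithmetic only */
    if (ratio >= s->best_ratio) {
        s->best_ratio = ratio;          /* not decreasing: keep the table */
        return 0;
    }
    s->best_ratio = 0;                  /* decreasing: start afresh */
    return 1;
}
```

When the function returns 1, the compressor would emit whatever marker its
format uses to tell the decompressor to reset, then rebuild the table from
the following input.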
 
Most of these changes were made by James A. Woods (ames!jaw).  Thank you
 
        cc -O -DUSERMEM=usermem -o compress compress.c

Where "usermem" is the amount of physical user memory available (in bytes).
If any physical memory is to be reserved for other processes, put in
"-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified, and the cutoff bits where the large+fast table is used.
 
memory: at least                BITS            cutoff
------  -- -----                ----            ------

The default memory size is 750,000 which gives a maximum BITS=16 and no

The maximum bits can be overruled by specifying "-DBITS=bits" at

If your machine doesn't support unsigned characters, define "NO_UCHAR"

If "int" is 16 bits on your machine, define "SHORT_INT" when compiling.
 
After compilation, move "compress" to a standard executable location, such

        ln compress uncompress
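
A hard link like this is useful when one binary chooses its behaviour from
the name it was invoked under.  Below is a minimal sketch of that technique.
The enum and the helper mode_from_name are hypothetical names made up for
this example and are not taken from compress.c.

```c
#include <string.h>

enum mode { MODE_COMPRESS, MODE_UNCOMPRESS };

/* Pick the mode from the invocation name (normally argv[0]).
 * Only the last path component is examined. */
static enum mode mode_from_name(const char *argv0)
{
    const char *base = strrchr(argv0, '/');

    base = (base != NULL) ? base + 1 : argv0;
    if (strcmp(base, "uncompress") == 0)
        return MODE_UNCOMPRESS;
    return MODE_COMPRESS;           /* any other name compresses */
}
```

A main() would call mode_from_name(argv[0]) once at startup and branch on the
result, so both names share the same executable on disk.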
 
On machines that have a fixed stack size (such as Perkin-Elmer), set the
stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).

Next, install the manual (compress.l).
        cp compress.l /usr/man/manl

        ln compress.l uncompress.l

        cp compress.l /usr/man/man1/compress.1

        ln compress.1 uncompress.1
 
Here is a note from the net:

>From hplabs!pesnta!amd!turtlevax!ken Sat Jan  5 03:35:20 1985
Path: ames!hplabs!pesnta!amd!turtlevax!ken
From: ken@turtlevax.UUCP (Ken Turkowski)
Newsgroups: net.sources
Subject: Re: Compress release 3.0 : sample Makefile
Organization: CADLINC, Inc. @ Menlo Park, CA

In the compress 3.0 source recently posted to mod.sources, there is a
#define variable which can be set for optimum performance on a machine
with a large amount of memory.  A program (usermem) to calculate the
usable amount of physical user memory is enclosed, as well as a sample
4.2BSD Vax Makefile for compress.
 
Here is the README file from the previous version of compress (2.0):

>Enclosed is compress.c version 2.0 with the following bugs fixed:

>1.     The packed files produced by compress are different on different
>       machines and dependent on the vax sysgen option.
>               The bug was in the different byte/bit ordering on the
>               various machines.  This has been fixed.

>               This version is NOT compatible with the original vax posting
>               unless the '-DCOMPATIBLE' option is specified to the C
>               compiler.  The original posting has a bug which I fixed,
>               causing incompatible files.  I recommend you NOT to use this
>               option unless you already have a lot of packed files from
>               the original posting by Thomas.
>2.     The exit status is not well defined (on some machines) causing the

>               The exit status is now 0,1 or 2 and is documented in

>3.     The function getopt() is not available in all C libraries.
>               The function getopt() is no longer referenced by the

>4.     Error status is not being checked on the fwrite() and fflush() calls.
 
>The following enhancements have been made:

>1.     Added facilities of "compact" into the compress program.  "Pack",
>       "Unpack", and "Pcat" are no longer required (no longer supplied).
>2.     Installed work around for C compiler bug with "-O".
>3.     Added a magic number header (\037\235).  Put the bits specified

>4.     Added "-f" flag to force overwrite of output file.
>5.     Added "-c" flag and "zcat" program.  'ln compress zcat' after you

>6.     The 'uncompress' script has been deleted; simply
>       'ln compress uncompress' after you compile and it will work.
>7.     Removed extra bit masking for machines that support unsigned
>       characters.  If your machine doesn't support unsigned characters,
>       define "NO_UCHAR" when compiling.
 
>Compile "compress.c" with "-O -o compress" flags.  Move "compress" to a
>standard executable location, such as /usr/local.  Then:

>       ln compress uncompress

>On machines that have a fixed stack size (such as Perkin-Elmer), set the
>stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).

>Next, install the manual (compress.l).
>       cp compress.l /usr/man/manl             - or -
>       cp compress.l /usr/man/man1/compress.1
 
>Here is the README that I sent with my first posting:

>>Enclosed is a modified version of compress.c, along with scripts to make it
>>run identically to pack(1), unpack(1), and pcat(1).  Here is what I
>>(petsd!joe) and a colleague (petsd!peora!srd) did:

>>1. Removed VAX dependencies.
>>2. Changed the struct to separate arrays; saves mucho memory.
>>3. Did comparisons in unsigned, where possible.  (Faster on Perkin-Elmer.)
>>4. Sorted the character next chain and changed the search to stop
>>prematurely.  This saves a lot on the execution time when compressing.

>>This version is totally compatible with the original version.  Even though
>>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
>>machine, due to the size of the arrays.
 
>>Here is the README file from the original author:

>>>Well, with all this discussion about file compression (for news batching
>>>in particular) going around, I decided to implement the text compression
>>>algorithm described in the June Computer magazine.  The author claimed
>>>blinding speed and good compression ratios.  It's certainly faster than
>>>compact (but, then, what wouldn't be), but it's also the same speed as
>>>pack, and gets better compression than both of them.  On 350K bytes of
>>>Unix-wizards, compact took about 8 minutes of CPU, pack took about 80
>>>seconds, and compress (herein) also took 80 seconds.  But, compact and
>>>pack got about 30% compression, whereas compress got over 50%.  So, I
>>>decided I had something, and that others might be interested, too.

>>>As is probably true of compact and pack (although I haven't checked),
>>>the byte order within a word is probably relevant here, but as long as
>>>you stay on a single machine type, you should be ok.  (Can anybody
>>>elucidate on this?)  There are a couple of asm's in the code (extv and
>>>insv instructions), so anyone porting it to another machine will have to
>>>deal with this anyway (and could probably make it compatible with Vax
>>>byte order at the same time).  Anyway, I've linted the code (both with
>>>and without -p), so it should run elsewhere.  Note the longs in the
>>>code, you can take these out if you reduce BITS to <= 15.

>>>Have fun, and as always, if you make good enhancements, or bug fixes,
>>>I'd like to see them.

>>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
 
>>Full-Name:  Joseph M. Orost
>>UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
>>US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
>>Phone:      (201) 870-5844