file_cmds-188.tar.gz

[apple/file_cmds.git] / compress / doc / README
diff --git a/compress/doc/README b/compress/doc/README

new file mode 100644 (file)

index 0000000..0828cdf
--- /dev/null
+++ b/compress/doc/README
@@ -0,0 +1,284 @@
+
+       @(#)README      8.1 (Berkeley) 6/9/93
+  $FreeBSD: src/usr.bin/compress/doc/README,v 1.3 2002/12/30 21:18:11 schweikh Exp $
+
+Compress version 4.0 improvements over 3.0:
+       o compress() speedup (10-50%) by changing division hash to xor
+       o decompress() speedup (5-10%)
+       o Memory requirements reduced (3-30%)
+       o Stack requirements reduced to less than 4kb
+       o Removed 'Big+Fast' compress code (FBITS) because of compress speedup
+       o Portability mods for Z8000 and PC/XT (but not zeus 3.2)
+       o Default to 'quiet' mode
+       o Unification of 'force' flags
+       o Manual page overhaul
+       o Portability enhancement for M_XENIX
+       o Removed text on #else and #endif
+       o Added "-V" switch to print version and options
+       o Added #defines for SIGNED_COMPARE_SLOW
+       o Added Makefile and "usermem" program
+       o Removed all floating point computations
+       o New programs: [deleted]
+
+The "usermem" script attempts to determine the maximum process size.  Some
+editing of the script may be necessary (see the comments).  [It should work
+fine on 4.3 BSD.] If you can't get it to work at all, just create file
+"USERMEM" containing the maximum process size in decimal.
+
+The following preprocessor symbols control the compilation of "compress.c":
+
+       o USERMEM               Maximum process memory on the system
+       o SACREDMEM             Amount to reserve for other processes
+       o SIGNED_COMPARE_SLOW   Unsigned compare instructions are faster
+       o NO_UCHAR              Don't use "unsigned char" types
+       o BITS                  Overrules default set by USERMEM-SACREDMEM
+       o vax                   Generate inline assembler
+       o interdata             Defines SIGNED_COMPARE_SLOW
+       o M_XENIX               Makes arrays < 65536 bytes each
+       o pdp11                 BITS=12, NO_UCHAR
+       o z8000                 BITS=12
+       o pcxt                  BITS=12
+       o BSD4_2                Allow long filenames ( > 14 characters) &
+                               Call setlinebuf(stderr)
+
+The difference "usermem-sacredmem" determines the maximum BITS that can be
+specified with the "-b" flag.
+
+memory: at least               BITS
+------  -- -----                ----
+     433,484                    16
+     229,600                    15
+     127,536                    14
+      73,464                    13
+           0                    12
+
+The default is BITS=16.
+
+The maximum bits can be overruled by specifying "-DBITS=bits" at
+compilation time.
+
+WARNING: files compressed on a large machine with more bits than allowed by 
+a version of compress on a smaller machine cannot be decompressed!  Use the
+"-b12" flag to generate a file on a large machine that can be uncompressed 
+on a 16-bit machine.
+
+The output of compress 4.0 is fully compatible with that of compress 3.0.
+In other words, the output of compress 4.0 may be fed into uncompress 3.0 or
+the output of compress 3.0 may be fed into uncompress 4.0.
+
+The output of compress 4.0 not compatible with that of
+compress 2.0.  However, compress 4.0 still accepts the output of
+compress 2.0.  To generate output that is compatible with compress
+2.0, use the undocumented "-C" flag.
+
+       -from mod.sources, submitted by vax135!petsd!joe (Joe Orost), 8/1/85
+--------------------------------
+
+Enclosed is compress version 3.0 with the following changes:
+
+1.     "Block" compression is performed.  After the BITS run out, the
+       compression ratio is checked every so often.  If it is decreasing,
+       the table is cleared and a new set of substrings are generated.
+
+       This makes the output of compress 3.0 not compatible with that of
+       compress 2.0.  However, compress 3.0 still accepts the output of
+       compress 2.0.  To generate output that is compatible with compress
+       2.0, use the undocumented "-C" flag.
+
+2.     A quiet "-q" flag has been added for use by the news system.
+
+3.     The character chaining has been deleted and the program now uses
+       hashing.  This improves the speed of the program, especially
+       during decompression.  Other speed improvements have been made,
+       such as using putc() instead of fwrite().
+
+4.     A large table is used on large machines when a relatively small
+       number of bits is specified.  This saves much time when compressing
+       for a 16-bit machine on a 32-bit virtual machine.  Note that the
+       speed improvement only occurs when the input file is > 30000
+       characters, and the -b BITS is less than or equal to the cutoff
+       described below.
+
+Most of these changes were made by James A. Woods (ames!jaw).  Thank you
+James!
+
+To compile compress:
+
+       cc -O -DUSERMEM=usermem -o compress compress.c
+
+Where "usermem" is the amount of physical user memory available (in bytes).  
+If any physical memory is to be reserved for other processes, put in 
+"-DSACREDMEM sacredmem", where "sacredmem" is the amount to be reserved.
+
+The difference "usermem-sacredmem" determines the maximum BITS that can be
+specified, and the cutoff bits where the large+fast table is used.
+
+memory: at least               BITS            cutoff
+------  -- -----                ----            ------
+   4,718,592                    16               13
+   2,621,440                    16               12
+   1,572,864                    16               11
+   1,048,576                    16               10
+     631,808                    16               --
+     329,728                    15               --
+     178,176                    14               --
+      99,328                    13               --
+           0                    12               --
+
+The default memory size is 750,000 which gives a maximum BITS=16 and no
+large+fast table.
+
+The maximum bits can be overruled by specifying "-DBITS=bits" at
+compilation time.
+
+If your machine doesn't support unsigned characters, define "NO_UCHAR" 
+when compiling.
+
+If your machine has "int" as 16-bits, define "SHORT_INT" when compiling.
+
+After compilation, move "compress" to a standard executable location, such 
+as /usr/local.  Then:
+       cd /usr/local
+       ln compress uncompress
+       ln compress zcat
+
+On machines that have a fixed stack size (such as Perkin-Elmer), set the
+stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
+
+Next, install the manual (compress.l).
+       cp compress.l /usr/man/manl
+       cd /usr/man/manl
+       ln compress.l uncompress.l
+       ln compress.l zcat.l
+
+               - or -
+
+       cp compress.l /usr/man/man1/compress.1
+       cd /usr/man/man1
+       ln compress.1 uncompress.1
+       ln compress.1 zcat.1
+
+                                       regards,
+                                       petsd!joe
+
+Here is a note from the net:
+
+>From hplabs!pesnta!amd!turtlevax!ken Sat Jan  5 03:35:20 1985
+Path: ames!hplabs!pesnta!amd!turtlevax!ken
+From: ken@turtlevax.UUCP (Ken Turkowski)
+Newsgroups: net.sources
+Subject: Re: Compress release 3.0 : sample Makefile
+Organization: CADLINC, Inc. @ Menlo Park, CA
+
+In the compress 3.0 source recently posted to mod.sources, there is a
+#define variable which can be set for optimum performance on a machine
+with a large amount of memory.  A program (usermem) to calculate the
+usable amount of physical user memory is enclosed, as well as a sample
+4.2BSD Vax Makefile for compress.
+
+Here is the README file from the previous version of compress (2.0):
+
+>Enclosed is compress.c version 2.0 with the following bugs fixed:
+>
+>1.    The packed files produced by compress are different on different
+>      machines and dependent on the vax sysgen option.
+>              The bug was in the different byte/bit ordering on the
+>              various machines.  This has been fixed.
+>
+>              This version is NOT compatible with the original vax posting
+>              unless the '-DCOMPATIBLE' option is specified to the C
+>              compiler.  The original posting has a bug which I fixed, 
+>              causing incompatible files.  I recommend you NOT to use this
+>              option unless you already have a lot of packed files from
+>              the original posting by Thomas.
+>2.    The exit status is not well defined (on some machines) causing the
+>      scripts to fail.
+>              The exit status is now 0,1 or 2 and is documented in
+>              compress.l.
+>3.    The function getopt() is not available in all C libraries.
+>              The function getopt() is no longer referenced by the
+>              program.
+>4.    Error status is not being checked on the fwrite() and fflush() calls.
+>              Fixed.
+>
+>The following enhancements have been made:
+>
+>1.    Added facilities of "compact" into the compress program.  "Pack",
+>      "Unpack", and "Pcat" are no longer required (no longer supplied).
+>2.    Installed work around for C compiler bug with "-O".
+>3.    Added a magic number header (\037\235).  Put the bits specified
+>      in the file.
+>4.    Added "-f" flag to force overwrite of output file.
+>5.    Added "-c" flag and "zcat" program.  'ln compress zcat' after you
+>      compile.
+>6.    The 'uncompress' script has been deleted; simply 
+>      'ln compress uncompress' after you compile and it will work.
+>7.    Removed extra bit masking for machines that support unsigned
+>      characters.  If your machine doesn't support unsigned characters,
+>      define "NO_UCHAR" when compiling.
+>
+>Compile "compress.c" with "-O -o compress" flags.  Move "compress" to a
+>standard executable location, such as /usr/local.  Then:
+>      cd /usr/local
+>      ln compress uncompress
+>      ln compress zcat
+>
+>On machines that have a fixed stack size (such as Perkin-Elmer), set the
+>stack to at least 12kb.  ("setstack compress 12" on Perkin-Elmer).
+>
+>Next, install the manual (compress.l).
+>      cp compress.l /usr/man/manl             - or -
+>      cp compress.l /usr/man/man1/compress.1
+>
+>Here is the README that I sent with my first posting:
+>
+>>Enclosed is a modified version of compress.c, along with scripts to make it
+>>run identically to pack(1), unpack(1), and pcat(1).  Here is what I
+>>(petsd!joe) and a colleague (petsd!peora!srd) did:
+>>
+>>1. Removed VAX dependencies.
+>>2. Changed the struct to separate arrays; saves mucho memory.
+>>3. Did comparisons in unsigned, where possible.  (Faster on Perkin-Elmer.)
+>>4. Sorted the character next chain and changed the search to stop
+>>prematurely.  This saves a lot on the execution time when compressing.
+>>
+>>This version is totally compatible with the original version.  Even though
+>>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
+>>machine, due to the size of the arrays.
+>>
+>>Here is the README file from the original author:
+>> 
+>>>Well, with all this discussion about file compression (for news batching
+>>>in particular) going around, I decided to implement the text compression
+>>>algorithm described in the June Computer magazine.  The author claimed
+>>>blinding speed and good compression ratios.  It's certainly faster than
+>>>compact (but, then, what wouldn't be), but it's also the same speed as
+>>>pack, and gets better compression than both of them.  On 350K bytes of
+>>>Unix-wizards, compact took about 8 minutes of CPU, pack took about 80
+>>>seconds, and compress (herein) also took 80 seconds.  But, compact and
+>>>pack got about 30% compression, whereas compress got over 50%.  So, I
+>>>decided I had something, and that others might be interested, too.
+>>>
+>>>As is probably true of compact and pack (although I haven't checked),
+>>>the byte order within a word is probably relevant here, but as long as
+>>>you stay on a single machine type, you should be ok.  (Can anybody
+>>>elucidate on this?)  There are a couple of asm's in the code (extv and
+>>>insv instructions), so anyone porting it to another machine will have to
+>>>deal with this anyway (and could probably make it compatible with Vax
+>>>byte order at the same time).  Anyway, I've linted the code (both with
+>>>and without -p), so it should run elsewhere.  Note the longs in the
+>>>code, you can take these out if you reduce BITS to <= 15.
+>>>
+>>>Have fun, and as always, if you make good enhancements, or bug fixes,
+>>>I'd like to see them.
+>>>
+>>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
+>>
+>>                                     regards,
+>>                                     joe
+>>
+>>--
+>>Full-Name:  Joseph M. Orost
+>>UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
+>>US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
+>>Phone:      (201) 870-5844