Commit | Line | Data |
---|---|---|
1 | ||
2 | @(#)README 8.1 (Berkeley) 6/9/93 | |
3 | $FreeBSD: src/usr.bin/compress/doc/README,v 1.3 2002/12/30 21:18:11 schweikh Exp $ | |
4 | ||
5 | Compress version 4.0 improvements over 3.0: | |
6 | o compress() speedup (10-50%) by changing division hash to xor | |
7 | o decompress() speedup (5-10%) | |
8 | o Memory requirements reduced (3-30%) | |
9 | o Stack requirements reduced to less than 4kb | |
10 | o Removed 'Big+Fast' compress code (FBITS) because of compress speedup | |
11 | o Portability mods for Z8000 and PC/XT (but not zeus 3.2) | |
12 | o Default to 'quiet' mode | |
13 | o Unification of 'force' flags | |
14 | o Manual page overhaul | |
15 | o Portability enhancement for M_XENIX | |
16 | o Removed text on #else and #endif | |
17 | o Added "-V" switch to print version and options | |
18 | o Added #defines for SIGNED_COMPARE_SLOW | |
19 | o Added Makefile and "usermem" program | |
20 | o Removed all floating point computations | |
21 | o New programs: [deleted] | |
22 | ||
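
As an illustration of the first item in the list above, the sketch below contrasts a division (modulo) hash with an xor hash for a (character, prefix code) pair. It is not taken from compress.c: the table size 69001, the 16-bit code width, the shift of 8, and both function names are assumptions chosen for the example.

```c
/* Assumed values for illustration only. */
#define HASH_SIZE 69001L   /* number of slots in the hash table */
#define CODE_BITS 16       /* maximum width of a code, in bits */

/* Division hash: build a combined key, then reduce it with a modulo. */
static long hash_by_division(int c, long ent)
{
    long fcode = ((long)c << CODE_BITS) + ent;
    return fcode % HASH_SIZE;
}

/* Xor hash: shift the character and xor in the prefix code.
 * For c <= 255 and ent <= 65535 the result is at most 65535,
 * which is below HASH_SIZE, so no division is needed. */
static long hash_by_xor(int c, long ent)
{
    return ((long)c << 8) ^ ent;
}
```

With these assumed sizes the xor form replaces one division per lookup with a shift and an xor, which is the kind of saving the list item describes.
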
23 | The "usermem" script attempts to determine the maximum process size. Some | |
24 | editing of the script may be necessary (see the comments). [It should work | |
25 | fine on 4.3 BSD.] If you can't get it to work at all, just create a file | |
26 | "USERMEM" containing the maximum process size in decimal. | |
27 | ||
28 | The following preprocessor symbols control the compilation of "compress.c": | |
29 | ||
30 | o USERMEM              Maximum process memory on the system | |
31 | o SACREDMEM            Amount to reserve for other processes | |
32 | o SIGNED_COMPARE_SLOW  Unsigned compare instructions are faster | |
33 | o NO_UCHAR             Don't use "unsigned char" types | |
34 | o BITS                 Overrules default set by USERMEM-SACREDMEM | |
35 | o vax                  Generate inline assembler | |
36 | o interdata            Defines SIGNED_COMPARE_SLOW | |
37 | o M_XENIX              Makes arrays < 65536 bytes each | |
38 | o pdp11                BITS=12, NO_UCHAR | |
39 | o z8000                BITS=12 | |
40 | o pcxt                 BITS=12 | |
41 | o BSD4_2               Allow long filenames ( > 14 characters) & | |
42 |                        Call setlinebuf(stderr) | |
43 | ||
44 | The difference "usermem-sacredmem" determines the maximum BITS that can be | |
45 | specified with the "-b" flag. | |
46 | ||
47 | memory: at least   BITS | |
48 | ------  -- -----   ---- | |
49 |          433,484     16 | |
50 |          229,600     15 | |
51 |          127,536     14 | |
52 |           73,464     13 | |
53 |                0     12 | |
54 | ||
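
The table above can be read as a simple threshold lookup. The following is a minimal sketch of that lookup, not code from compress.c; the function name and its parameter are hypothetical, and the thresholds are copied from the table.

```c
/* Hypothetical helper: returns the largest BITS value the table allows
 * for a given amount of available memory (usermem - sacredmem, in bytes). */
static int max_bits_for_memory(long avail)
{
    if (avail >= 433484L) return 16;
    if (avail >= 229600L) return 15;
    if (avail >= 127536L) return 14;
    if (avail >= 73464L)  return 13;
    return 12;              /* 12 bits is the floor in the table */
}
```

For example, with 300,000 bytes available the function returns 15, because 300,000 falls between the 229,600 and 433,484 rows.
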
55 | The default is BITS=16. | |
56 | ||
57 | The maximum bits can be overruled by specifying "-DBITS=bits" at | |
58 | compilation time. | |
59 | ||
60 | WARNING: files compressed on a large machine with more bits than allowed by | |
61 | a version of compress on a smaller machine cannot be decompressed! Use the | |
62 | "-b12" flag to generate a file on a large machine that can be uncompressed | |
63 | on a 16-bit machine. | |
64 | ||
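
Whether a given file falls under the warning above can in principle be checked from its header before decompression is attempted. The sketch below assumes the file starts with the magic number \037\235 mentioned in the 2.0 notes later in this README. The layout of the third byte (low five bits hold the maximum bits, high bit marks block mode) is an assumption based on the compress.c sources and should be checked against your copy. All names are hypothetical.

```c
#include <stddef.h>

#define Z_MAGIC_0    0x1f   /* \037 */
#define Z_MAGIC_1    0x9d   /* \235 */
#define Z_BIT_MASK   0x1f   /* assumed: low five bits = maximum code bits */
#define Z_BLOCK_MASK 0x80   /* assumed: high bit = block compression used */

/* Returns the maximum bits recorded in the header, or -1 if the
 * buffer does not look like compress output. */
static int z_header_bits(const unsigned char *hdr, size_t len)
{
    if (len < 3 || hdr[0] != Z_MAGIC_0 || hdr[1] != Z_MAGIC_1)
        return -1;
    return hdr[2] & Z_BIT_MASK;
}

/* Returns 1 if the block-mode flag is set, 0 if not, -1 on a bad header. */
static int z_header_block_mode(const unsigned char *hdr, size_t len)
{
    if (len < 3 || hdr[0] != Z_MAGIC_0 || hdr[1] != Z_MAGIC_1)
        return -1;
    return (hdr[2] & Z_BLOCK_MASK) != 0;
}

/* Returns 1 if a build limited to local_max_bits could decompress the file. */
static int z_can_decompress(const unsigned char *hdr, size_t len,
                            int local_max_bits)
{
    int bits = z_header_bits(hdr, len);
    return bits >= 0 && bits <= local_max_bits;
}
```

Under these assumptions, a file written with "-b12" would pass z_can_decompress() for any build with BITS of 12 or more, while a 16-bit file would be rejected by a BITS=12 build.
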
65 | The output of compress 4.0 is fully compatible with that of compress 3.0. | |
66 | In other words, the output of compress 4.0 may be fed into uncompress 3.0 or | |
67 | the output of compress 3.0 may be fed into uncompress 4.0. | |
68 | ||
69 | The output of compress 4.0 is not compatible with that of | |
70 | compress 2.0. However, compress 4.0 still accepts the output of | |
71 | compress 2.0. To generate output that is compatible with compress | |
72 | 2.0, use the undocumented "-C" flag. | |
73 | ||
74 | -from mod.sources, submitted by vax135!petsd!joe (Joe Orost), 8/1/85 | |
75 | -------------------------------- | |
76 | ||
77 | Enclosed is compress version 3.0 with the following changes: | |
78 | ||
79 | 1. "Block" compression is performed. After the BITS run out, the | |
80 | compression ratio is checked every so often. If it is decreasing, | |
81 | the table is cleared and a new set of substrings is generated. | |
82 | ||
83 | This makes the output of compress 3.0 not compatible with that of | |
84 | compress 2.0. However, compress 3.0 still accepts the output of | |
85 | compress 2.0. To generate output that is compatible with compress | |
86 | 2.0, use the undocumented "-C" flag. | |
87 | ||
88 | 2. A quiet "-q" flag has been added for use by the news system. | |
89 | ||
90 | 3. The character chaining has been deleted and the program now uses | |
91 | hashing. This improves the speed of the program, especially | |
92 | during decompression. Other speed improvements have been made, | |
93 | such as using putc() instead of fwrite(). | |
94 | ||
95 | 4. A large table is used on large machines when a relatively small | |
96 | number of bits is specified. This saves much time when compressing | |
97 | for a 16-bit machine on a 32-bit virtual machine. Note that the | |
98 | speed improvement only occurs when the input file is > 30000 | |
99 | characters, and the -b BITS is less than or equal to the cutoff | |
100 | described below. | |
101 | ||
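
The ratio check described in item 1 above can be sketched as follows. This is an illustration under stated assumptions, not the routine used in compress.c: the struct, the function name, and the fixed-point scale of 256 are hypothetical. Integer arithmetic is used throughout so that no floating point is required.

```c
/* Hypothetical state: best ratio seen since the last clear, scaled by 256. */
struct ratio_state {
    long best;
};

/* Called at each checkpoint once the code table is full.
 * Returns 1 if the table should be cleared, 0 otherwise. */
static int should_clear_table(struct ratio_state *st,
                              long bytes_in, long bytes_out)
{
    long ratio;

    if (bytes_out <= 0)
        return 0;                   /* nothing written yet */

    if (bytes_in > 0x007fffffL) {
        /* bytes_in << 8 could overflow a 32-bit long,
         * so scale the divisor down instead. */
        long scaled_out = bytes_out >> 8;
        ratio = (scaled_out == 0) ? 0x7fffffffL : bytes_in / scaled_out;
    } else {
        ratio = (bytes_in << 8) / bytes_out;
    }

    if (ratio >= st->best) {
        st->best = ratio;           /* ratio is holding or improving */
        return 0;
    }
    st->best = 0;                   /* ratio fell: start fresh after clear */
    return 1;
}
```

A caller would invoke this every so often while compressing and, when it returns 1, clear the table and signal the clear to the decompressor in the output stream.
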
102 | Most of these changes were made by James A. Woods (ames!jaw). Thank you | |
103 | James! | |
104 | ||
105 | To compile compress: | |
106 | ||
107 | cc -O -DUSERMEM=usermem -o compress compress.c | |
108 | ||
109 | where "usermem" is the amount of physical user memory available (in bytes). | |
110 | If any physical memory is to be reserved for other processes, add | |
111 | "-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved. | |
112 | ||
113 | The difference "usermem-sacredmem" determines the maximum BITS that can be | |
114 | specified, and the cutoff bits where the large+fast table is used. | |
115 | ||
116 | memory: at least   BITS   cutoff | |
117 | ------  -- -----   ----   ------ | |
118 |        4,718,592     16       13 | |
119 |        2,621,440     16       12 | |
120 |        1,572,864     16       11 | |
121 |        1,048,576     16       10 | |
122 |          631,808     16       -- | |
123 |          329,728     15       -- | |
124 |          178,176     14       -- | |
125 |           99,328     13       -- | |
126 |                0     12       -- | |
127 | ||
128 | The default memory size is 750,000 which gives a maximum BITS=16 and no | |
129 | large+fast table. | |
130 | ||
131 | The maximum bits can be overruled by specifying "-DBITS=bits" at | |
132 | compilation time. | |
133 | ||
134 | If your machine doesn't support unsigned characters, define "NO_UCHAR" | |
135 | when compiling. | |
136 | ||
137 | If "int" is 16 bits on your machine, define "SHORT_INT" when compiling. | |
138 | ||
139 | After compilation, move "compress" to a standard executable location, such | |
140 | as /usr/local. Then: | |
141 | cd /usr/local | |
142 | ln compress uncompress | |
143 | ln compress zcat | |
144 | ||
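
The links above work because a single binary can choose its behaviour from the name it was invoked under. A minimal sketch of that dispatch follows, assuming the program looks only at the last path component of argv[0]. The enum and function names are hypothetical, not those used in compress.c.

```c
#include <string.h>

enum run_mode { MODE_COMPRESS, MODE_UNCOMPRESS, MODE_ZCAT };

/* Pick the operating mode from the invoked program name. */
static enum run_mode mode_from_argv0(const char *argv0)
{
    const char *base = strrchr(argv0, '/');

    base = (base != NULL) ? base + 1 : argv0;   /* strip any directory part */

    if (strcmp(base, "uncompress") == 0)
        return MODE_UNCOMPRESS;     /* decompress in place */
    if (strcmp(base, "zcat") == 0)
        return MODE_ZCAT;           /* decompress to standard output */
    return MODE_COMPRESS;           /* default behaviour */
}
```

In main() this would be called once as mode_from_argv0(argv[0]) before any flags are parsed.
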
145 | On machines that have a fixed stack size (such as Perkin-Elmer), set the | |
146 | stack to at least 12kb. ("setstack compress 12" on Perkin-Elmer). | |
147 | ||
148 | Next, install the manual (compress.l). | |
149 | cp compress.l /usr/man/manl | |
150 | cd /usr/man/manl | |
151 | ln compress.l uncompress.l | |
152 | ln compress.l zcat.l | |
153 | ||
154 | - or - | |
155 | ||
156 | cp compress.l /usr/man/man1/compress.1 | |
157 | cd /usr/man/man1 | |
158 | ln compress.1 uncompress.1 | |
159 | ln compress.1 zcat.1 | |
160 | ||
161 | regards, | |
162 | petsd!joe | |
163 | ||
164 | Here is a note from the net: | |
165 | ||
166 | >From hplabs!pesnta!amd!turtlevax!ken Sat Jan 5 03:35:20 1985 | |
167 | Path: ames!hplabs!pesnta!amd!turtlevax!ken | |
168 | From: ken@turtlevax.UUCP (Ken Turkowski) | |
169 | Newsgroups: net.sources | |
170 | Subject: Re: Compress release 3.0 : sample Makefile | |
171 | Organization: CADLINC, Inc. @ Menlo Park, CA | |
172 | ||
173 | In the compress 3.0 source recently posted to mod.sources, there is a | |
174 | #define variable which can be set for optimum performance on a machine | |
175 | with a large amount of memory. A program (usermem) to calculate the | |
176 | usable amount of physical user memory is enclosed, as well as a sample | |
177 | 4.2BSD Vax Makefile for compress. | |
178 | ||
179 | Here is the README file from the previous version of compress (2.0): | |
180 | ||
181 | >Enclosed is compress.c version 2.0 with the following bugs fixed: | |
182 | > | |
183 | >1. The packed files produced by compress are different on different | |
184 | > machines and dependent on the vax sysgen option. | |
185 | > The bug was in the different byte/bit ordering on the | |
186 | > various machines. This has been fixed. | |
187 | > | |
188 | > This version is NOT compatible with the original vax posting | |
189 | > unless the '-DCOMPATIBLE' option is specified to the C | |
190 | > compiler. The original posting has a bug which I fixed, | |
191 | > causing incompatible files. I recommend you NOT to use this | |
192 | > option unless you already have a lot of packed files from | |
193 | > the original posting by Thomas. | |
194 | >2. The exit status is not well defined (on some machines) causing the | |
195 | > scripts to fail. | |
196 | > The exit status is now 0,1 or 2 and is documented in | |
197 | > compress.l. | |
198 | >3. The function getopt() is not available in all C libraries. | |
199 | > The function getopt() is no longer referenced by the | |
200 | > program. | |
201 | >4. Error status is not being checked on the fwrite() and fflush() calls. | |
202 | > Fixed. | |
203 | > | |
204 | >The following enhancements have been made: | |
205 | > | |
206 | >1. Added facilities of "compact" into the compress program. "Pack", | |
207 | > "Unpack", and "Pcat" are no longer required (no longer supplied). | |
208 | >2. Installed work around for C compiler bug with "-O". | |
209 | >3. Added a magic number header (\037\235). Put the bits specified | |
210 | > in the file. | |
211 | >4. Added "-f" flag to force overwrite of output file. | |
212 | >5. Added "-c" flag and "zcat" program. 'ln compress zcat' after you | |
213 | > compile. | |
214 | >6. The 'uncompress' script has been deleted; simply | |
215 | > 'ln compress uncompress' after you compile and it will work. | |
216 | >7. Removed extra bit masking for machines that support unsigned | |
217 | > characters. If your machine doesn't support unsigned characters, | |
218 | > define "NO_UCHAR" when compiling. | |
219 | > | |
220 | >Compile "compress.c" with "-O -o compress" flags. Move "compress" to a | |
221 | >standard executable location, such as /usr/local. Then: | |
222 | > cd /usr/local | |
223 | > ln compress uncompress | |
224 | > ln compress zcat | |
225 | > | |
226 | >On machines that have a fixed stack size (such as Perkin-Elmer), set the | |
227 | >stack to at least 12kb. ("setstack compress 12" on Perkin-Elmer). | |
228 | > | |
229 | >Next, install the manual (compress.l). | |
230 | > cp compress.l /usr/man/manl - or - | |
231 | > cp compress.l /usr/man/man1/compress.1 | |
232 | > | |
233 | >Here is the README that I sent with my first posting: | |
234 | > | |
235 | >>Enclosed is a modified version of compress.c, along with scripts to make it | |
236 | >>run identically to pack(1), unpack(1), and pcat(1). Here is what I | |
237 | >>(petsd!joe) and a colleague (petsd!peora!srd) did: | |
238 | >> | |
239 | >>1. Removed VAX dependencies. | |
240 | >>2. Changed the struct to separate arrays; saves mucho memory. | |
241 | >>3. Did comparisons in unsigned, where possible. (Faster on Perkin-Elmer.) | |
242 | >>4. Sorted the character next chain and changed the search to stop | |
243 | >>prematurely. This saves a lot on the execution time when compressing. | |
244 | >> | |
245 | >>This version is totally compatible with the original version. Even though | |
246 | >>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit | |
247 | >>machine, due to the size of the arrays. | |
248 | >> | |
249 | >>Here is the README file from the original author: | |
250 | >> | |
251 | >>>Well, with all this discussion about file compression (for news batching | |
252 | >>>in particular) going around, I decided to implement the text compression | |
253 | >>>algorithm described in the June Computer magazine. The author claimed | |
254 | >>>blinding speed and good compression ratios. It's certainly faster than | |
255 | >>>compact (but, then, what wouldn't be), but it's also the same speed as | |
256 | >>>pack, and gets better compression than both of them. On 350K bytes of | |
257 | >>>Unix-wizards, compact took about 8 minutes of CPU, pack took about 80 | |
258 | >>>seconds, and compress (herein) also took 80 seconds. But, compact and | |
259 | >>>pack got about 30% compression, whereas compress got over 50%. So, I | |
260 | >>>decided I had something, and that others might be interested, too. | |
261 | >>> | |
262 | >>>As is probably true of compact and pack (although I haven't checked), | |
263 | >>>the byte order within a word is probably relevant here, but as long as | |
264 | >>>you stay on a single machine type, you should be ok. (Can anybody | |
265 | >>>elucidate on this?) There are a couple of asm's in the code (extv and | |
266 | >>>insv instructions), so anyone porting it to another machine will have to | |
267 | >>>deal with this anyway (and could probably make it compatible with Vax | |
268 | >>>byte order at the same time). Anyway, I've linted the code (both with | |
269 | >>>and without -p), so it should run elsewhere. Note the longs in the | |
270 | >>>code, you can take these out if you reduce BITS to <= 15. | |
271 | >>> | |
272 | >>>Have fun, and as always, if you make good enhancements, or bug fixes, | |
273 | >>>I'd like to see them. | |
274 | >>> | |
275 | >>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas) | |
276 | >> | |
277 | >> regards, | |
278 | >> joe | |
279 | >> | |
280 | >>-- | |
281 | >>Full-Name: Joseph M. Orost | |
282 | >>UUCP: ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe | |
283 | >>US Mail: MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724 | |
284 | >>Phone: (201) 870-5844 |