<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % aptent SYSTEM "apt.ent"> %aptent;
<!ENTITY % aptverbatiment SYSTEM "apt-verbatim.ent"> %aptverbatiment;
<!ENTITY % aptvendor SYSTEM "apt-vendor.ent"> %aptvendor;
]>
8
9 <book lang="en">
10
11 <title>APT Method Interface</title>
12
13 <bookinfo>
14
15 <authorgroup>
16 <author>
17 <personname>Jason Gunthorpe</personname><email>jgg@debian.org</email>
18 </author>
19 </authorgroup>
20
21 <releaseinfo>Version &apt-product-version;</releaseinfo>
22
23 <abstract>
24 <para>
This document describes the interface between APT and its archive access
methods.
27 </para>
28 </abstract>
29
30 <copyright><year>1998</year><holder>Jason Gunthorpe</holder></copyright>
31
32 <legalnotice>
33 <title>License Notice</title>
34 <para>
35 "APT" and this document are free software; you can redistribute them and/or
36 modify them under the terms of the GNU General Public License as published by
37 the Free Software Foundation; either version 2 of the License, or (at your
38 option) any later version.
39 </para>
40 <para>
On Debian systems, the full text of the license can be found in
/usr/share/common-licenses/GPL.
43 </para>
44 </legalnotice>
45
46 </bookinfo>
47
48 <chapter id="ch1"><title>Introduction</title>
49
50 <section id="s1.1"><title>General</title>
51 <para>
52 The APT method interface allows APT to acquire archive files (.deb), index
53 files (Packages, Release, Mirrors) and source files (.tar.gz, .diff). It is a
54 general, extensible system designed to satisfy all of these requirements:
55 </para>
56 <orderedlist numeration="arabic">
57 <listitem>
58 <para>
59 Remote methods that download files from a distant site
60 </para>
61 </listitem>
62 <listitem>
63 <para>
64 Resume of aborted downloads
65 </para>
66 </listitem>
67 <listitem>
68 <para>
69 Progress reporting
70 </para>
71 </listitem>
72 <listitem>
73 <para>
74 If-Modified-Since (IMS) checking for index files
75 </para>
76 </listitem>
77 <listitem>
78 <para>
79 In-Line MD5 generation
80 </para>
81 </listitem>
82 <listitem>
83 <para>
84 No-copy in-filesystem methods
85 </para>
86 </listitem>
87 <listitem>
88 <para>
Multi-media methods (like CDs)
90 </para>
91 </listitem>
92 <listitem>
93 <para>
94 Dynamic source selection for failure recovery
95 </para>
96 </listitem>
97 <listitem>
98 <para>
99 User interaction for user/password requests and media swaps
100 </para>
101 </listitem>
102 <listitem>
103 <para>
104 Global configuration
105 </para>
106 </listitem>
107 </orderedlist>
108 <para>
109 Initial releases of APT (0.1.x) used a completely different method interface
110 that only supported the first 6 items. This new interface deals with the
111 remainder.
112 </para>
113 </section>
114
115 <section id="s1.2"><title>Terms</title>
116 <para>
Several terms are used throughout this document with specific meanings that
may not be immediately evident. They are summarized here.
119 </para>
120 <variablelist>
121 <varlistentry>
122 <term>source</term>
123 <listitem>
124 <para>
Refers to an item in the source list. More specifically, it is the broken-down
item; that is, each source maps to exactly one index file. Archive sources map
to Packages files and source code sources map to Sources files.
128 </para>
129 </listitem>
130 </varlistentry>
131 <varlistentry>
132 <term>archive file</term>
133 <listitem>
134 <para>
135 Refers to a binary package archive (.deb, .rpm, etc).
136 </para>
137 </listitem>
138 </varlistentry>
139 <varlistentry>
140 <term>source file</term>
141 <listitem>
142 <para>
Refers to one of the files making up the source code of a package. In Debian
it is one of .diff.gz, .dsc, or .tar.gz.
145 </para>
146 </listitem>
147 </varlistentry>
148 <varlistentry>
149 <term>URI</term>
150 <listitem>
151 <para>
A Uniform Resource Identifier (URI) is a superset of the familiar URL
syntax used by web browsers. It consists of an access specification
followed by a specific location in that access space. The form is
&lt;access&gt;:&lt;location&gt;. Network addresses are given in the form
&lt;access&gt;://[&lt;user&gt;[:&lt;pass&gt;]@]hostname[:port]/&lt;location&gt;.
Some examples:
158 </para>
159 <screen>
160 file:/var/mirrors/debian/
161 ftp://ftp.debian.org/debian
162 ftp://jgg:MooCow@localhost:21/debian
163 nfs://bigred/var/mirrors/debian
164 rsync://debian.midco.net/debian
165 cdrom:Debian 2.0r1 Disk 1/
166 </screen>
167 </listitem>
168 </varlistentry>
169 <varlistentry>
170 <term>method</term>
171 <listitem>
172 <para>
There is a one-to-one mapping of URI access specifiers to methods. A method is
a program that knows how to handle a URI access type and operates according to
the specifications in this document.
176 </para>
177 </listitem>
178 </varlistentry>
179 <varlistentry>
180 <term>method instance</term>
181 <listitem>
182 <para>
183 A specific running method. There can be more than one instance of each method
184 as APT is capable of concurrent method handling.
185 </para>
186 </listitem>
187 </varlistentry>
188 <varlistentry>
189 <term>message</term>
190 <listitem>
191 <para>
A series of lines terminated by a blank line, sent down one of the communication
lines. The first line should have the form xxx TAG, where xxx are digits
forming the status code and TAG is an informational string.
195 </para>
196 </listitem>
197 </varlistentry>
198 <varlistentry>
199 <term>acquire</term>
200 <listitem>
201 <para>
The act of bringing a URI into the local pathname space. This may simply be
verifying the existence of the URI or actually downloading it from a remote
site.
205 </para>
206 </listitem>
207 </varlistentry>
208 </variablelist>
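<para>
As an illustration of the &lt;access&gt;:&lt;location&gt; form described under
URI, the following is a minimal Python sketch of how a caller might split a URI
to find the method responsible for it. The names
<literal>SplitURI</literal> and <literal>split_uri</literal> are hypothetical
and are not part of APT.
</para>
<programlisting><![CDATA[
```python
from typing import NamedTuple


class SplitURI(NamedTuple):
    access: str    # access specifier; selects the method (one-to-one mapping)
    location: str  # everything after the first colon, passed on untouched


def split_uri(uri: str) -> SplitURI:
    """Split a URI of the form <access>:<location> at the first colon."""
    access, sep, location = uri.partition(":")
    if not sep or not access:
        raise ValueError(f"not of the form <access>:<location>: {uri!r}")
    return SplitURI(access, location)
```
]]></programlisting>
<para>
Splitting at the first colon only keeps network forms such as
ftp://jgg:MooCow@localhost:21/debian intact in the location part, so the
method itself remains free to interpret user, password, host and port.
</para>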
209 </section>
210
211 </chapter>
212
213 <chapter id="ch2"><title>Specification</title>
214
215 <section id="s2.1"><title>Overview</title>
216 <para>
All methods operate as a subprocess of a main controlling parent. Three file
descriptors (FDs) are opened for use by the method, allowing two-way
communication and emergency error reporting. The FDs correspond to the
well-known Unix FDs stdin, stdout and stderr.
221 </para>
222 <para>
Throughout the operation of the method, communication is done via HTTP-style
plain text. Specifically, RFC 822 fields (like those in the Packages file) are
used to describe items, and a numeric header is used to indicate what is
happening. Each of these distinct messages should be sent quickly and without
pause.
227 </para>
228 <para>
In some instances APT may pre-invoke a method to allow things like file URIs
to determine how many files are available locally.
231 </para>
232 </section>
233
234 <section id="s2.2"><title>Message Overview</title>
235 <para>
The first line of each message is called the message header. The first three
digits (called the Status Code) have the usual meaning found in the HTTP
protocol: 1xx is informational, 2xx is successful and 4xx is failure. The 6xx
series is used for messages sent to the method. The status code is followed by
an informational string provided for visual debugging.
241 </para>
242 <itemizedlist>
243 <listitem>
244 <para>
245 100 Capabilities - Method capabilities
246 </para>
247 </listitem>
248 <listitem>
249 <para>
250 101 Log - General Logging
251 </para>
252 </listitem>
253 <listitem>
254 <para>
255 102 Status - Inter-URI status reporting (login progress)
256 </para>
257 </listitem>
258 <listitem>
259 <para>
260 200 URI Start - URI is starting acquire
261 </para>
262 </listitem>
263 <listitem>
264 <para>
265 201 URI Done - URI is finished acquire
266 </para>
267 </listitem>
268 <listitem>
269 <para>
270 400 URI Failure - URI has failed to acquire
271 </para>
272 </listitem>
273 <listitem>
274 <para>
275 401 General Failure - Method did not like something sent to it
276 </para>
277 </listitem>
278 <listitem>
279 <para>
280 402 Authorization Required - Method requires authorization to access the URI.
281 Authorization is User/Pass
282 </para>
283 </listitem>
284 <listitem>
285 <para>
286 403 Media Failure - Method requires a media change
287 </para>
288 </listitem>
289 <listitem>
290 <para>
291 600 URI Acquire - Request a URI be acquired
292 </para>
293 </listitem>
294 <listitem>
295 <para>
296 601 Configuration - Sends the configuration space
297 </para>
298 </listitem>
299 <listitem>
300 <para>
301 602 Authorization Credentials - Response to the 402 message
302 </para>
303 </listitem>
304 <listitem>
305 <para>
306 603 Media Changed - Response to the 403 message
307 </para>
308 </listitem>
309 </itemizedlist>
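<para>
The header layout and the meaning of each series can be sketched in Python as
follows. This is only an illustration under the assumption that a single space
separates the status code from the informational string; the function names
<literal>parse_header</literal> and <literal>series_of</literal> are
hypothetical.
</para>
<programlisting><![CDATA[
```python
import re
from typing import Tuple

# Three ASCII digits, optionally followed by one space and free text.
_HEADER = re.compile(r"([0-9]{3})(?: (.*))?")

# Series meanings as described in the text above.
_SERIES = {
    1: "informational",
    2: "successful",
    4: "failure",
    6: "sent to the method",
}


def parse_header(line: str) -> Tuple[int, str]:
    """Split a message header into (status code, informational string)."""
    match = _HEADER.fullmatch(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"malformed message header: {line!r}")
    return int(match.group(1)), match.group(2) or ""


def series_of(code: int) -> str:
    """Name the series a status code belongs to."""
    return _SERIES.get(code // 100, "unassigned")
```
]]></programlisting>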
310 <para>
Only the 6xx series of status codes is sent to the method, and the method may
not emit status codes in the 6xx range. After sending a 402 or 403, the method
must continue reading all other 6xx messages until the matching 602 or 603 is
received. This means the method must be capable of handling an unlimited
number of 600 messages.
316 </para>
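<para>
A minimal Python sketch of this requirement, assuming messages have already
been decoded into (status code, fields) pairs: while the method waits for the
reply to a 402 or 403, every other 6xx message is parked on a queue instead of
being dropped. The name <literal>await_reply</literal> and the tuple
representation are assumptions made for the example.
</para>
<programlisting><![CDATA[
```python
from collections import deque
from typing import Deque, Dict, Iterator, Tuple

Msg = Tuple[int, Dict[str, str]]  # (status code, fields)


def await_reply(incoming: Iterator[Msg], wanted: int,
                pending: Deque[Msg]) -> Dict[str, str]:
    """Read until the `wanted` code (602 or 603) arrives.

    Every other message seen in the meantime, typically 600 URI Acquire,
    is appended to `pending` so it can be serviced afterwards.
    """
    for code, fields in incoming:
        if code == wanted:
            return fields
        pending.append((code, fields))
    raise EOFError(f"input ended before a {wanted} message arrived")
```
]]></programlisting>
<para>
Because the queue is unbounded, the sketch accepts any number of 600 messages
while it waits, which is the property the paragraph above asks for.
</para>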
317 <para>
The flow of messages starts with the method sending out a <emphasis>100
Capabilities</emphasis> and APT sending out a <emphasis>601
Configuration</emphasis>. After that APT begins sending <emphasis>600 URI
Acquire</emphasis> and the method sends out <emphasis>200 URI Start</emphasis>,
<emphasis>201 URI Done</emphasis> or <emphasis>400 URI Failure</emphasis>. No
synchronization is performed: APT may send <emphasis>600 URI
Acquire</emphasis> messages at any time, and the method is expected to queue
them. This allows methods like http to pipeline requests to the remote
server. Note, however, that APT buffers messages, so the method does not need
to be constantly ready to receive them.
328 </para>
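<para>
The message flow can be sketched as a toy method in Python. It reads
blank-line-terminated messages from one stream and writes replies to another;
in a real method these would be stdin and stdout. The sketch is deliberately
simplistic: it answers each <emphasis>600 URI Acquire</emphasis> immediately
with a <emphasis>201 URI Done</emphasis> rather than queueing and transferring
anything, and the function names and the Version value are assumptions for
illustration.
</para>
<programlisting><![CDATA[
```python
from typing import Iterator, List, TextIO, Tuple

Fields = List[Tuple[str, str]]


def read_messages(stream: TextIO) -> Iterator[Tuple[str, Fields]]:
    """Yield (header, fields) for each blank-line-terminated message."""
    block: List[str] = []
    for raw in stream:
        line = raw.rstrip("\r\n")
        if line:
            block.append(line)
        elif block:
            yield _split(block)
            block = []
    if block:  # tolerate a last message that lacks its blank line
        yield _split(block)


def _split(block: List[str]) -> Tuple[str, Fields]:
    """Separate the header line from the 'Name: value' field lines."""
    fields: Fields = []
    for line in block[1:]:
        name, _, value = line.partition(":")
        fields.append((name.strip(), value.strip()))
    return block[0], fields


def serve(inp: TextIO, out: TextIO) -> None:
    """Toy method: announce itself, then answer every 600 with a 201."""
    out.write("100 Capabilities\nVersion: 1.0\n\n")  # Version is made up
    for header, fields in read_messages(inp):
        if header.startswith("601"):
            continue  # a real method would pick out the Config-Items it needs
        if header.startswith("600"):
            wanted = dict(fields)
            out.write("201 URI Done\n")
            out.write(f"URI: {wanted.get('URI', '')}\n")
            out.write(f"Filename: {wanted.get('Filename', '')}\n\n")
    out.flush()
```
]]></programlisting>
<para>
Calling <literal>serve(sys.stdin, sys.stdout)</literal> would turn the sketch
into a standalone program that can be driven by hand for experimentation.
</para>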
329 </section>
330
331 <section id="s2.3"><title>Header Fields</title>
332 <para>
The following is a short index of the header fields that are supported:
334 </para>
335 <variablelist>
336 <varlistentry>
337 <term>URI</term>
338 <listitem>
339 <para>
340 URI being described by the message
341 </para>
342 </listitem>
343 </varlistentry>
344 <varlistentry>
345 <term>Filename</term>
346 <listitem>
347 <para>
348 Location in the filesystem
349 </para>
350 </listitem>
351 </varlistentry>
352 <varlistentry>
353 <term>Last-Modified</term>
354 <listitem>
355 <para>
356 A time stamp in RFC1123 notation for use by IMS checks
357 </para>
358 </listitem>
359 </varlistentry>
360 <varlistentry>
361 <term>IMS-Hit</term>
362 <listitem>
363 <para>
364 The already existing item is valid
365 </para>
366 </listitem>
367 </varlistentry>
368 <varlistentry>
369 <term>Size</term>
370 <listitem>
371 <para>
372 Size of the file in bytes
373 </para>
374 </listitem>
375 </varlistentry>
376 <varlistentry>
377 <term>Resume-Point</term>
378 <listitem>
379 <para>
Location in the file at which the transfer was started
381 </para>
382 </listitem>
383 </varlistentry>
384 <varlistentry>
385 <term>MD5-Hash</term>
386 <listitem>
387 <para>
388 Computed MD5 hash for the file
389 </para>
390 </listitem>
391 </varlistentry>
392 <varlistentry>
393 <term>Message</term>
394 <listitem>
395 <para>
396 String indicating some displayable message
397 </para>
398 </listitem>
399 </varlistentry>
400 <varlistentry>
401 <term>Media</term>
402 <listitem>
403 <para>
404 String indicating the media name required
405 </para>
406 </listitem>
407 </varlistentry>
408 <varlistentry>
409 <term>Site</term>
410 <listitem>
411 <para>
412 String indicating the site authorization is required for
413 </para>
414 </listitem>
415 </varlistentry>
416 <varlistentry>
417 <term>User</term>
418 <listitem>
419 <para>
420 Username for authorization
421 </para>
422 </listitem>
423 </varlistentry>
424 <varlistentry>
425 <term>Password</term>
426 <listitem>
427 <para>
428 Password for authorization
429 </para>
430 </listitem>
431 </varlistentry>
432 <varlistentry>
433 <term>Fail</term>
434 <listitem>
435 <para>
436 Operation failed
437 </para>
438 </listitem>
439 </varlistentry>
440 <varlistentry>
441 <term>Drive</term>
442 <listitem>
443 <para>
444 Drive the media should be placed in
445 </para>
446 </listitem>
447 </varlistentry>
448 <varlistentry>
449 <term>Config-Item</term>
450 <listitem>
451 <para>
452 A string of the form
453 <replaceable>item</replaceable>=<replaceable>value</replaceable> derived from
454 the APT configuration space. These may include method specific values and
455 general values not related to the method. It is up to the method to filter out
456 the ones it wants.
457 </para>
458 </listitem>
459 </varlistentry>
460 <varlistentry>
461 <term>Single-Instance</term>
462 <listitem>
463 <para>
Requires that only one instance of the method be run. This is a yes/no value.
465 </para>
466 </listitem>
467 </varlistentry>
468 <varlistentry>
469 <term>Pipeline</term>
470 <listitem>
471 <para>
472 The method is capable of pipelining.
473 </para>
474 </listitem>
475 </varlistentry>
476 <varlistentry>
477 <term>Local</term>
478 <listitem>
479 <para>
480 The method only returns Filename: fields.
481 </para>
482 </listitem>
483 </varlistentry>
484 <varlistentry>
485 <term>Send-Config</term>
486 <listitem>
487 <para>
488 Send configuration to the method.
489 </para>
490 </listitem>
491 </varlistentry>
492 <varlistentry>
493 <term>Needs-Cleanup</term>
494 <listitem>
495 <para>
496 The process is kept around while the files it returned are being used. This is
497 primarily intended for CD-ROM and File URIs that need to unmount filesystems.
498 </para>
499 </listitem>
500 </varlistentry>
501 <varlistentry>
502 <term>Version</term>
503 <listitem>
504 <para>
505 Version string for the method
506 </para>
507 </listitem>
508 </varlistentry>
509 </variablelist>
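<para>
To show how these fields appear on the wire, the following Python sketch
serializes a message as a header line, one Name: value line per field, and a
terminating blank line. The function name
<literal>format_message</literal> is hypothetical. Fields are passed as a
sequence of pairs rather than a mapping so that a field such as Config-Item
can be repeated.
</para>
<programlisting><![CDATA[
```python
from typing import Iterable, Tuple


def format_message(code: int, tag: str,
                   fields: Iterable[Tuple[str, str]]) -> str:
    """Render one message: header, 'Name: value' lines, blank line."""
    lines = [f"{code:03d} {tag}"]
    for name, value in fields:
        # A newline inside a field would end the message early, and a colon
        # in a name would make the line ambiguous, so both are rejected.
        if "\n" in name or "\n" in value or ":" in name:
            raise ValueError(f"field cannot be encoded: {name!r}")
        lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n\n"
```
]]></programlisting>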
510 <para>
This is a list of which headers each status code can use:
512 </para>
513 <variablelist>
514 <varlistentry>
515 <term>100 Capabilities</term>
516 <listitem>
517 <para>
Displays the capabilities of the method. Methods should set the Pipeline field
if their underlying protocol supports pipelining. The only known method that
supports pipelining is http. Fields: Version, Single-Instance, Pre-Scan,
Pipeline, Send-Config, Needs-Cleanup
522 </para>
523 </listitem>
524 </varlistentry>
525 <varlistentry>
526 <term>101 Log</term>
527 <listitem>
528 <para>
529 A log message may be printed to the screen if debugging is enabled. This is
530 only for debugging the method. Fields: Message
531 </para>
532 </listitem>
533 </varlistentry>
534 <varlistentry>
535 <term>102 Status</term>
536 <listitem>
537 <para>
538 Message gives a progress indication for the method. It can be used to show
539 pre-transfer status for Internet type methods. Fields: Message
540 </para>
541 </listitem>
542 </varlistentry>
543 <varlistentry>
544 <term>200 URI Start</term>
545 <listitem>
546 <para>
547 Indicates the URI is starting to be transferred. The URI is specified along
548 with stats about the file itself. Fields: URI, Size, Last-Modified,
549 Resume-Point
550 </para>
551 </listitem>
552 </varlistentry>
553 <varlistentry>
554 <term>201 URI Done</term>
555 <listitem>
556 <para>
557 Indicates that a URI has completed being transferred. It is possible to
specify a <emphasis>201 URI Done</emphasis> without a <emphasis>200 URI
Start</emphasis>, which would mean no data was transferred but the file is now
560 available. A Filename field is specified when the URI is directly available in
561 the local pathname space. APT will either directly use that file or copy it
562 into another location. It is possible to return Alt-* fields to indicate that
563 another possibility for the URI has been found in the local pathname space.
564 This is done if a decompressed version of a .gz file is found. Fields: URI,
565 Size, Last-Modified, Filename, MD5-Hash
566 </para>
567 </listitem>
568 </varlistentry>
569 <varlistentry>
570 <term>400 URI Failure</term>
571 <listitem>
572 <para>
Indicates a fatal URI failure. The URI is not retrievable from this source. As
with <emphasis>201 URI Done</emphasis>, <emphasis>200 URI Start</emphasis> is
not required to precede this message. Fields: URI, Message
576 </para>
577 </listitem>
578 </varlistentry>
579 <varlistentry>
580 <term>401 General Failure</term>
581 <listitem>
582 <para>
583 Indicates that some unspecific failure has occurred and the method is unable
584 to continue. The method should terminate after sending this message. It
585 is intended to check for invalid configuration options or other severe
586 conditions. Fields: Message
587 </para>
588 </listitem>
589 </varlistentry>
590 <varlistentry>
591 <term>402 Authorization Required</term>
592 <listitem>
593 <para>
594 The method requires a Username and Password pair to continue. After sending
595 this message the method will expect APT to send a <emphasis>602 Authorization
596 Credentials</emphasis> message with the required information. It is possible
597 for a method to send this multiple times. Fields: Site
598 </para>
599 </listitem>
600 </varlistentry>
601 <varlistentry>
602 <term>403 Media Failure</term>
603 <listitem>
604 <para>
605 A method that deals with multiple media requires that a new media be
606 inserted. The Media field contains the name of the media to be
607 inserted. Fields: Media, Drive
608 </para>
609 </listitem>
610 </varlistentry>
611 <varlistentry>
612 <term>600 URI Acquire</term>
613 <listitem>
614 <para>
APT is requesting that a new URI be added to the acquire list. Last-Modified
has the time stamp of the currently cached file, if applicable. Filename is the
name of the file that the acquired URI should be written to. Fields: URI,
Filename, Last-Modified
619 </para>
620 </listitem>
621 </varlistentry>
622 <varlistentry>
623 <term>601 Configuration</term>
624 <listitem>
625 <para>
626 APT is sending the configuration space to the method. A series of Config-Item
627 fields will be part of this message, each containing an entry from the
628 configuration space. Fields: Config-Item.
629 </para>
630 </listitem>
631 </varlistentry>
632 <varlistentry>
633 <term>602 Authorization Credentials</term>
634 <listitem>
635 <para>
636 This is sent in response to a <emphasis>402 Authorization Required</emphasis>
637 message. It contains the entered username and password. Fields: Site, User,
638 Password
639 </para>
640 </listitem>
641 </varlistentry>
642 <varlistentry>
643 <term>603 Media Changed</term>
644 <listitem>
645 <para>
646 This is sent in response to a <emphasis>403 Media Failure</emphasis>
647 message. It indicates that the user has changed media and it is safe
648 to proceed. Fields: Media, Fail
649 </para>
650 </listitem>
651 </varlistentry>
652 </variablelist>
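<para>
The field lists above can be collected into a lookup table, which is handy
when debugging a method. The Python sketch below simply transcribes those
lists; the names <literal>ALLOWED_FIELDS</literal> and
<literal>unexpected_fields</literal> are hypothetical, and accepting any
field that starts with Alt- on a <emphasis>201 URI Done</emphasis> is an
assumption based on the description of that message.
</para>
<programlisting><![CDATA[
```python
from typing import Dict, FrozenSet, Iterable, List

# Transcribed from the per-status-code field lists above.
ALLOWED_FIELDS: Dict[int, FrozenSet[str]] = {
    100: frozenset({"Version", "Single-Instance", "Pre-Scan", "Pipeline",
                    "Send-Config", "Needs-Cleanup"}),
    101: frozenset({"Message"}),
    102: frozenset({"Message"}),
    200: frozenset({"URI", "Size", "Last-Modified", "Resume-Point"}),
    201: frozenset({"URI", "Size", "Last-Modified", "Filename", "MD5-Hash"}),
    400: frozenset({"URI", "Message"}),
    401: frozenset({"Message"}),
    402: frozenset({"Site"}),
    403: frozenset({"Media", "Drive"}),
    600: frozenset({"URI", "Filename", "Last-Modified"}),
    601: frozenset({"Config-Item"}),
    602: frozenset({"Site", "User", "Password"}),
    603: frozenset({"Media", "Fail"}),
}


def unexpected_fields(code: int, names: Iterable[str]) -> List[str]:
    """Return the field names not listed for `code`, in input order.

    Raises KeyError if `code` is not one of the documented status codes.
    """
    allowed = ALLOWED_FIELDS[code]
    extra = []
    for name in names:
        if code == 201 and name.startswith("Alt-"):
            continue  # assumption: Alt-* variants are accepted on 201
        if name not in allowed:
            extra.append(name)
    return extra
```
]]></programlisting>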
653 </section>
654
655 <section id="s2.4"><title>Notes</title>
656 <para>
The methods supplied by the stock APT are:
658 </para>
659 <orderedlist numeration="arabic">
660 <listitem>
661 <para>
662 cdrom - For Multi-Disc CD-ROMs
663 </para>
664 </listitem>
665 <listitem>
666 <para>
667 copy - (internal) For copying files around the filesystem
668 </para>
669 </listitem>
670 <listitem>
671 <para>
672 file - For local files
673 </para>
674 </listitem>
675 <listitem>
676 <para>
677 gzip - (internal) For decompression
678 </para>
679 </listitem>
680 <listitem>
681 <para>
682 http - For HTTP servers
683 </para>
684 </listitem>
685 </orderedlist>
686 <para>
The two internal methods, copy and gzip, are used by the acquire code to
parallelize and simplify the automatic decompression of package files as well
as copying package files around the file system. Both methods can be seen to
act the same except that one decompresses on the fly. APT uses them by
generating a copy URI that is formed identically to a file URI. The destination
file is sent as normal. The method then takes the file specified by the URI and
writes it to the destination file. A typical set of operations may be:
694 </para>
695 <screen>
696 http://foo.com/Packages.gz -&gt; /bar/Packages.gz
697 gzip:/bar/Packages.gz -&gt; /bar/Packages.decomp
698 rename Packages.decomp to /final/Packages
699 </screen>
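<para>
The last two steps of that sequence can be sketched in Python as follows.
This is not APT's gzip method, only an illustration of the same idea:
decompress into a scratch file first, then rename it into place so that the
final name never refers to a partially written file. The function name and
its parameters are hypothetical.
</para>
<programlisting><![CDATA[
```python
import gzip
import os
import shutil


def decompress_then_rename(compressed: str, scratch: str, final: str) -> None:
    """Decompress `compressed` into `scratch`, then move it to `final`."""
    with gzip.open(compressed, "rb") as src, open(scratch, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams in chunks, not all at once
    # os.replace overwrites an existing `final`; the move is atomic when
    # both paths are on the same filesystem.
    os.replace(scratch, final)
```
]]></programlisting>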
700 <para>
The http method implements a fully featured HTTP/1.1 client that supports
deep pipelining and reget. It works best when coupled with an Apache 1.3
server. The file method simply generates failure or success responses
with the Filename field set to the proper location. The cdrom method acts
the same except that it checks that the mount point has a valid CD-ROM in
it. It does this by (effectively) computing an MD5 hash of 'ls -l' on the
mount point.
708 </para>
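<para>
The idea of identifying a disc by hashing a directory listing can be sketched
in Python as shown below. This is only loosely analogous to what the cdrom
method does: which attributes are hashed here (entry name and size of the top
level of the mount point) is an assumption for the example, and the function
name <literal>listing_digest</literal> is hypothetical.
</para>
<programlisting><![CDATA[
```python
import hashlib
import os


def listing_digest(mount_point: str) -> str:
    """Return an MD5 hex digest over the sorted top-level listing."""
    with os.scandir(mount_point) as entries:
        listing = sorted(
            (entry.name, entry.stat(follow_symlinks=False).st_size)
            for entry in entries
        )
    digest = hashlib.md5()
    for name, size in listing:
        # NUL and newline separators keep adjacent entries from running
        # together into the same byte sequence.
        digest.update(f"{name}\0{size}\n".encode("utf-8", "surrogateescape"))
    return digest.hexdigest()
```
]]></programlisting>
<para>
Sorting the entries makes the digest independent of the order in which the
filesystem happens to return them, so the same disc yields the same
identifier on every mount.
</para>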
709 </section>
710
711 </chapter>
712
713 </book>