1<?xml version="1.0" encoding="UTF-8"?>
2<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
4<!ENTITY % aptent SYSTEM "apt.ent"> %aptent;
5<!ENTITY % aptverbatiment SYSTEM "apt-verbatim.ent"> %aptverbatiment;
6<!ENTITY % aptvendor SYSTEM "apt-vendor.ent"> %aptvendor;
7]>
8
9<book lang="en">
10
11<title>APT Method Interface</title>
12
13<bookinfo>
14
15<authorgroup>
16 <author>
17 <personname>Jason Gunthorpe</personname><email>jgg@debian.org</email>
18 </author>
19</authorgroup>
20
21<releaseinfo>Version &apt-product-version;</releaseinfo>
22
23<abstract>
24<para>
25This document describes the interface that APT uses to the archive access
26methods.
27</para>
28</abstract>
29
30<copyright><year>1998</year><holder>Jason Gunthorpe</holder></copyright>
31
32<legalnotice>
33<title>License Notice</title>
34<para>
35"APT" and this document are free software; you can redistribute them and/or
36modify them under the terms of the GNU General Public License as published by
37the Free Software Foundation; either version 2 of the License, or (at your
38option) any later version.
39</para>
40<para>
41For more details, on Debian systems, see the file
42/usr/share/common-licenses/GPL for the full license.
43</para>
44</legalnotice>
45
46</bookinfo>
47
48<chapter id="ch1"><title>Introduction</title>
49
50<section id="s1.1"><title>General</title>
51<para>
52The APT method interface allows APT to acquire archive files (.deb), index
53files (Packages, Release, Mirrors) and source files (.tar.gz, .diff). It is a
54general, extensible system designed to satisfy all of these requirements:
55</para>
56<orderedlist numeration="arabic">
57<listitem>
58<para>
59Remote methods that download files from a distant site
60</para>
61</listitem>
62<listitem>
63<para>
64Resume of aborted downloads
65</para>
66</listitem>
67<listitem>
68<para>
69Progress reporting
70</para>
71</listitem>
72<listitem>
73<para>
74If-Modified-Since (IMS) checking for index files
75</para>
76</listitem>
77<listitem>
78<para>
79In-Line MD5 generation
80</para>
81</listitem>
82<listitem>
83<para>
84No-copy in-filesystem methods
85</para>
86</listitem>
87<listitem>
88<para>
89Multi-media methods (like CDs)
90</para>
91</listitem>
92<listitem>
93<para>
94Dynamic source selection for failure recovery
95</para>
96</listitem>
97<listitem>
98<para>
99User interaction for user/password requests and media swaps
100</para>
101</listitem>
102<listitem>
103<para>
104Global configuration
105</para>
106</listitem>
107</orderedlist>
108<para>
109Initial releases of APT (0.1.x) used a completely different method interface
110that only supported the first 6 items. This new interface deals with the
111remainder.
112</para>
113</section>
114
115<section id="s1.2"><title>Terms</title>
116<para>
117Several terms are used throughout the document; they have specific meanings
118that may not be immediately evident. They are summarized here for clarity.
119</para>
120<variablelist>
121<varlistentry>
122<term>source</term>
123<listitem>
124<para>
125Refers to an item in the source list. More specifically, it is the broken-down
126item; that is, each source maps to exactly one index file. Archive sources map
127to Packages files and source code sources map to Sources files.
128</para>
129</listitem>
130</varlistentry>
131<varlistentry>
132<term>archive file</term>
133<listitem>
134<para>
135Refers to a binary package archive (.deb, .rpm, etc).
136</para>
137</listitem>
138</varlistentry>
139<varlistentry>
140<term>source file</term>
141<listitem>
142<para>
143Refers to one of the files making up the source code of a package. In Debian
144it is one of .diff.gz, .dsc, or .tar.gz.
145</para>
146</listitem>
147</varlistentry>
148<varlistentry>
149<term>URI</term>
150<listitem>
151<para>
152Uniform Resource Identifier (URI) is a superset of the familiar URL
153syntax used by web browsers. It consists of an access specification
154followed by a specific location in that access space. The form is
155&lt;access&gt;:&lt;location&gt;. Network addresses are given with the form
156&lt;access&gt;://[&lt;user&gt;[:&lt;pass&gt;]@]hostname[:port]/&lt;location&gt;.
157Some examples:
158</para>
159<screen>
160file:/var/mirrors/debian/
161ftp://ftp.debian.org/debian
162ftp://jgg:MooCow@localhost:21/debian
163nfs://bigred/var/mirrors/debian
164rsync://debian.midco.net/debian
165cdrom:Debian 2.0r1 Disk 1/
166</screen>
167</listitem>
168</varlistentry>
169<varlistentry>
170<term>method</term>
171<listitem>
172<para>
173There is a one-to-one mapping of URI access specifiers to methods. A method is
174a program that knows how to handle a URI access type and operates according to
175the specifications in this file.
176</para>
177</listitem>
178</varlistentry>
179<varlistentry>
180<term>method instance</term>
181<listitem>
182<para>
183A specific running method. There can be more than one instance of each method
184as APT is capable of concurrent method handling.
185</para>
186</listitem>
187</varlistentry>
188<varlistentry>
189<term>message</term>
190<listitem>
191<para>
192A series of lines terminated by a blank line sent down one of the communication
193lines. The first line should have the form xxx TAG, where xxx are digits
194forming the status code and TAG is an informational string.
195</para>
196</listitem>
197</varlistentry>
198<varlistentry>
199<term>acquire</term>
200<listitem>
201<para>
202The act of bringing a URI into the local pathname space. This may simply be
203verifying the existence of the URI or actually downloading it from a remote
204site.
205</para>
206</listitem>
207</varlistentry>
208</variablelist>
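<para>
The URI forms listed above can be taken apart with very little code. The
following Python sketch is illustrative only: <literal>split_uri</literal> is a
hypothetical helper, not part of APT, and the network case relies on the
standard library's <literal>urlsplit</literal> rather than on APT's own parser.
</para>
<programlisting>
```python
from urllib.parse import urlsplit


def split_uri(uri):
    """Split a URI into its access specifier and its location."""
    access, sep, location = uri.partition(":")
    if not sep or not access:
        raise ValueError("not of the form access:location: " + repr(uri))
    return access, location


# Local and named-media URIs only carry an access type and a location.
print(split_uri("file:/var/mirrors/debian/"))
print(split_uri("cdrom:Debian 2.0r1 Disk 1/"))

# Network URIs can additionally be broken into user, password, host and port.
parts = urlsplit("ftp://jgg:MooCow@localhost:21/debian")
print(parts.scheme, parts.username, parts.password,
      parts.hostname, parts.port, parts.path)
```
</programlisting>
<para>
The access specifier returned by the first step is what selects the method, as
described under the <emphasis>method</emphasis> term.
</para>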
209</section>
210
211</chapter>
212
213<chapter id="ch2"><title>Specification</title>
214
215<section id="s2.1"><title>Overview</title>
216<para>
217All methods operate as a subprocess of a main controlling parent. Three file
218descriptors (FDs) are opened for use by the method, allowing two-way communication
219and emergency error reporting. The FDs correspond to the well-known Unix FDs
220stdin, stdout and stderr.
221</para>
222<para>
223Throughout the operation of the method, communication is done via HTTP-style
224plain text. Specifically, RFC-822 fields (as in the Packages file) are used to
225describe items, and a header beginning with a numeric code indicates what is happening. Each of
226these distinct communication messages should be sent quickly and without pause.
227</para>
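<para>
As a rough illustration of this message format, the Python sketch below parses
one message into its status code, informational tag and fields. The function
name <literal>parse_message</literal> is hypothetical, and the sketch assumes
one field per line with no continuation lines, which this document does not
spell out.
</para>
<programlisting>
```python
def parse_message(text):
    """Parse one message into (status code, tag, list of (field, value))."""
    lines = text.strip("\n").split("\n")
    code, _, tag = lines[0].partition(" ")
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("bad message header: " + repr(lines[0]))
    fields = []
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("bad field line: " + repr(line))
        # Fields are kept as an ordered list because a field may repeat.
        fields.append((name.strip(), value.strip()))
    return int(code), tag, fields


example = (
    "600 URI Acquire\n"
    "URI: http://foo.com/Packages.gz\n"
    "Filename: /bar/Packages.gz\n"
    "\n"
)
print(parse_message(example))
```
</programlisting>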
228<para>
229In some instances APT may pre-invoke a method to allow things like file URIs
230to determine how many files are available locally.
231</para>
232</section>
233
234<section id="s2.2"><title>Message Overview</title>
235<para>
236The first line of each message is called the message header. The first 3
237digits (called the Status Code) have the usual meaning found in the HTTP
238protocol. 1xx is informational, 2xx is successful and 4xx is failure. The 6xx
239series is used to specify things sent to the method. After the status code is
240an informational string provided for visual debugging.
241</para>
242<itemizedlist>
243<listitem>
244<para>
245100 Capabilities - Method capabilities
246</para>
247</listitem>
248<listitem>
249<para>
250101 Log - General Logging
251</para>
252</listitem>
253<listitem>
254<para>
255102 Status - Inter-URI status reporting (login progress)
256</para>
257</listitem>
258<listitem>
259<para>
260200 URI Start - URI is starting acquire
261</para>
262</listitem>
263<listitem>
264<para>
265201 URI Done - URI is finished acquire
266</para>
267</listitem>
268<listitem>
269<para>
270400 URI Failure - URI has failed to acquire
271</para>
272</listitem>
273<listitem>
274<para>
275401 General Failure - Method did not like something sent to it
276</para>
277</listitem>
278<listitem>
279<para>
280402 Authorization Required - Method requires authorization to access the URI.
281Authorization is User/Pass
282</para>
283</listitem>
284<listitem>
285<para>
286403 Media Failure - Method requires a media change
287</para>
288</listitem>
289<listitem>
290<para>
291600 URI Acquire - Request a URI be acquired
292</para>
293</listitem>
294<listitem>
295<para>
296601 Configuration - Sends the configuration space
297</para>
298</listitem>
299<listitem>
300<para>
301602 Authorization Credentials - Response to the 402 message
302</para>
303</listitem>
304<listitem>
305<para>
306603 Media Changed - Response to the 403 message
307</para>
308</listitem>
309</itemizedlist>
310<para>
311Only the 6xx series of status codes is sent TO the method. Furthermore the
312method may not emit status codes in the 6xx range. Codes 402 and 403
313require that the method continue reading all other 6xx codes until the proper
314602/603 code is received. This means the method must be capable of handling an
315unlimited number of 600 messages.
316</para>
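<para>
The requirement to keep reading while waiting for a 602 or 603 reply can be
sketched as follows. This is a minimal Python illustration under assumed
names: <literal>wait_for_reply</literal>, the tuple representation of messages
and the <literal>.deb</literal> file names are all hypothetical.
</para>
<programlisting>
```python
from collections import deque


def wait_for_reply(incoming, wanted_code, pending):
    """Consume messages until wanted_code arrives; queue everything else.

    incoming yields (code, fields) pairs as sent by APT, wanted_code is 602
    or 603, and pending collects the messages to be handled afterwards.
    """
    for code, fields in incoming:
        if code == wanted_code:
            return fields
        pending.append((code, fields))
    raise EOFError("input closed before the %d reply arrived" % wanted_code)


# After the method has sent 402 Authorization Required, two more acquire
# requests arrive before the credentials do.
incoming = iter([
    (600, {"URI": "ftp://localhost/debian/a.deb"}),
    (600, {"URI": "ftp://localhost/debian/b.deb"}),
    (602, {"Site": "localhost", "User": "jgg", "Password": "MooCow"}),
])
pending = deque()
credentials = wait_for_reply(incoming, 602, pending)
print(credentials["User"], len(pending))
```
</programlisting>
<para>
The queue is unbounded on purpose, since the number of 600 messages that may
arrive before the reply is not limited.
</para>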
317<para>
318The flow of messages starts with the method sending out a <emphasis>100
319Capabilities</emphasis> and APT sending out a <emphasis>601
320Configuration</emphasis>. After that APT begins sending <emphasis>600 URI
321Acquire</emphasis> and the method sends out <emphasis>200 URI Start</emphasis>,
322<emphasis>201 URI Done</emphasis> or <emphasis>400 URI Failure</emphasis>. No
323synchronization is performed; it is expected that APT will send <emphasis>600
324URI Acquire</emphasis> messages at any time and that the method should queue
325the messages. This allows methods like http to pipeline requests to the remote
326server. It should be noted, however, that APT will buffer messages, so it is not
327necessary for the method to be constantly ready to receive them.
328</para>
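<para>
The message flow just described might look roughly like the Python sketch
below, which drives a toy method over in-memory streams instead of real file
descriptors. Everything here is an assumption made for illustration: the helper
names, the <literal>Version</literal> value, the configuration item and the
<literal>acquire</literal> callback, which returns an error string or
<literal>None</literal>. It handles one request at a time and does not pipeline.
</para>
<programlisting>
```python
import io


def read_message(stream):
    """Return (header, [(field, value), ...]) for the next message, or None."""
    lines = []
    for raw in stream:
        line = raw.rstrip("\n")
        if line:
            lines.append(line)
        elif lines:
            break  # a blank line ends the message
    if not lines:
        return None
    fields = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        fields.append((name.strip(), value.strip()))
    return lines[0], fields


def send(stream, header, fields=()):
    """Write a header line, its fields and the terminating blank line."""
    stream.write(header + "\n")
    for name, value in fields:
        stream.write("%s: %s\n" % (name, value))
    stream.write("\n")


def run_method(stdin, stdout, acquire):
    """Announce capabilities, then answer each 600 URI Acquire in turn."""
    send(stdout, "100 Capabilities", [("Version", "1.0")])  # placeholder value
    while True:
        message = read_message(stdin)
        if message is None:
            return
        header, fields = message
        if header.startswith("601"):
            continue  # a real method would keep the Config-Item values
        if header.startswith("600"):
            request = dict(fields)
            uri = request["URI"]
            send(stdout, "200 URI Start", [("URI", uri)])
            error = acquire(uri, request.get("Filename"))
            if error is None:
                send(stdout, "201 URI Done", [("URI", uri)])
            else:
                send(stdout, "400 URI Failure",
                     [("URI", uri), ("Message", error)])


stdin = io.StringIO(
    "601 Configuration\nConfig-Item: Acquire::Retries=0\n\n"
    "600 URI Acquire\nURI: file:/var/mirrors/debian/Packages\n"
    "Filename: /tmp/Packages\n\n"
    "600 URI Acquire\nURI: file:/missing\nFilename: /tmp/missing\n\n"
)
stdout = io.StringIO()
run_method(stdin, stdout,
           lambda uri, dest: "File not found" if "missing" in uri else None)
print(stdout.getvalue())
```
</programlisting>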
329</section>
330
331<section id="s2.3"><title>Header Fields</title>
332<para>
333The following is a short index of the header fields that are supported:
334</para>
335<variablelist>
336<varlistentry>
337<term>URI</term>
338<listitem>
339<para>
340URI being described by the message
341</para>
342</listitem>
343</varlistentry>
344<varlistentry>
345<term>Filename</term>
346<listitem>
347<para>
348Location in the filesystem
349</para>
350</listitem>
351</varlistentry>
352<varlistentry>
353<term>Last-Modified</term>
354<listitem>
355<para>
356A time stamp in RFC1123 notation for use by IMS checks
357</para>
358</listitem>
359</varlistentry>
360<varlistentry>
361<term>IMS-Hit</term>
362<listitem>
363<para>
364The already existing item is valid
365</para>
366</listitem>
367</varlistentry>
368<varlistentry>
369<term>Size</term>
370<listitem>
371<para>
372Size of the file in bytes
373</para>
374</listitem>
375</varlistentry>
376<varlistentry>
377<term>Resume-Point</term>
378<listitem>
379<para>
380Offset in the file at which the transfer was started
381</para>
382</listitem>
383</varlistentry>
384<varlistentry>
385<term>MD5-Hash</term>
386<listitem>
387<para>
388Computed MD5 hash for the file
389</para>
390</listitem>
391</varlistentry>
392<varlistentry>
393<term>Message</term>
394<listitem>
395<para>
396String indicating some displayable message
397</para>
398</listitem>
399</varlistentry>
400<varlistentry>
401<term>Media</term>
402<listitem>
403<para>
404String indicating the media name required
405</para>
406</listitem>
407</varlistentry>
408<varlistentry>
409<term>Site</term>
410<listitem>
411<para>
412String indicating the site authorization is required for
413</para>
414</listitem>
415</varlistentry>
416<varlistentry>
417<term>User</term>
418<listitem>
419<para>
420Username for authorization
421</para>
422</listitem>
423</varlistentry>
424<varlistentry>
425<term>Password</term>
426<listitem>
427<para>
428Password for authorization
429</para>
430</listitem>
431</varlistentry>
432<varlistentry>
433<term>Fail</term>
434<listitem>
435<para>
436Operation failed
437</para>
438</listitem>
439</varlistentry>
440<varlistentry>
441<term>Drive</term>
442<listitem>
443<para>
444Drive the media should be placed in
445</para>
446</listitem>
447</varlistentry>
448<varlistentry>
449<term>Config-Item</term>
450<listitem>
451<para>
452A string of the form
453<replaceable>item</replaceable>=<replaceable>value</replaceable> derived from
454the APT configuration space. These may include method specific values and
455general values not related to the method. It is up to the method to filter out
456the ones it wants.
457</para>
458</listitem>
459</varlistentry>
460<varlistentry>
461<term>Single-Instance</term>
462<listitem>
463<para>
464Requires that only one instance of the method be run. This is a yes/no value.
465</para>
466</listitem>
467</varlistentry>
468<varlistentry>
469<term>Pipeline</term>
470<listitem>
471<para>
472The method is capable of pipelining.
473</para>
474</listitem>
475</varlistentry>
476<varlistentry>
477<term>Local</term>
478<listitem>
479<para>
480The method only returns Filename: fields.
481</para>
482</listitem>
483</varlistentry>
484<varlistentry>
485<term>Send-Config</term>
486<listitem>
487<para>
488Send configuration to the method.
489</para>
490</listitem>
491</varlistentry>
492<varlistentry>
493<term>Needs-Cleanup</term>
494<listitem>
495<para>
496The process is kept around while the files it returned are being used. This is
497primarily intended for CD-ROM and File URIs that need to unmount filesystems.
498</para>
499</listitem>
500</varlistentry>
501<varlistentry>
502<term>Version</term>
503<listitem>
504<para>
505Version string for the method
506</para>
507</listitem>
508</varlistentry>
509</variablelist>
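<para>
Since Config-Item values arrive as
<replaceable>item</replaceable>=<replaceable>value</replaceable> strings and the
method is expected to filter them itself, a method might process them along the
lines of this Python sketch. The helper names and the two sample items are
illustrative assumptions, and any escaping that APT may apply to the values is
not handled.
</para>
<programlisting>
```python
def parse_config_items(fields):
    """Collect Config-Item values into a dict of item -> value."""
    config = {}
    for name, value in fields:
        if name != "Config-Item":
            continue
        item, sep, setting = value.partition("=")
        if sep:
            config[item] = setting
    return config


def select_prefix(config, prefix):
    """Keep only the items under one prefix, e.g. a method's own subtree."""
    return {k: v for k, v in config.items() if k.startswith(prefix)}


fields = [
    ("Config-Item", "Acquire::http::Timeout=30"),
    ("Config-Item", "Dir::Cache=/var/cache/apt"),
]
config = parse_config_items(fields)
print(select_prefix(config, "Acquire::http::"))
```
</programlisting>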
510<para>
511This is a list of which headers each status code can use:
512</para>
513<variablelist>
514<varlistentry>
515<term>100 Capabilities</term>
516<listitem>
517<para>
518Displays the capabilities of the method. Methods should set the pipeline bit
519if their underlying protocol supports pipelining. The only known method that
520does support pipelining is http. Fields: Version, Single-Instance, Pre-Scan,
521Pipeline, Send-Config, Needs-Cleanup
522</para>
523</listitem>
524</varlistentry>
525<varlistentry>
526<term>101 Log</term>
527<listitem>
528<para>
529A log message may be printed to the screen if debugging is enabled. This is
530only for debugging the method. Fields: Message
531</para>
532</listitem>
533</varlistentry>
534<varlistentry>
535<term>102 Status</term>
536<listitem>
537<para>
538Message gives a progress indication for the method. It can be used to show
539pre-transfer status for Internet type methods. Fields: Message
540</para>
541</listitem>
542</varlistentry>
543<varlistentry>
544<term>200 URI Start</term>
545<listitem>
546<para>
547Indicates the URI is starting to be transferred. The URI is specified along
548with stats about the file itself. Fields: URI, Size, Last-Modified,
549Resume-Point
550</para>
551</listitem>
552</varlistentry>
553<varlistentry>
554<term>201 URI Done</term>
555<listitem>
556<para>
557Indicates that a URI has completed being transferred. It is possible to
558specify a <emphasis>201 URI Done</emphasis> without a <emphasis>URI
559Start</emphasis> which would mean no data was transferred but the file is now
560available. A Filename field is specified when the URI is directly available in
561the local pathname space. APT will either directly use that file or copy it
562into another location. It is possible to return Alt-* fields to indicate that
563another possibility for the URI has been found in the local pathname space.
564This is done if a decompressed version of a .gz file is found. Fields: URI,
565Size, Last-Modified, Filename, MD5-Hash
566</para>
567</listitem>
568</varlistentry>
569<varlistentry>
570<term>400 URI Failure</term>
571<listitem>
572<para>
573Indicates a fatal URI failure. The URI is not retrievable from this source. As
574with <emphasis>201 URI Done</emphasis>, <emphasis>200 URI Start</emphasis> is
575not required to precede this message. Fields: URI, Message
576</para>
577</listitem>
578</varlistentry>
579<varlistentry>
580<term>401 General Failure</term>
581<listitem>
582<para>
583Indicates that some unspecific failure has occurred and the method is unable
584to continue. The method should terminate after sending this message. It
585is intended to check for invalid configuration options or other severe
586conditions. Fields: Message
587</para>
588</listitem>
589</varlistentry>
590<varlistentry>
591<term>402 Authorization Required</term>
592<listitem>
593<para>
594The method requires a Username and Password pair to continue. After sending
595this message the method will expect APT to send a <emphasis>602 Authorization
596Credentials</emphasis> message with the required information. It is possible
597for a method to send this multiple times. Fields: Site
598</para>
599</listitem>
600</varlistentry>
601<varlistentry>
602<term>403 Media Failure</term>
603<listitem>
604<para>
605A method that deals with multiple media requires that a new media be
606inserted. The Media field contains the name of the media to be
607inserted. Fields: Media, Drive
608</para>
609</listitem>
610</varlistentry>
611<varlistentry>
612<term>600 URI Acquire</term>
613<listitem>
614<para>
615APT is requesting that a new URI be added to the acquire list. Last-Modified
616has the time stamp of the currently cached file, if applicable. Filename is the
617name of the file that the acquired URI should be written to. Fields: URI,
618Filename, Last-Modified
619</para>
620</listitem>
621</varlistentry>
622<varlistentry>
623<term>601 Configuration</term>
624<listitem>
625<para>
626APT is sending the configuration space to the method. A series of Config-Item
627fields will be part of this message, each containing an entry from the
628configuration space. Fields: Config-Item.
629</para>
630</listitem>
631</varlistentry>
632<varlistentry>
633<term>602 Authorization Credentials</term>
634<listitem>
635<para>
636This is sent in response to a <emphasis>402 Authorization Required</emphasis>
637message. It contains the entered username and password. Fields: Site, User,
638Password
639</para>
640</listitem>
641</varlistentry>
642<varlistentry>
643<term>603 Media Changed</term>
644<listitem>
645<para>
646This is sent in response to a <emphasis>403 Media Failure</emphasis>
647message. It indicates that the user has changed media and it is safe
648to proceed. Fields: Media, Fail
649</para>
650</listitem>
651</varlistentry>
652</variablelist>
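<para>
To make the field lists above concrete, the following Python sketch builds a
<emphasis>201 URI Done</emphasis> message for a file that is already on disk,
filling in Size and MD5-Hash. The helper names are hypothetical, and the
lowercase hexadecimal rendering of the hash is an assumption, since this
document does not specify the encoding.
</para>
<programlisting>
```python
import hashlib
import os
import tempfile


def format_message(header, fields):
    """Render a header line and its fields, ending with a blank line."""
    lines = [header]
    lines.extend("%s: %s" % (name, value) for name, value in fields)
    return "\n".join(lines) + "\n\n"


def uri_done(uri, filename):
    """Build a 201 URI Done message for a file already on disk."""
    digest = hashlib.md5()
    with open(filename, "rb") as handle:
        # Hash in chunks so large archive files need not fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return format_message("201 URI Done", [
        ("URI", uri),
        ("Filename", filename),
        ("Size", os.path.getsize(filename)),
        ("MD5-Hash", digest.hexdigest()),
    ])


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello\n")
message = uri_done("file:" + tmp.name, tmp.name)
print(message)
os.unlink(tmp.name)
```
</programlisting>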
653</section>
654
655<section id="s2.4"><title>Notes</title>
656<para>
657The methods supplied by the stock apt are:
658</para>
659<orderedlist numeration="arabic">
660<listitem>
661<para>
662cdrom - For Multi-Disc CD-ROMs
663</para>
664</listitem>
665<listitem>
666<para>
667copy - (internal) For copying files around the filesystem
668</para>
669</listitem>
670<listitem>
671<para>
672file - For local files
673</para>
674</listitem>
675<listitem>
676<para>
677gzip - (internal) For decompression
678</para>
679</listitem>
680<listitem>
681<para>
682http - For HTTP servers
683</para>
684</listitem>
685</orderedlist>
686<para>
687The two internal methods, copy and gzip, are used by the acquire code to
688parallelize and simplify the automatic decompression of package files as well as
689copying package files around the file system. Both methods can be seen to act
690the same except that one decompresses on the fly. APT uses them by generating
691a copy URI that is formed identically to a file URI. The destination file is
692sent as normal. The method then takes the file specified by the URI and writes
693it to the destination file. A typical set of operations may be:
694</para>
695<screen>
696http://foo.com/Packages.gz -&gt; /bar/Packages.gz
697gzip:/bar/Packages.gz -&gt; /bar/Packages.decomp
698rename Packages.decomp to /final/Packages
699</screen>
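<para>
The last two steps of that sequence can be approximated in a few lines of
Python. This is only a sketch of the idea, not the gzip method itself:
<literal>decompress_then_rename</literal> is a hypothetical helper, and it
assumes the scratch file and the final location are on the same filesystem so
that the rename is a single operation.
</para>
<programlisting>
```python
import gzip
import os
import shutil
import tempfile


def decompress_then_rename(compressed, final):
    """Decompress next to the input, then rename the result into place."""
    scratch = os.path.splitext(compressed)[0] + ".decomp"
    with gzip.open(compressed, "rb") as src, open(scratch, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Readers of the final path never see a partially written file.
    os.replace(scratch, final)
    return final


workdir = tempfile.mkdtemp()
packed = os.path.join(workdir, "Packages.gz")
with gzip.open(packed, "wb") as handle:
    handle.write(b"Package: apt\n")
result = decompress_then_rename(packed, os.path.join(workdir, "Packages"))
with open(result, "rb") as handle:
    content = handle.read()
print(content)
shutil.rmtree(workdir)
```
</programlisting>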
700<para>
701The http method implements a fully featured HTTP/1.1 client that supports
702deep pipelining and reget. It works best when coupled with an Apache 1.3
703server. The file method simply generates failure or success responses
704with the Filename field set to the proper location. The cdrom method acts
705the same except that it checks that the mount point has a valid CD-ROM in
706it. It does this by (effectively) computing an MD5 hash of 'ls -l' on the
707mount point.
708</para>
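<para>
The idea of identifying a disc by hashing a listing of its mount point can be
illustrated with the Python sketch below. It is not the computation the cdrom
method performs: the choice of entry names and sizes as input, and the name
<literal>listing_id</literal>, are assumptions made for the example.
</para>
<programlisting>
```python
import hashlib
import os
import shutil
import tempfile


def listing_id(mount_point):
    """Hash the sorted names and sizes of the entries under mount_point."""
    digest = hashlib.md5()
    for name in sorted(os.listdir(mount_point)):
        size = os.lstat(os.path.join(mount_point, name)).st_size
        entry = "%s %d\n" % (name, size)
        digest.update(entry.encode("utf-8", "surrogateescape"))
    return digest.hexdigest()


# Two different "discs" yield different identifiers; the same one is stable.
disc = tempfile.mkdtemp()
with open(os.path.join(disc, "README"), "wb") as handle:
    handle.write(b"Debian 2.0r1 Disk 1\n")
first_id = listing_id(disc)
same_id = listing_id(disc)
with open(os.path.join(disc, "extra"), "wb") as handle:
    handle.write(b"x")
changed_id = listing_id(disc)
print(first_id == same_id, first_id == changed_id)
shutil.rmtree(disc)
```
</programlisting>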
709</section>
710
711</chapter>
712
713</book>