implement POC client-side merging of pdiffs via apt-file
The idea of pdiffs is to avoid downloading the hole file by patching the
existing index. This works very well, but becomes slow if a lot of
patches needs to be applied to reconstruct an up-to-date index and in
recent years more and more dinstall (or similar) runs are executed
creating more and more pdiffs in the same amount of time, so pdiffs
became less useful.
The solution is simple: Reduce the amount of patches (which are very
small) which need to be applied on top of the index we have available
(which is usually pretty big).
This can be done in two ways: Either merge the patches on the
server-side so that the client has to download only one patch or the
patches are all downloaded and merged on the client-side.
The first needs a client who is doing one step at a time who can also
skip patches if it needs (APT supports this for a long time now).
The later is implemented by this commit, but depends on the server NOT
merging the patches and the patches being in a strict order in which no
patch is skipped.
This is traditionally the case for dak, but other repository creators
support merging – e.g. reprepro (which helpfully adds a flag indicating
that the patches are merged). To support both or even mixes a client
needs more information which isn't available for now.
This POC uses the external diffindex-rred included in apt-file to
do the heavy lifting of merging & applying all patches in one pass,
hence to test this feature apt-file needs to be installed.
Call xmllint with each vendor to check if any vendor specific errors are
present, but check the translations only with one vendor to check for
translation specifics – vendor and translation specific isn't possible.
Michael Vogt [Wed, 4 Dec 2013 07:41:23 +0000 (08:41 +0100)]
further refactor, extract GetReleaseForSourceRecord() out of FindSrc(), write out the selection notice to c1out to be consistent with the rest of the source
drop old /var/state to /var/lib transition artefacts
Regardless of when this transition was, it is so long ago that everyone
who would still need this has a million other problems to deal with now
so lets just drop this code.
generate apt-key script with vendor info about keys
The apt-key script uses quiet a few keyring files for operation which
are specific to the distribution it is build on and is hence one of the
most patched parts – even if it is not that often used anymore now that
a fragment directory for trusted.gpg exists.
With the net-update command a special keyring can be downloaded and
imported into apt, which must be signed by a master key. Its is
currently disabled because of security problems with it – and the only
known user before that was Ubuntu.
use a substvar to set the archive-keyring in debian/control
Adds a small helper to extract the small information bits we store in
apt-vendor.ent and uses it in debian/rules to set apt:keyring as a
substvar for debian/control populated with the &keyring-package; info
add a vendor specific file to have configurable entities
manpages sometimes refer to distro-specific things like the name of the
package providing the achive-keyring. Having a central place to
configure this helps in having it consistent in the manpages and allows
to load this info from other places in the buildsystem as well later.
Many derivatives make quiet a few simple changes to apt introducing
silly diffs just to change examples and co making it harder for
them to update apt and harder for us to merge real changes back.
It was enabled for a (long) while in Ubuntu, but it shouldn't hurt to
enable it in Debian as well – especially now that Debian has automatic
analyses of the buildlogs which don't work that well without the 'noise'
As testcases are running really fast it can happen that files which are
changed in reality are considered unchanged as the modify time isn't
changed. What we could do is disable those caches by default, but some
tests actually depend on those and deriving too much from the default by
default (pun intended) is not a good idea for tests after all.
webserver: use pthreads to handle multiple clients
Clients like browsers prefer to open many connections and keep them open
for a while, so that pages with lot of subelements would take a while to
load (if at all), by using threads as all servers do some way or another
we can resolve this. libapt is not intended to be pthread-safe and stuff
like the storage of the last return code doesn't make too much sense if
multiple clients interact with us, but it is good enough for now and an
other interesting (mis)use of libapt in general.
webserver: spurious newline after data confuses curl
Webserver wrongly sends an additional newline after the data which
causes curl to believe that the next request on this socket has no
header data and so includes all headers in the data output.
Calling truncate on /dev/null can happen by the download methods if they
are instructed to download a file to /dev/null (as testcases are only
interested in the status code, but do not support HEAD requests yet)
So just ignore truncate calls on the /dev/null file as it is always
empty anyway, so truncating to zero isn't a problem.