A lot of code deals with iterating over packages and checking for
specific states. At the moment these are all handcrafted inplace, but
that makes sharing common code which just differs in the states it
checks rather difficult and is error prune. Having an API to construct
arbitrary complex filters will come in handy for those.
The methods itself deal with the helper a lot, so it makes sense to move
them to the helper itself, which helps also if we want to override some
of these methods, the FromString mentioned in the bugreport being the
obvious example.
VCI is spared from this change for now as while it would fit with the
same reasoning it much heavier entangled with the previous
CacheSetHelper change, so moving it now would mean breaking the API.
The PCI change is worthwhile on its own though as it is used by VCI.
The introduction of Fnmatch showed that each new selector would require
multiple new virtual methods in the CacheSetHelper to work correctly,
which isn't that great. We now flip to a single virtual method which
handles all cases separated by an enum – as new enum values can be added
without an ABI break.
Great care was taken to make old code work with the new way of organisation,
which means in return that you might be bombarded with deprecation
warnings now if you don't adapt, but code should still compile and work
as before as can be seen in apt itself with this commit.
The comment above their definition marks them already as such, so this
is only a formalisation of the deprecation and fixes the occurances we
have in our own code together with removing a magic number.
fix progress output for (dist-)upgrade calculation
Previously, we had a start and a done of the calculation printed by
higher-level code, but this got intermixed by progress reporting from an
external solver or the output of autoremove code…
The higherlevel code is now only responsible for instantiating a
progress object of its choosing (if it wants progress after all) and the
rest will be handled by the upgrade code. Either it is used to show the
progress of the external solver or the internal solver will give some
hints about its overall progress. The later isn't really a proper
progress as it will jump forward after each substep, but that is at
least a bit better than before without any progress indication.
Fixes also the 'strange' non-display of this progress line in -q=1, while
all others are shown, which is reflected by all testcase changes.
Turns out that version numbers aren't as random as you might guess.
In my cache for example, I have:
Total package names: 69513 (1390 k)
Total package structures: 188259 (9036 k)
Total distinct versions: 186345 (13.4 M)
Total dependencies: 2052242 (57.5 M)
which amounts to 1035873 (10,1 M) strings.
Reusing version strings reduces this to 161465 (3.479 k).
This comes at a cost of course: Generation is slightly slower, but we
are still faster than what we started with and it makes room (also cache
size wise) for further changes.
drop stored StringItems in favor of in-memory mappings
Strings like Section names or architectures are needed vary often.
Instead of writing them each time we need them, we deploy sharing for
these special strings. Until now, this was done with a linked list of
strings in which we would search, which was stored in the cache.
It turns out we can do this just as well in memory as well with a bunch
of std::map's.
In memory means here that it isn't available anymore if we have a partly
invalid cache, but that isn't much of a problem in practice as the
status file is compared to the other files we parse very small and includes
mostly duplicates, so the space we would gain by storing is more or less
equal to the size of the stored linked list…
So far, only the few strings stored in stringitems were counted, but
many more strings are directly inserted into the cache. We account for
this now by identifying all these different strings and measure their
length. We are still not at the correct size of the cache in 'stats'
this way, but we are now again a bit closer.
packages in the cache are sorted by name so noise-free
Commit aa0fe657e46b87cc692895a36df12e8b74bb27bb sorts the package names
in the hashtable. We make use of this already in these functions, but as
a minor sideeffect it also means that we don't have 'noise' anymore
between packages belonging to the same group. We therefore don't need to
check for a matching name in Grp.FindPkg anymore.
Package names have to be lowercase (debian-policy §5.6.1) and in as
lowlevel as these method are it would be quiet strange to treat an
invalid package "suddently" as a valid one which other tools might or
might not accept. If case-insensitivity is really needed the frontend
should ensure this rather than these methods waste cpu cycles by
default.
They both store the same information, so this field just takes up space
in the Package struct for no good reason. We mark it "just" as deprecated
instead of instantly removing it though as it isn't misleading like
Section was and is potentially used in the wild more often.
Michael Vogt [Fri, 26 Sep 2014 18:59:31 +0000 (20:59 +0200)]
Do not allow going from authenticated to unauthenticated repo
Also rework the way we load the Release file, so it only after
Release.gpg verified the Release file. The rational is that we
never want to load untrusted data into our parsers. Only stuff
verified with gpg or by its hashes get loaded. To load untrusted
data you now need to use apt-get update --allow-unauthenticated.
Michael Vogt [Fri, 26 Sep 2014 16:13:48 +0000 (18:13 +0200)]
Do not download Packages/Sources files on I-M-S hit of the Release file
With this branch we know that the data in the lists directory is always
what the release file says, so if the Release file is unchanged, then
there is no need to queue the download of the other indexfiles as they
will be unchanged too (or broken :)
Michael Vogt [Thu, 25 Sep 2014 09:49:16 +0000 (11:49 +0200)]
Revert making pkgAcquire::Item::DescURI() "const"
Revert because its a API change and the gain does not justify the
extra work to make the required changes in the consumers of this
interface at this point.
Michael Vogt [Wed, 24 Sep 2014 14:22:05 +0000 (16:22 +0200)]
Drop Privileges to "Debian-apt" in most acquire methods
Add a new "Debian-apt" user that owns the /var/lib/apt/lists
and /var/cache/apt/archive directories. The methods
http, https, ftp, gpgv, gzip switch to this user when they
start.
Thanks to Julian and "ioerror" and tors "switch_id()" code.
Michael Vogt [Sun, 21 Sep 2014 19:40:10 +0000 (21:40 +0200)]
Ensure that iTFRewritePackageOrder is "MD5sum" to match apt-ftparchive
The iTFRewritePackageOrder is used in indexcopy to copy and normalize
cdrom Packages files. This change will ensure that there is no
"normalization" that changes MD5sum -> MD5Sum which alters the hash
of the Packages file on disk (oh the irony).
Michael Vogt [Sun, 21 Sep 2014 19:23:04 +0000 (21:23 +0200)]
Fix regression for cdrom: sources from latest security update
Skip a reverify for cdrom: sources. The reverify step is actually
harmful here because the apt-cdrom add code uses the indexcopy.cc
which will "normalize" the Packages file from the cdrom when it
writes it to the local disk. This leads to changing the "MD5sum"
field (notice the lower case "s") on the cdrom Packages file to
a "MD5Sum" field on the local file in /var/lib/apt/lists. Which
of course alters the hash and makes apt fail to reverify the file.
Michael Vogt [Fri, 19 Sep 2014 14:41:55 +0000 (16:41 +0200)]
Fix regression when copy: is used for a relative path
When we do a ReverifyAfterIMS() we use the copy: method to
verify the hashes again. If the user uses -o Dir=./something/relative
this fails because we use the URI class in copy.cc that strips
away the leading relative part. By not using URI this is fixed.
Michael Vogt [Wed, 17 Sep 2014 12:57:05 +0000 (14:57 +0200)]
Fix regression for file:/// uris from CVE-2014-0487
Do not run ReverifyAfterIMS() for local file URIs as this will
causes apt to mess around in the file:/// uri space. This is
wrong in itself, but it will also cause a incorrect verification
failure when the archive and the lists directory are on different
partitions as rename().
Michael Vogt [Tue, 16 Sep 2014 18:23:43 +0000 (20:23 +0200)]
SECURITY UPDATE for CVE-2014-{0488,0487,0489}
incorrect invalidating of unauthenticated data (CVE-2014-0488)
incorect verification of 304 reply (CVE-2014-0487)
incorrect verification of Acquire::Gzip indexes (CVE-2014-0489)
Builds, runs and generates everything needed to have a coverage report
at the end for apt. The report isn't perfect as most childs apt forks do
not have a regular exit and so data is never written for them, which
results in e.g. most methods to have zero coverage reported.
Most pagers are nice and default to running non-interactively if they
aren't connected to a terminal and we relied on that. On ci.debian.net
the configured pager is printing a header out of nowhere though, so if
we are printing to a non-terminal we call "cat" instead.
In the rework we also "remove" the dependency on sensible-utils in sofar
as we call some alternatives if calling the utils fail.
This seems to be the last problem preventing a "PASS" status on
ci.debian.net, so we close the associated bugreport.
rework PTY magic to fix stair-stepping on kfreebsd
A pty slave we have got from openpty can only be used for one dpkg
child, if we give it to a second child on kfreebsd setting TIOCSCTTY
fails causing the output to be stair-stepped from now on.
By switching the code to creating a master and opening a new slave in
the child for each child we can fix this glitch, so that at least the
master remains stable.
APT treats upgrades like installs and dpkg is very similar in this, but
prints still a slightly different processing message indicating that it
is really an upgrade which we hadn't parsed so far, but this wasn't
really visible as we quickly moved on to a 'known' state.
More problematic was the reinstall case as apt hadn't recognized this
for the package name detection, so that reinstalls had no progress since
we introduced MultiArch.
Commit cbcdd3ee9d86379d1b3a44e41ae8b17dc23111d0 removes the space at the
end of the debfile name dpkg send to us and we previously had included
in the pmerror message we printed on the statusfd.
Instead of trying to inspect /proc and the fds inside we use "test -t 1"
instead as this is available and working on kfreebsd as well – not that
something breaks if we wouldn't, but we like color.
Using 'kfreebsd' here makes the test fail on a kfreebsd system
(obviously), so we just use something totally madeup in the hope that
this is less like to conflict in the future.
No reason in and of by itself at the moment, but prepares for the goal
of having 'apt search' and 'apt-cache search' using the same code now
that they at least support the same stuff. The 'apt' code is just a
multitude slower at the moment…
The method already deals with a format string, but had an else path
doing a hardcoded format as well. This is changed now to use the same
code for both - the format in the second case is still fixed though.
Michael Vogt [Fri, 5 Sep 2014 10:50:15 +0000 (12:50 +0200)]
Ensure we have a Policy in CacheFile.BuildDepCache()
This partly reverts d059cc2 and fixes bug #753297 in a more
general way by ensuring that CacheFile.BuildDepCache() builds
a pkgPolicy if there isn't one already.
Michael Vogt [Fri, 5 Sep 2014 10:03:28 +0000 (12:03 +0200)]
Fix incorrect upgradable listing in "apt list" (thanks to Michael Musenbrock)
The "apt list" command was using only the pkgDepCache but not the
pkgPolicy to figure out if a package is upgradable. This lead to
incorrect display of upgradable package when the user used the
policy to pin-down packages. Thanks to Michael Musenbrock for the
initial patch.
Make Packages & Sources generation optional, during Generate call
refactor a bit, extract code out of Generate() into
DoGenerate{PackagesAndSources,Contents}, add new
APT::FTPArchive::ContentsOnly option to allow skipping the generation
of Package/Source files (if they are generated e.g. by some db outside
of apt-ftparchives control)
Michael Vogt [Tue, 2 Sep 2014 15:06:52 +0000 (17:06 +0200)]
Use heap to allocate PatternMatch to avoid potential stack overflow
When apt-cache search with many args (> 130) is given the allocation
of PatternMatch on the stack may fail resulting in a segmentation
fault. By using the heap the max size is much bigger and we also
get a bad_alloc expection instead of a segfault (which we can catch
*if* this ever becomes a pratical problem). No test for the crash
as its not reproducable with the MALLOC_ settings in framework.
Michael Vogt [Tue, 2 Sep 2014 15:24:24 +0000 (17:24 +0200)]
* apt-pkg/deb/dpkgpm.cc:
- update string matching for dpkg I/O errors. (LP: #1363257)
- properly parse the dpkg status line so that package name is properly set
and an apport report is created. Thanks to Anders Kaseorg for the patch.
(LP: #1353171)