We uses a small trick to implement the fallback: We make it so, that
by-hash is a special compression algorithm and apt already knows how to
deal with fallback between compression algorithms.
The drawback with implementing this fallback is that a) we are guessing
again and more importantly b) by-hash is only tried for the first
compression algorithm we want to acquire, not for all as before – but
flipping between by-hash and well-known for each compression algorithm
seems to be not really worth it as it seems unlikely that there will
actually be mirrors who only mirror a subset of compressioned files, but
have by-hash enabled.
The user-experience is the usual fallback one: You see "Ign" lines in
the apt update output. The fallback is implemented as a transition
feature, so a (potentially huge) mirror network doesn't need a flagday.
It is not meant as a "someday we might" or "we don't, but some of our
mirrors might" option – we want to cut down on the 'Ign' lines front so
that they become meaningful – if we wanted to spam everyone with them, we
could enable by-hash by default for all repositories…
sources.list and config options are better suited for this.
add by-hash sources.list option and document all of by-hash
This changes the semantics of the option (which is renamed too) to be a
yes/no value with the special additional value "force" as this allows
by-hash to be disabled even if the repository indicates it would be
supported and is more in line with our other yes/no options like pdiff
which disable themselves if no support can be detected.
The feature wasn't documented so far and hasn't reached a (un)stable
release yet, so changing it without trying too hard to keep
compatibility seems okay.
Not all tests work yet, most notable the cdrom tests, but those require
changes in libapt itself to have a proper fix and what we have fixed so
far is good enough progress for now.
deal with spaces in path, command and filepaths in apt-key
Filenames we get could include spaces, but also the tmpdir we work in
and the failures we print in return a very generic and unhelpful…
Properly supporting spaces is a bit painful as we constructed gpg
command before, which is now moved to (multilevel) calls to temporary
scripts instead.
This is mostly a small speedup for the testcases, but it is also handy
to document which tests actually deal with a specific hash compared to
those which 'just' need some hash which can be important while adding
new hashes.
do not report deprecate warnings for the None declaration
This is defined for compatibility, warning about it is intended, but
only in places where it is actually used, rather than at the place we
declare it for compatability…
Setting CXXFLAGS like --coverage on the commandline fails if we set the
std too late, so if we set it with the compiler name we set it always
first. A bit hacky as it bends the expectation, but seems to work.
tests: use more 'native' instead of 'amd64' if possible
The tests usually run on amd64 boxes, but once in a while I run it on a
(slow) armel box as well, which has its fair share of problems with some
tests, but at least the low hanging fruits can be dealt with: Do not
assume that amd64 is the native dpkg architecture – instead use whatever
dpkg thinks is native as architecture for the test.
avoid using global PendingError to avoid failing too often too soon
Our error reporting is historically grown into some kind of mess.
A while ago I implemented stacking for the global error which is used in
this commit now to wrap calls to functions which do not report (all)
errors via return, so that only failures in those calls cause a failure
to propergate down the chain rather than failing if anything
(potentially totally unrelated) has failed at some point in the past.
This way we can avoid stopping the entire acquire process just because a
single source produced an error for example. It also means that after
the acquire process the cache is generated – even if the acquire
process had failures – as we still have the old good data around we can and
should generate a cache for (again).
There are probably more instances of this hiding, but all these looked
like the easiest to work with and fix with reasonable (aka net-positive)
effects.
include debug information in the autoremove-kernels file
Figuring out after the fact what went wrong in the kernel hook is kinda
hart, also as the bugreports are usually very lacking on the details
front. Collecting the internal variables in the debug output we attach
to the generated file might help shine some light on the matter.
It's at least not going to hurt…
do not discard new manual-bits while applying EDSP solutions
In private-install.cc we call MarkInstall with FromUser=true, which sets
the bit accordingly, but while applying the EDSP solution we call mark
install on all packages with FromUser=false, so MarkInstall believes
this install is an automatic one and sets it to auto – so that a new package
which is explicitely installed via an external solver is marked as auto
and is hence also up for garbage collection in a following call.
Ideally MarkInstall wouldn't reset it, but the detection is hard to do
without regressing in other cases – and ideally ideally MarkInstall
wouldn't deal with the autobit at all – so we work around this on the
calling side for now.
implement autobit and pinning in EDSP solver 'apt'
The parser creates a preferences as well as an extended states file
based on the EDSP scenario file, which isn't the most efficient way of
dealing with this as thes text files have to be parsed again by another
layer of the code, but it needs the least changes and works good enough
for now. The 'apt' solver is in the end just a test solver like dump.
These assumptions were once true, but they aren't anymore, so what is
supposed to be a speed up is effectively a slowdown [not that it would
be noticible].
Usage of SingleArchFindPkg was nuked in a stable update already as the
included assumption was actually harmful btw, which is why we should get
right of other 'non-harmful' but still untrue assumptions while we can.
select kernels to protect from autoremove based on Debian version
This is basically a rewrite of the script with the general idea of
finding the Debian version of the installed kernels – as multiple
flavours will have the same Debian version – select the two newest of
them and translate them back to versions found in package names.
This way we avoid e.g. kernel and kernel-rt to use up the protected
slots even through they are basically the same kernel (just a different
flavour) so it is likely that if kernel doesn't work for some reason,
kernel-rt will not either.
This also deals with foreign kernel packages, kernels on hold and partly
installed kernels (in case multiple kernels are installed in the same
apt run) in a hopefully sensible way.
copy ReadWrite-error to the bottom to make clang happy
clang detects that fd isn't set in the ReadWrite case – just that this
is supposed to be catched earlier in this method already, but it doesn't
hurt to make it explicit here as well and clang is happy, too.
As said in the bugreport, this is hardly a serious problem on a security
front, but it was always on the list to have the filename configurable
somehow and the stable filename is a problem for parallel executions.
Using an environment variable (APT_EDSP_DUMP_FILENAME) for this is more
or less the best we can do here as solvers do not get told about our
configuration and such.
Pipes and such have no good Size value, but we still want to copy from
it maybe and we don't really need size as we can just as well read as
long as we get data out of a file to copy it.
The syntax of "Source" is different in EDSP compared to the the field of
the same name in 'the rest' of Debian, so documented this accordingly
and send the version as a new field.
implement dpkgs vision of interpreting pkg:<arch> dependencies
How the Multi-Arch field and pkg:<arch> dependencies interact was
discussed at DebConf15 in the "MultiArch BoF". dpkg and apt (among other
tools like dose) had a different interpretation in certain scenarios
which we resolved by agreeing on dpkg view – and this commit realizes
this agreement in code.
As was the case so far libapt sticks to the idea of trying to hide
MultiArch as much as possible from individual frontends and instead
translates it to good old SingleArch. There are certainly situations
which can be improved in frontends if they know that MultiArch is upon
them, but these are improvements – not necessary changes needed
to unbreak a frontend.
The implementation idea is simple: If we parse a dependency on foo:amd64
the dependency is formed on a package 'foo:amd64' of arch 'any'. This
package is provided by package 'foo' of arch 'amd64', but not by 'foo'
of arch 'i386'. Both of those foo packages provide each other through
(assuming foo is M-A:foreign) to allow a dependency on 'foo' to be
satisfied by either foo of amd64 or i386. Packages can also declare to
provide 'foo:amd64' which is translated to providing 'foo:amd64:any' as
well.
This indirection over provides was chosen as the alternative would be to
teach dependency resolvers how to deal with architecture specific
dependencies – which violates the design idea of avoiding resolver
changes, especially as architecture-specific dependencies are a
cornercase with quite a few subtil rules. Handling it all over versioned
provides as we already did for M-A in general seems much simpler as it
just works for them.
This switch to :any has actually a "surprising" benefit as well: Even
frontends showing a package name via .Name() [which doesn't show the
architecture] will display the "architecture" for dependencies in which
it was explicitely requested, while we will not show the 'strange' :any
arch in FullName(true) [= pretty-print] either. Before you had to
specialcase these and by default you wouldn't get these details shown.
The only identifiable disadvantage is that this complicates error
reporting and handling. apt-get's ShowBroken has existing problems with
virtual packages [it just shows the name without any reason], so that
has to be worked on eventually. The other case is that detecting if a
package is completely unknown or if it was at least referenced somewhere
needs to acount for this "split" – not that it makes a practical
difference which error is shown… but its one of the improvements
possible.
M-A: allowed pkgs of unconfigured archs do not statisfy :any
We parse all architectures we encounter recently, which means we also
parse packages from architectures which are neither native nor foreign,
but still came onto the system somehow (usually via heavy force).
store ':any' pseudo-packages with 'any' as architecture
Previously we had python:any:amd64, python:any:i386, … in the cache and
the dependencies of an amd64 package would be on python:any:amd64, of an
i386 on python:any:i386 and so on. That seems like a relatively
pointless endeavor given that they will all be provided by the same
packages and therefore also a waste of space.
Michael Vogt [Fri, 4 Sep 2015 21:29:38 +0000 (23:29 +0200)]
Add support for writing by-hash dirs in apt-ftparchive
This option is enabled via the APT::FTPArchive::DoByHash switch.
It will also honor the option APT::FTPArchive::By-Hash-Keep that
controls how many previous generation of by-hash files should be
kept (defaults to 3).
Merged from https://github.com/mvo5/apt/tree/feature/apt-ftparchive-by-hash
Initializing a random number generator with the time since epoch could
be good enough, but reaches its limits in test code as the 100
iterations might very well happen in the same second and hence the seed
number is always the same… clock() has a way lower resolution so it
changes more often and not unimportant: If many users start the update
at the same time it isn't to unlikely the SRV record will be ordered in
the same second choosing the same for them all, but it seems less likely
that the exact same clock() time has passed for them.
And if I have to touch this, lets change a few other things as well to
make me and/or compilers a bit happier (clang complained about the usage
of a GNU extension in the testcase for example).
use unusable-for-security hashes for integrity checks
We want to declare some hashes as not enough for security, so that a
user will need --allow-unauthenticated or similar to get data secured
only by those hashes, but we can still us these hashes for integrity
checks if we got them.
test: show the highlevel test for lowerranking ones
testsuccess checks the return code, but it does also some autotests
based on the command like grepping for dpkg warnings in a apt-get
install call – but if this finds something it is just showing the grep
command. With this change it will additionally show the first msgtest
which in this case will detail the actual apt-get install call.
tests: store msgtest in -q mode for display in msgfail
Not-quiet output is very verbose and with our growing array of tests
generates many many lines which e.g. kills the log display in travis-ci
and obscures failures and uncatched output in a wall of details.
The -q mode fixed this by callapsing passed tests to a single P and now
with some rework we can even get failures properly displayed with the
message from msgtest.
do delay the test for http, too, to make it more reliable
The file method was already slowed down and somehow I thought I had done
the same for http, but it turns out that I didn't. Giving it the same
delay as file should help in making this test slower and therefore more
likely to successfully test what it is supposed to test.
if file is inaccessible for _apt, disable privilege drop in acquire
We had a very similar method previously for our own private usage, but
with some generalisation we can move this check into the acquire system
proper so that all frontends profit from this compatibility change.
As we are disabling a security feature here a warning is issued and
frontends are advised to consider reworking their download logic if
possible.
Note that this is implemented as an all or nothing situation: We can't
just (not) drop privileges for a subset of the files in a fetcher, so in
case you have to download some files with and some without you need to
use two fetchers.
ignore for _apt inaccessible TMPDIR in pkgAcqChangelog
Using libpam-tmpdir caused us to create our download tmp directory in
root's private tmp before changing to _apt, which wouldn't have access
to it.
By extending our GetTempDir method with an optional wrapper changing the
effective user, we can test if a given user can access the directory and
ignore TMPDIR if not instead of ignoring TMPDIR completely.
Multiple targets downloading the same file is bad™ as it leads us to all
sorts of problems like the acquire system breaking or simply a problem
of which settings to use for them. Beside that this is most likely a
mistake and silently ignoring it doesn't help the user realizing his
mistake…
On the other hand, we have 'duplicates' which are 'created' by how we
create indextargets, so we have to prevent those from being created to
but do not emit a warning for them as this is an implementation detail.
And then, there is the absolute and most likely user mistake: Having the
same target(s) activated in multiple entries.
xz has pretty much won "the compressor war" and e.g. the Debian archive
doesn't even distribute bz2 anymore in favor of 'xz' and 'gz', so by
changing the default order we have a more realistic --print-uris
behavior as it will always show the first compressor.
In practice this effects repositories without a Release file (very bad,
we don't want to support them anymore anyhow) as xz will be tried before
bz2 now [which is probably not available, but so might be bz2…] AND
repositories which provide both, bz2 and xz (which isn't too common) in
sofar as apt will now download xz instead of bz2.
Users with special needs can stick with bz2 as first compressor tried
with Acquire::CompressionTypes::Order:: "bz2"; (see man apt.conf) – but
users with special needs usually prefer "gz" anyhow, so the realworld
change is expected to be very low.
Some targets like Contents-udeb are special-needs targets. Shipping the
configuration snippet for them is okay, but they shouldn't be downloaded
by default. Forcing the user to enable targets by uncommenting targets
is wrong and this would still not really solve the problem completely as
even if you want to download some -udebs it will probably not be for all
sources you have enabled, so having the possibility of disabling a
target by default, but giving the user the option to enable it on a
per-source entry basis is better.
use c++11 algorithms to avoid strange compiler warnings
Nobody knows what makes the 'unable to optimize loop' warning to appear
in the sourceslist minus-options parsing, especially if we use a foreach
loop, but we can replace it with some nice c++11 algorithm+lambda usage,
which also helps in making even clearer what happens here.
And as this would be a lonely change, lets do it for a few more loops as
well where I might or might not have seen the warning at some point in
time, too.
Some additional files like 'Contents' are very big and should therefore
kept compressed on the disk, which apt-file did in the past. It also
implemented pdiff patching of these files by un- and recompressing these
files on-the-fly, with this commit we can do the same – but we can do
this in both pdiff patching styles (client and server merging) and
secured by hashes.
Hashes are in so far slightly complicated as we can't compare the hashes
of the compressed files as we might compress them differently than the
server would (different compressor versions, options, …), so we must
compare the hashes of the uncompressed content.
While this commit has changes in public headers, the classes it changes
are marked as hidden, so nobody can use them directly, which means the
ABI break is internal only.
auto-prefix $(SITE) for indextargets Description field
This updates the documentation for a change which actually happened in c2a4a8dded2dfb56dbcab9689b6cb4b96c9999b6 already. The acquire system
expects the $(SITE) to be there (e.g. for mirror rewriting) so we are
better of prefixing it automatically than giving frontends the chance to
forget it. There is no point in not showing $(SITE) first anyway.
Disabling pdiffs can be useful occasionally, like if you have a fast
local mirror where the download doesn't matter, but still want to use it
for non-local mirrors. Also, some users might prefer it to only use it
for very big indextargets like Contents.
This could allow an attacker to mark a package as installed in a
remote package index, as long as the package was not listed in
the dpkg status file.
This way, an attacker could force the installation of a package
during a dist-upgrade, by providing two packages in an index,
an older marked as installed, and a newer - apt would "upgrade"
to the newer version.
We dup() the file descriptor when opening compressed files, so we
always need to close the dup()ed one. Furthermore, not unsetting
the d-pointer causes issues when running OpenDescriptor() multiple
times on the same file descriptor.
cacheset: Prefer the depcache over the policy again
By preferring the policy over the depcache, we ignore any changes
we made in the depcache, which makes it impossible for code to
change the candidate used here.
allow explicit dis/enable of IndexTargets in sources options
While Target{,-Add,-Remove} is available for configuring IndexTargets
already, allow Targets to be mentioned explicitely as yes/no options as
well, so that the Target 'Contents' can be disabled via 'Contents: no'
as well as 'Target-Remove: Contents'.
use always priv-dropping for changelog download as root
First of, the temporary directory we download the changelog to needs to
be owned by _apt, but that also means that we don't need to check if we
could/should drop privs as the download happens to a dedicated tempdir
and only after that it is moved to its final location by a privileged user.
For many usecases like the acquire system libapt-pkg actually needs
tools and config found in the apt package. apt tends to be installed
everywhere libapt-pkg appears usually anyhow, but just in case to nudge
users (and tools) in the right direction.
Note that this isn't and shouldn't be a hard depends as there are
usecases working perfectly without 'apt' and as this is such an esoteric
problem incurring the costs arising from a Depends-Breaks-loop isn't
deemd as worth it.
The parameter name suggests that it should forbid the building of the
entire cache in memory, but this isn't how it was previously and as
AllowMem is false by default it actually prevents previous usecases from
working like being root and configuring apt to build no caches at all.
This should be fixed at some point to actually work, but that is hard to
pull off as it means switching the default and some callers (including
apt itself) actually did call it explicitly with false in certain
cases for no apparent reason (at least now where it is common to have
enough memory to throw at every problem and even if not is a slow apt
usally better than an apt erroring out).
Fetched() was reported for mostly nothing, while we should be calling it
for files worked with from non-local sources (e.g. http, but not file or
xz). Previously this was called from an acquire item, but got moved to
the acquire worker instead to avoid having it (re)implemented in all
items, but the checks were faulty.
We deal with Conflicts in SmartUnpack in pretty much the same way, but
Breaks weren't handled in SmartConfigure so that the remove was sheduled
after the configuration of the package breaking the to-be-removed.
After fixing Bug#796999, we noticed that there were
some more instances of iterators which had no associated
Dynamic object, causing them to not be updated when
the cache was remapped.
This happened in two places: In NewPackage() and in
NewProvidesAllArch().
pkgcachegen: Account for remapping when parsing depends from NewPackage
In both the Ver and Dep variables, we need to account for remapping,
as otherwise we would still reference the old bug.
Reproduction environment:
* An i386 system with amd64 foreign architecture
* A sources.list with
deb http://snapshot.debian.org/archive/debian/20150826T102846Z/ unstable main
deb http://snapshot.debian.org/archive/debian/20150826T102846Z/ experimental main
Thanks: Jakub Wilk for the bug report and the backtraces Closes: #796999