improve partial/ cleanup in abort and failure cases
Especially pdiff-enhanced downloads have the tendency to fail for
various reasons from which we can recover and even a successful download
used to leave the old unpatched index in partial/.
By adding a new method responsible for making the transaction of an
individual file happen we can at specialisations especially for abort
cases to deal with the cleanup.
This also helps in keeping the compressed indexes around if another
index failed instead of keeping the decompressed files, which we
wouldn't pick up in the next call.
remove "first package seen is native package" assumption
The fix for #777760 causes packages of foreign (and the native)
architectures, to be created correctly, but invalidates (like the
previously existing, but policy-forbidden architecture-less packages
we had to support for some upgrade scenarios) the assumption that the
first (and only) package in the cache for a single architecture system
must be the package for the native architecture (as, where should the
other architectures come from, right? Wrong.).
Depending on the order of parsing sources more or less packages can be
effected by this. The effects are strange (for apt it mostly effects
simulation/debug output, but also apt-mark on these specific packages),
which complicates debugging, but relatively harmless if understood as
most actions do not need direct named access to packages.
The problem is fixed by removing the single-arch special casing in the
paths who had them (Cache.FindPkg), so they use the same code as
multi-arch systems, which use them as a wrapper for Grp.FindPkg.
Note that single-arch system code was using Grp.FindPkg before as well
if a Grp structure was handily available, so we don't introduce new
untested code here: We remove more brittle special cases which are less
tested instead (this was planed to be done for Stretch anyhow).
Note further that the method with the assumption itself isn't fixed. As
it is a private method I opted for declaring it deprecated instead and
remove all its call positions. As it is private no-one can call this
method legally (thanks to how c++ works by default its still an exported
symbol through) and fixing it basically means reimplementing code we
already have in Grp.FindPkg.
Removing rather than fixing seems hence like a good solution.
a hit on Release files means the indexes will be hits too
If we get a IMSHit for the Transaction-Manager (= the InRelease file or
as its still supported fallback Release + Release.gpg combo) we can
assume that every file we would queue based on this manager, but already
have locally is current and hence would get an IMSHit, too. We therefore
save us and the server the trouble and skip the queuing in this case.
Beside speeding up repetative executions of 'apt-get update' this way we
also avoid hitting hashsum errors if the indexes are in fact already
updated, but the Release file isn't yet as it is the case on well
behaving mirrors as Release files is updated last.
The implementation is a bit harder than the theory makes it sound as we
still have to keep reverifying the Release files (e.g. to detect now expired
once to avoid an attacker being able to silently stale us) and have to
handle cases in which the Release file hits, but some indexes aren't
present (e.g. user added a new foreign architecture).
refactor calculation of final lists/ name from URI
Calculating the final name of an item which it will have after
everything is done and verified successfully is suprisingly complicated
as while they all follow a simple pattern, the URI and where it is
stored varies between the items.
With some (abibreaking) redesign we can abstract this similar to how it
is already down for the partial file location.
Checking Valid-Until on an unsigned Release file doesn't give us any
security brownie points as an attacker could just change the date and in
practice repositories with unsigned Release files will very likely not
have a Valid-Until date, but for symetry and the fact that being
unsigned is currently just a warning, while expired is a fatal error.
ensure lists/ files have correct permissions after apt-cdrom add
Its a bit unpredictable which permissons and owners we will encounter on
a CD-ROM (or a USB stick, as apt-cdrom is responsible for those too),
so we have to ensure in this codepath as well that everything is nicely
setup without waiting for a 'apt-get update' to fix up the (potential)
mess.
We do this in HTTP already to give the CPU some exercise while the disk
is heavily spinning (or flashing?) to store the data avoiding the need
to reread the entire file again later on to calculate the hashes – which
happens outside of the eyes of progress reporting, so you might ended up
with a bunch of https workers 'stuck' at 100% while they were busy
calculating hashes.
This is a bummer for everyone using apt as a connection speedtest as the
https method works slower now (not really, it just isn't reporting done
too early anymore).
Methods get told which hashes are expected by the acquire system, which
means we can use this list to restrict what we calculate in the methods
as any extra we are calculating is wasted effort as we can't compare it
with anything anyway.
Adding support for a new hash algorithm is therefore 'free' now and if a
algorithm is no longer provided in a repository for a file, we
automatically stop calculating it.
In practice this results in a speed-up in Debian as we don't have SHA512
here (so far), so we practically stop calculating it.
Servers who advertise that they close the connection get the 'Closes'
encoding flag, but this conflicts with servers who response with a
transfer-encoding (e.g. encoding) as it is saved in the same flag.
We have a better flag for the keep-alive (or not) of the connection
anyway, so we check this instead of the encoding.
This is in practice not much of a problem as real servers we talk to are
HTTP1.1 servers (with keep-alive) and there isn't much point in doing
chunked encoding if you are going to close anyway, but our simple
testserver stumbles over this if pressed and its a bit cleaner, too.
send Alt-* info for uncompressed based on any compressions
file sends information about the uncompressed file if it can find it as
well as for the compressed file. This was done only for gzip so far, but
we support more compression types. That this information isn't used a
lot is a different story.
The worker expects that the methods tell him when they start or finish
downloading a file. Various information pieces are passed along in this
report including the (expected) filesize. https was using a "global"
struct for reporting which made it 'reuse' incorrect values in some
cases like a non-existent InRelease fallbacking to Release{,.gpg}
resulting in a size-mismatch warning. Reducing the scope and redesigning
the setting of the values we can fix this and related issues.
Closes: 777565, 781509
Thanks: Robert Edmonds and Anders Kaseorg for initial patchs
We just need it for unit tests and our debian/rules file actually skips
calling them if nocheck is given… but this fails anyhow as we declared a
hard-dependency on it. Demoting the error to a warning in configuration
and adding a test in the 'make test' path with a friendly message allows
nocheck to be useful again.
(Running unit tests is fully encouraged of course, but bootstrappers and
co do not need to be burdened with this stuff)
parse specific-arch dependencies correctly on single-arch systems
On single-arch the parsing was creating groupnames like 'apt:amd64' even
through it should be 'apt' and a package in it belonging to architecture
amd64. The result for foreign architectures was as expected: The
dependency isn't satisfiable, but for native architecture it means the
wrong package (ala apt:amd64:amd64) is linked so this is also not
satisfiable, which is very much not expected.
No longer excluding single-arch from this codepath allows the generation
of the correct links, which still link to non-exisiting packages for
foreign dependencies, but natives link to the expected native package
just as if no architecture was given.
For negative arch-specific dependencies ala Conflicts this matter was
worse as apt will believe there isn't a Conflict to resolve, tricking it
into calculating a solution dpkg will refuse.
Architecture specific positive dependencies are rare in jessie – the
only one in amd64 main is foreign –, negative dependencies do not even
exist. Neither class has a native specimen, so no package in jessie is
effected by this bug, but it might be interesting for stretch upgrades.
This also means the regression potential is very low.
In #780028 we were discussing how the or-group order should be more
important than keep-back decisions of 'upgrade'. We have this behaviour,
but to ensure it stays this way lets add a test for it.
Working with strings c-style is complicated and error-prune,
so by converting to c++ style we gain some simplicity and
avoid buffer overflows by later extensions.
keyids in "apt-key del" should be case-insensitive
gnupg is case-insensitive about keyids, so back then apt-key called it
directly any keyid was accepted, but now that we work more with the
keyid ourself we regressed to require uppercase keyids by accident.
This is also inconsistent with other apt-key commands which still use
gnupg directly. A single case-insensitive grep and we are fine again.
demote VectorizeString gcc attribute from const to pure
g++-5 generates a slightly broken libapt which doesn't split
architecture configurations correctly resulting in e.g. Packages files
requested for the bogus architecture 'amd64,i386' instead of for amd64
and i386.
The reason is an incorrectly applied attribute marking the function as
const, while functions with pointer arguments are not allowed to be
declared as such (note that char& is a char* in disguise). Demoting the
attribute to pure fixes this issue – better would be dropping the & from
char but that is an API change…
Neither earlier g++ versions nor clang use this attribute to generate
broken code, so we don't need a rebuild of dependencies or anything and
g++-5 isn't even included in jessie, but the effect is so strange and
apt popular enough to consider avoiding this problem anyhow.
Michael Vogt [Tue, 7 Apr 2015 10:20:56 +0000 (12:20 +0200)]
fix crash in order writing in pkgDPkgPM::WriteApportReport()
libapt can be configured to write various bits of information to a file
creating a report via apport. This is disabled by default in Debian and
apport residing only in /experimental so far, but Ubuntu and other
derivatives have this (in some versions) enabled by default and there is
no regression potentially here.
The crash is caused by a mismatch of operations vs. strings for
operations, so adding the missing strings for these operations solves
the problem.
avoid depends on std::string implementation for pkgAcquire::Item::Mode
In /experimental this is resolved by deprecating Mode and moving to a
new std::string, but that breaks ABI of course, so that was out of
question. We can't change to a malloc/free style c-string either as
Mode is public and hence a library user could be setting this as well.
std::string implementors actually helped us out here with copy-on-write
which means that while the variable "obviously" runs out of scope here,
in reality you get the correct result as the string we work with here
comes from the configuration in which it is still valid. Such a
dependency on magic is bad of course, but its still interesting that
only python3 seems to have an issue with it…
With some silly explicit if-else assigning we can sidestep this issue
while retaining the same output for 99.99% of all users (= noone
actually configures additional compression algorithms which are also
provided by repositories…), but even for these 0.01% its just a small
change in the display as Mode can not be used for anything else.
Example: apt/aptitude uses it in its 'update' implementations in the
one-line progress at the bottom for specific items.
The worker expects that the methods tell him when they start or finish
downloading a file. Various information pieces are passed along in this report
including the (expected) filesize. https is using a "global" struct for
reporting which made it 'reuse' incorrect values in some cases like a
non-existent InRelease fallbacking to Release{,.gpg} resulting in an incorrect
size-mismatch warning scaring and desensitizing users as well as being subject
to a race between the write_data and progress callbacks generating incorrect
progress reporting and potentially the same error message.
Other branches as well as the bugreports contain 'better' fixes making the
struct local and other sensible changes, but are larger as a result, so in
this version we opted for short diff with minimal effect above else instead.
Closes: 777565, 781509
Thanks: Robert Edmonds and Anders Kaseorg for initial patchs
Jérémy Bobbio [Tue, 10 Mar 2015 09:09:44 +0000 (10:09 +0100)]
stop displaying time of build in online help
As part of the “reproducible builds” effort [1], we have noticed that
apt could not be built reproducibly.
One issue is that it uses the __DATE__ and __TIME__ macros of the C
preprocessor to display the time of build in the online help. We believe
this information not to be really useful to users as they can always
look at the package data and metadata to figure it out.
The attached patch simply removes this information. All
non-documentation packages can then be built reproducibly with our
current experimental framework.
[David: changed the string slightly to be untranslateable as well]
We use test{success,failure} now all over the place in the framework, so
its only consequencial to do this in the situations in which we test for
a specific output as well.
Helmut Grohne [Mon, 9 Mar 2015 17:11:10 +0000 (18:11 +0100)]
parse arch-qualified Provides correctly
The underlying problem is that libapt-pkg does not correctly parse these
provides. Internally, it creates a version named "baz:i386" with
architecture amd64. Of course, such a package name is invalid and thus
this version is completely inaccessible. Thus, this bug should not cause
apt to accept a broken situation as valid. Nevertheless, it prevents
using architecture qualified depends.
This way the 'correct' version is carried over into the po files to
reflect which version they were built for rather than the version before
the current one.
(error) Same iterator is used with different containers
cppcheck reports this error, its not really a problem for us as the API
can actually deal with it via implicit conversion, but being explicit
can't hurt and the less reported errors the better.
properly implement pkgRecord::Parser for *.deb files
Implementing FileName() works for most cases for us, but other
frontends might need more and even for us its not very stable as
the normal Jump() implementation is pretty bad on a deb file and
produce errors on its own at times.
So, replacing this makeshift with a complete implementation by
mostly just shuffling code around.
Bug #778375 uncovered that https wasn't properly integrated in the class
family tree of http as it was supposed to be leading to a NULL pointer
dereference. Fixing this 'properly' was deemed to much diff for
practically no gain that late in the release, so commit 0c2dc43d4fe1d026650b5e2920a021557f9534a6 just fixed the synptom, while
this commit here is fixing the cause plus adding a test.
The implementation was needlessly complex and flawed through preventing
positive dependencies from gaining points like they did before these
commits making library transitions harder instead of simpler. It worked
out anyhow most of the time out of pure 'luck' (and other ways of
gaining points) or got miss attributed to being a temporary hick-up.
Your mileage may vary, but don't worry: There is more than one way to
do it, but our one size fits all is not a bigger hammer, but an entire
roundhouse kick! So brace yourself for the tl;dr: The limit is gone.*
Beware: This fixes also the problem that a double newline is
unconditionally added 'later' which is an overcommitment in case
the dsc filesize is limit-2 <= x <= limit.
* limited to numbers fitting into an unsigned long long.
Michael Vogt [Tue, 6 Jan 2015 09:54:24 +0000 (10:54 +0100)]
Add regression test for the previous commit
The issue was that https.cc never called URIStart(), one way to
detect this is that no download progress is generated without
this call. The test now checks for this and as a side-effect will
also ensure that we do not break download progress reporting and
Acquire::{http,https}::Dl-Limit accidently.
Michael Vogt [Mon, 5 Jan 2015 09:27:53 +0000 (10:27 +0100)]
Fix missing URIStart() for https downloads
Add a explicit ReceivedData to HttpsMethod that indicates when
we got data from the connection so that we can send URISTart()
to the parent.
This is needed because URIStart got moved in f9b4f12d from
the progress_callback to write_data() and it only checks for
Res.Size. In the old code if progress_callback is called by
libcurl (and sets Res.Size) before write_data is called then
URIStart() is never send. Making this a explicit ReceivedData
variable fixes this issue.
This results in an extra image package being removed from
APT::NeverAutoRemove, losing the intended effect of keeping the {current,
previous, latest} set of images installed.
Requiring a “.” in the package name tightens the matched package names
to those that are installing a specific version of the image, thus
eliding the meta-packages.
pass-through stdin fd instead of content if not a terminal
Commit 299aea924ccef428219ed6f1a026c122678429e6 fixes the problem of
not logging terminal in case stdin & stdout are not a terminal. The
problem is that we are then trying to pass-through stdin content by
reading from the apt-process stdin and writing it to the stdin of the
child (dpkg), which works great for users who can control themselves,
but pipes and co are a bit less forgiving causing us to pass everything
to the first child process, which if the sending part of the pipe is
e.g. 'yes' we will never see the end of it (as the pipe is full at some
point and further writing blocks).
There is a simple solution for that of course: If stdin isn't a terminal,
we us the apt-process stdin as stdin for the child directly (We don't do
this if it is a terminal to be able to save the typed input in the log).
always run 'dpkg --configure -a' at the end of our dpkg callings
dpkg checks now for dependencies before running triggers, so that
packages can now end up in trigger states (especially those we are not
touching at all with our calls) after apt is done running.
The solution to this is trivial: Just tell dpkg to configure everything
after we have (supposely) configured everything already. In the worst
case this means dpkg will have to run a bunch of triggers, usually it
will just do nothing though.
The code to make this happen was already available, so we just flip a
config option here to cause it to be run. This way we can keep
pretending that triggers are an implementation detail of dpkg.
--triggers-only would supposely work as well, but --configure is more
robust in regards to future changes to dpkg and something we will
hopefully make use of in future versions anyway (as it was planed at the
time this and related options were implemented).
Note that dpkg currently has a workaround implemented to allow upgrades
to jessie to be clean, so that the test works before and after. Also
note that test (compared to the one in the bug) drops the await test as
its is considered a loop by dpkg now.
If we have no controlling terminal opening a terminal will make this
terminal our controller, which is a serious problem if this happens to
be the pseudo terminal we created to run dpkg in as we will close this
terminal at the end hanging ourself up in the process…
The offending open is the one we do to have at least one slave fd open
all the time, but for good measure, we apply the flag also to the slave
fd opening in the child process as we set the controlling terminal
explicitely here.
This is a regression from 150bdc9ca5d656f9fba94d37c5f4f183b02bd746 with
the slight twist that this usecase was silently broken before in that it
wasn't logging the output in term.log (as a pseudo terminal wasn't
created).
Real webservers (like apache) actually send an error page with a 416
response, but our client didn't expect it leaving the page on the socket
to be parsed as response for the next request (http) or as file content
(https), which isn't what we want at all… Symptom is a "Bad header line"
as html usually doesn't parse that well to an http-header.
This manifests itself e.g. if we have a complete file (or larger) in
partial/ which isn't discarded by If-Range as the server doesn't support
it (or it is just newer, think: mirror rotation).
It is a sort-of regression of 78c72d0ce22e00b194251445aae306df357d5c1a,
which removed the filesize - 1 trick, but this had its own problems…
To properly test this our webserver gains the ability to reply with
transfer-encoding: chunked as most real webservers will use it to send
the dynamically generated error pages.
(The tests and their binary helpers had to be slightly modified to
apply, but the patch to fix the issue itself is unchanged.)
If we have no controlling terminal opening a terminal will make this
terminal our controller, which is a serious problem if this happens to
be the pseudo terminal we created to run dpkg in as we will close this
terminal at the end hanging ourself up in the process…
The offending open is the one we do to have at least one slave fd open
all the time, but for good measure, we apply the flag also to the slave
fd opening in the child process as we set the controlling terminal
explicitely here.
This is a regression from 150bdc9ca5d656f9fba94d37c5f4f183b02bd746 with
the slight twist that this usecase was silently broken before in that it
wasn't logging the output in term.log (as a pseudo terminal wasn't
created).
Real webservers (like apache) actually send an error page with a 416
response, but our client didn't expect it leaving the page on the socket
to be parsed as response for the next request (http) or as file content
(https), which isn't what we want at all… Symptom is a "Bad header line"
as html usually doesn't parse that well to an http-header.
This manifests itself e.g. if we have a complete file (or larger) in
partial/ which isn't discarded by If-Range as the server doesn't support
it (or it is just newer, think: mirror rotation).
It is a sort-of regression of 78c72d0ce22e00b194251445aae306df357d5c1a,
which removed the filesize - 1 trick, but this had its own problems…
To properly test this our webserver gains the ability to reply with
transfer-encoding: chunked as most real webservers will use it to send
the dynamically generated error pages.
always run 'dpkg --configure -a' at the end of our dpkg callings
dpkg checks now for dependencies before running triggers, so that
packages can now end up in trigger states (especially those we are not
touching at all with our calls) after apt is done running.
The solution to this is trivial: Just tell dpkg to configure everything
after we have (supposely) configured everything already. In the worst
case this means dpkg will have to run a bunch of triggers, usually it
will just do nothing though.
The code to make this happen was already available, so we just flip a
config option here to cause it to be run. This way we can keep
pretending that triggers are an implementation detail of dpkg.
--triggers-only would supposely work as well, but --configure is more
robust in regards to future changes to dpkg and something we will
hopefully make use of in future versions anyway (as it was planed at the
time this and related options were implemented).
correct architecture detection for 'rc' packages for purge
We were already considering these cases, but the code was flawed, so
that packages changing architectures are incorrectly handled and hence
the wrong architecture is used to call dpkg with, so that dpkg says the
package isn't installed (which it isn't for the requested architecture).
properly handle already reinstall pkgs in ordering
The bugreport itself describes the case of the ordering code detecting a
loop where none is present, but the testcase finds also cases in which
there is actually a loop and we fail to realize it. --reinstall can be
considered an interactive command through and it usually doesn't
encounter such "hard" problems (= looping essentials), so this is less
serious than it sounds at first.
James McCoy [Fri, 28 Nov 2014 13:21:06 +0000 (14:21 +0100)]
support long keyids in "apt-key del" instead of ignoring them
apt-key given a long keyid reports just "OK" all the time, but doesn't
delete the mentioned key as it doesn't find the key.
Note: In debian/experimental this was closed with 29f1b977100aeb6d6ebd38923eeb7a623e264ffe which just added the testcase
as the rewrite of apt-key had fixed this as well.
We run dpkg on its own pty, so we can log its output and have our own
output around it (like the progress bar), while also allowing debconf
and configfile prompts to happen.
In commit 223ae57d468fdcac451209a095047a07a5698212 we changed to
constantly reopening the slave for kfreebsd. This has the sideeffect
though that in some cases slave and master will lose their connection on
linux, so that no output is passed along anymore. We fix this by having
always an fd referencing the slave open (linux), but we don't use it
(kfreebsd).
Failing to get our PTY up and running has many (bad) consequences
including (not limited to, nor all at ones or in any case) garbled ouput,
no output, no logging, a (partial) mixture of the previous items, …
This commit is therefore also reshuffling quiet a bit of the creation
code to get especially the output part up and running on linux and the
logging for kfreebsd.
Note that the testcase tries to cover some cases, but this is an
interactivity issue so only interactive usage can really be a good test.
Only "recent" versions of dpkg support stdin for merge instead of a
file, so as a quick fix we delay calling it until we really need it
which fixes most of the problem already.
Checking for a specific dpkg version here is deemed too much work, just
like using a temporary file here and depends a too high requirement for
this minor usecase. After all, it didn't work at all before, so we break
nobody here and can fix it if someone complains (with a patch).
We run dpkg on its own pty, so we can log its output and have our own
output around it (like the progress bar), while also allowing debconf
and configfile prompts to happen.
In commit 223ae57d468fdcac451209a095047a07a5698212 we changed to
constantly reopening the slave for kfreebsd. This has the sideeffect
though that in some cases slave and master will lose their connection on
linux, so that no output is passed along anymore. We fix this by having
always an fd referencing the slave open (linux), but we don't use it
(kfreebsd).
Failing to get our PTY up and running has many (bad) consequences
including (not limited to, nor all at ones or in any case) garbled ouput,
no output, no logging, a (partial) mixture of the previous items, …
This commit is therefore also reshuffling quiet a bit of the creation
code to get especially the output part up and running on linux and the
logging for kfreebsd.
Note that the testcase tries to cover some cases, but this is an
interactivity issue so only interactive usage can really be a good test.
While on linux files are created in /tmp with $USER:$USER, on my
kfreebsd testmachine they are created with $USER:root, so we pull some
strings here to make it work on both.
create our cache and lib directory always with mode 755
We autocreate for a while now the last two directories in /var/lib/apt/lists
(similar for /var/cache/apt/archives) which is very nice for systems having
any of those on tmpfs or other non-persistent storage. This also means
though that this creation is effected by the default umask, so for
people with aggressive umasks like 027 the directories will be created
with 750, which means all non-root users are left out, which is usually
exactly what we want then this umask is set, but the cache and lib
directories contain public knowledge. There isn't any need to protect
them from viewers and they render apt completely useless if not
readable.
Usually they don't provide a lot in terms of what they test, but they
help in covering many lines from strictly anecdotal commands (stats,
moo) and error messages, so that stuff which really needs to be tested,
but isn't is better visible in coverage reports.
reenable support for -s (and co) in apt-get source
The conversion to accept only relevant options for commands has
forgotten another one, so adding it again even through the usecase might
very well be equally good served by --print-uris.
A version belongs to a section and has hence a section member of its
own. A package on the other hand can have multiple versions from
different sections. This was "solved" by using the section which was
parsed first as order of sources.list defines, but that is obviously a
horribly unpredictable thing.
Users are way better of with the Section() as returned by the version
they are dealing with. It is likely the same for all versions of a
package, but in the few cases it isn't, it is important (like packages
moving from main/* to contrib/* or into oldlibs …).
Backport of 7a66977 which actually instantly removes the member.
Collect all hashes we can get from the source record and put them into a
HashStringList so that 'apt-get source' can use it instead of using
always the MD5sum.
We therefore also deprecate the MD5 struct member in favor of the list.
While at it, the parsing of the Files is enhanced so that records which
miss "Files" (aka MD5 checksums) are still searched for other checksums
as they include just as much data, just not with a nice and catchy name.
This is a cherry-pick of 1262d35 with some dirty tricks to preserve ABI.
APT supports more than just one HashString and even allows to enforce
the usage of a specific hash. This class is intended to help with
storage and passing around of the HashStrings.
The cherry-pick here the un-const-ification of HashType() compared to f4c3850ea335545e297504941dc8c7a8f1c83358. The point of this commit is
adding infrastructure for the next one. All by itself, it just adds new
symbols.
Do the same with less code in apt-get. This especially ensures that the
lock file (and the parent directories) exist before we are trying to
lock. It also means that clean now creates the directories if they are
missing so we returned to a proper clean state now.