edsp: try to read responses even if writing failed
If a solver/planner exits before apt is done writing we will generate
write errors. Solvers like 'dump' can be pretty quick in failing but
produce a valid EDSP error report apt should read, parse and display
instead of just discarding even through we had write errors.
if the FileFd failed already following calls should fail, too
There is no point in trying to perform Write/Read on a FileFd which
already failed as they aren't going to work as expected, so we should
make sure that they fail early on and hard.
Simulations are frequently run by unprivileged users which naturally
don't have the permissions to write to the default location for the eipp
file. Either way is bad as running in simulation mode doesn't mean we
don't want to run the logging (as EIPP runs the same regardless of
simulation or 'real' run), but showing the warnings is relatively
pointless in the default setup, so, in case we would produce errors and
perform a simulation we will discard the warnings and carry on.
Running apt with an external planner wouldn't have generated these
messages btw.
If another file in the transaction fails and hence dooms the transaction
we can end in a situation in which a -patched file (= rred writes the
result of the patching to it) remains in the partial/ directory.
The next apt call will perform the rred patching again and write its
result again to the -patched file, but instead of starting with an empty
file as intended it will override the content previously in the file
which has the same result if the new content happens to be longer than
the old content, but if it isn't parts of the old content remain in the
file which will pass verification as the new content written to it
matches the hashes and if the entire transaction passes the file will be
moved the lists/ directory where it might or might not trigger errors
depending on if the old content which remained forms a valid file
together with the new content.
This has no real security implications as no untrusted data is involved:
The old content consists of a base file which passed verification and a
bunch of patches which all passed multiple verifications as well, so the
old content isn't controllable by an attacker and the new one isn't
either (as the new content alone passes verification). So the best an
attacker can do is letting the user run into the same issue as in the
report.
The rewrite in 742f67eaede80d2f9b3631d8697ebd63b8f95427 is based on the
assumption that the pipeline will always be at least one item short each
time it is called, but the logs in #832113 suggest that this isn't
always the case. I fail to see how at the moment, but the old
implementation had this behavior, so restoring it can't really hurt, can
it?
We read the entire input file we want to patch anyhow, so we can also
calculate the hash for that file and compare it with what he had
expected it to be.
Note that this isn't really a security improvement as a) the file we
patch is trusted & b) if the input is incorrect, the result will hardly be
matching, so this is just for failing slightly earlier with a more
relevant error message (althrough, in terms of rred its ignored and
complete download attempt instead).
The flush call is a no-op in most FileFd implementations so this isn't
as critical as it might sound as the only non-trivial implementation is
in the buffered writer, which tends not be used to buffer another
buffer…
APT doesn't know which packages will be triggered in the course of
actions, so it can't plan to see them for progress beforehand, but if it
sees that dpkg says that a package was triggered we can add additional
states. This is pretty much magic – after all it sets back the progress
– and there are cornercases in which this will result in incorrect
totals (package in partial states may or may not loose trigger states),
but the worst which can happen is that the progress is slightly
incorrect and doesn't reach 100%, but so be it. Better than being stuck
at 100% for a while as apt isn't realizing that a bunch of triggers
still need to be processed.
The progress reporting for a package sheduled for purging only included
the states dpkg passes through while actually purging the package – if
the package was fully installed before dpkg will pass first through all
remove states before purging it, so in the interest of consistent
reporting our progress reporting should do that, too.
create non-existent files in edit-sources with 644 instead of 640
If the sources file we want to edit doesn't exist yet GetLock will
create it with 640, which for a generic lockfile might be okay, but as
this is a sources file more relaxed permissions are in order – and
actually required as it wont be readable for unprivileged users causing
warnings/errors in apt calls.
report warnings&errors consistently in edit-sources
After editing the sources it is a good idea to (re)built the caches as
they will be out-of-date and doing so helps in reporting higherlevel
errors like duplicates sources.list entries, too, instead of just
general parsing errors as before.
The tests changes the sources.list and the modification time of this
file is considered while figuring out if the cache can be good. Usually
this isn't an issue, but in that case we have the cache generation
produce warnings which appear twice in this case.
clean up default-stanzas from extended_states on write
The existing cleanup was happening only for packages which had a status
change (install -> uninstalled) which is the most frequent but no the
only case – you can e.g. set autobits explicitly with apt-mark.
This would leave stanzas in the states file declaring a package to be
manually installed – which is the default value for a package not listed
at all, so we can just as well drop it from the file.
We support installing ./foo.deb (and ./foo.dsc for source) for a while
now, but it can be a bit clunky to work with those directly if you e.g.
build packages locally in a 'central' build-area.
The changes files also include hashsums and can be signed, so this can
also be considered an enhancement in terms of security as a user "just"
has to verify the signature on the changes file then rather than
checking all deb files individually in these manual installation
procedures.
allow arch=all to override No-Support-for-Architecture-all
If a user explicitly requests the download of arch:all apt shouldn't get
in the way and perform its detection dance if arch:all packages are
(also) in arch:any files or not.
This e.g. allows setting arch=all on a source with such a field (or one
which doesn't support all at all, but has the arch:all files like Debian
itself ATM) to get only the arch:all packages from there instead of
behaving like a no-op.
don't hardcode /var/lib/dpkg/status as dir::state::status
Theoretically it should be enough to change the Dir setting and have apt
pick the dpkg/status file from that. Also, it should be consistently
effected by RootDir. Both wasn't really the case through, so a user had
to explicitly set it too (or ignore it and have or not have expected
sideeffects caused by it).
This commit tries to guess better the location of the dpkg/status file
by setting dir::state::status to a naive "../dpkg/status", just that
this setting would be interpreted as relative to the CWD and not
relative to the dir::state directory. Also, the status file isn't really
relative to the state files apt has in /var/lib/apt/ as evident if we
consider that apt/ could be a symlink to someplace else and "../dpkg"
not effected by it, so what we do here is an explicit replace on apt/
– similar to how we create directories if it ends in apt/ – with dpkg/.
As this is a change it has the potential to cause regressions in so far
as the dpkg/status file of the "host" system is no longer used if you
set a "chroot" system via the Dir setting – but that tends to be
intended and causes people to painfully figure out that they had to set
this explicitly before, so that it now works more in terms of how the
other Dir settings work (aka "as expected"). If using the host status
file is really intended it is in fact easier to set this explicitely
compared to setting the new "magic" location explicitely.
Very unlikely, but if the parent is /dev/null, the child empty and the
grandchild a value we returned /dev/null/value which doesn't exist, so
hardly a problem, but for best operability we should be consistent in
our work and return /dev/null always.
tests: activate dpkg multi-arch even if test is single arch
Most tests are either multiarch, do not care for the specific
architecture or do not interact with dpkg, so really effect by this is
only test-external-installation-planner-protocol, but its a general
issue that while APT can be told to treat any architecture as native
dpkg has the native architecture hardcoded so if we run tests we must
make sure that dpkg knows about the architecture we will treat as
"native" in apt as otherwise dpkg will refuse to install packages from
such an architecture.
keep trying with next if connection to a SRV host failed
Instead of only trying the first host we get via SRV, we try them all as
we are supposed to and if that isn't working we try to connect to the
host itself as if we hadn't seen any SRV records. This was already the
intend of the old code, but it failed to hide earlier problems for the
next call, which would unconditionally fail then resulting in an all
around failure to connect. With proper stacking we can also keep the
error messages of each call around (and in the order tried) so if the
entire connection fails we can report all the things we have tried while
we discard the entire stack if something works out in the end.
report all instead of first error up the acquire chain
If we don't give a specific error to report up it is likely that all
error currently in the error stack are equally important, so reporting
just one could turn out to be confusing e.g. if name resolution failed
in a SRV record list.
don't change owner/perms/times through file:// symlinks
If we have files in partial/ from a previous invocation or similar such
those could be symlinks created by file:// sources. The code is
expecting only real files through and happily changes owner,
modification times and permission on the file the symlink points to
which tend to be files we have no business in touching in this way.
Permissions of symlinks shouldn't be changed, changing owner is usually
pointless to, but just to be sure we pick the easy way out and use
lchown, check for symlinks before chmod/utimes.
tests: disable EIPP logging in test-compressed-indexes
The test makes heavy use of disabling compression types which are
usually available some way or another like xz which is how the EIPP
logs are compressed by default. Instead of changing this test to change
the filename according to the compression we want to test we just
disable EIPP logging for this test as that is easier and has the same
practical effect.
give a descriptive error for pipe tries with 'false'
If libapt has builtin support for a compression type it will create a
dummy compressor struct with the Binary set to 'false' as it will catch
these before using the generic pipe implementation which uses the
Binary. The catching happens based on configured Names through, so you
can actually force apt to use the external binaries even if it would
usually use the builtin support. That logic fails through if you don't
happen to have these external binaries installed as it will fallback to
calling 'false', which will end in confusing 'Write error's.
So, this is again something you only encounter in constructed testing.
don't add default compressors two times if disabled
This is in so far pointless as the first match will deal with the
extension, so we don't actually ever use these second instances –
probably for the better as most need arguments to behave as epected &
more importantly: the point of the exercise disabling their use for
testing proposes.
use the right key for compressor configuration dump
The generated dump output is incorrect in sofar as it uses the name as
the key for this compressor, but they don't need to be equal as is the
case if you force some of the inbuilt ones to be disabled as our testing
framework does it at times.
This is hidden from changelog as nobody will actually notice while
describing it in a few words make it sound like an important change…
It can be handy to set apt options for the testcases which shouldn't be
accidentally committed like external planner testing or workarounds for
local setups.
use +0000 instead of UTC by default as timezone in output
All apt versions support numeric as well as 3-character timezones just
fine and its actually hard to write code which doesn't "accidently"
accepts it. So why change? Documenting the Date/Valid-Until fields in
the Release file is easy to do in terms of referencing the
datetime format used e.g. in the Debian changelogs (policy §4.4). This
format specifies only the numeric timezones through, not the nowadays
obsolete 3-character ones, so in the interest of least surprise we should
use the same format even through it carries a small risk of regression
in other clients (which encounter repositories created with
apt-ftparchive).
In case it is really regressing in practice, the hidden option
-o APT::FTPArchive::Release::NumericTimezone=0
can be used to go back to good old UTC as timezone.
The EDSP and EIPP protocols use this 'new' format, the text interface
used to communicate with the acquire methods does not for compatibility
reasons even if none of our methods would be effected and I doubt any
other would (in these instances the timezone is 'GMT' as that is what
HTTP/1.1 requires). Note that this is only true for apt talking to
methods, (libapt-based) methods talking to apt will respond with the
'new' format. It is therefore strongly adviced to support both also in
method input.
Debian isn't using 'update' anymore for years and the command is in
direct conflict with our goal of not requiring gnupg anymore, so it
is high time to officially declare this command as deprecated.
warn if apt-key is used in scripts/its output parsed
apt-key needs gnupg for most of its operations, but depending on it
isn't very efficient as apt-key is hardly used by users – and scripts
shouldn't use it to begin with as it is just a silly wrapper. To draw
more attention on the fact that e.g. 'apt-key add' should not be used in
favor of "just" dropping a keyring file into the trusted.gpg.d
directory this commit implements the display of warnings.
There is no real point in having two commands which roughly do the same
thing, especially if the difference is just in the display of the
fingerprint and hence security sensitive information.
As the volatile sources are parsed last they were sorted behind the
dpkg/status file and hence are treated as a downgrade, which isn't
really what you want to happen as from a user POV its an upgrade.
If we have a (e.g. locally built) deb file installed and do try to
install it again apt complained about this being a downgrade, but it
wasn't as it is the very same version… it was just confused into not
merging the versions together which looks like a downgrade then.
The same size assumption is usually good, but given that volatile files
are parsed last (even after the status file) the base assumption no
longer holds, but is easy to adept without actually changing anything in
practice.
protect only the latest same-source providers from autoremove
Traditionally all providers are protected providing something as apt
can't know which of them is actually really providing the functionality
for the user ensuring that we don't propose the removal of used stuff,
but that is of course also keeping stuff around which could be removed.
That can cause the collection of multiple old providers until the
provided package is itself no longer needed (e.g. out-of-tree kernel
modules). We combat this by marking providers only from the newest
source package version so that old providers built by older versions of
the same source package can be garbage collected.
more explicit MarkRequired algorithm code (part 2)
As the previous commit, this shouldn't change behavior at all, but
beside being more explicit and perhaps faster its also considerably
shorter (granted, mostly by if0-block elimination).
Piling everything in a single if statement always made my head wobble,
but it hasn't even a benefit as the most common case of a package which
isn't installed passes all of the old if and lands in the non-existent
else-part of the inner if. So beside a subjective cleanup of what goes
on this implementation should also be a bit faster.
write auto-bits before calling dpkg & again after if needed
Writing first means that even in the event of a power-failure the
autobit is saved for future processing instead of "forgotten" so that
the package is treated as manually installed.
In some cases we have to re-run the writing after dpkg is done through
as dpkg can let packages disappear and in such cases apt will move
autobits around (or in that case non-autobits) which we need to store.
if reading of autobit state failed, let write fail
If we can't read the old file we can't just move forward as that would
discard potentially discard old data (especially other fields). We let
it fail only after we are done writing the new file so a user has the
chance to look into and merge the new data (which is otherwise
discarded).
We deploy atomic renames for some files, but these renames also happen
if something about the file failed which isn't really the point of the
exercise…
Avoiding the use of GCC >= 5 stuff lets use go back to 4.8 simplifying
the travis setup again as well as reducing the backport requirements in
general.
if conf unset, don't read / as conf/pref/sources dir
Usually these config options are set to sensible values, but if init
isn't run or the user interferes with configuration clearing or similar
the options could indeed carry an empty value, which will result in
FindDir returning a '/'. That feels kinda wrong, but as a public
interface there isn't much we can do about it and instead make it so
that we get the special file /dev/null back we know how to deal with in
such cases.
Julian noticed on IRC that I fall victim to a lovely false friend by
calling referring to a 'planer' all the time even through these are
machines to e.g. remove splinters from woodwork ("make stuff plane").
The term I meant is written in german in this way (= with a single n)
but in english there are two, aka: 'planner'.
As that is unreleased code switching all instances without any
transitional provisions. Also the reason why its skipped in changelog.
Fix buffer overflow in debListParser::VersionHash()
If a package file is formatted in a way that that no space
follows a deprecated "<", we would reformat it to "<=" and
increase the length of the output by 1, which can break.
Under normal circumstances with "<=" this should not be an
issue.
In 385d9f2f23057bc5808b5e013e77ba16d1c94da4 I implemented the storage of
scenario files based on enabling this by default for EIPP, but I
implemented it first optionally for EDSP to have it independent.
The reasons mentioned in the earlier commit (debugging and bugreports)
obviously apply here, especially as EIPP solutions aren't user approved,
nearly impossible to verify before starting the execution and at the
time of error the scenario has changed already, so that reproducing the
issue becomes hard(er).
Freeing 'Install' for future use as an interface for "dpkg --install",
which is currently not used by any existent planer, so the
implementation of it itself will be delayed until then.
A rather special need option, but the internal planer supports this and
we have a testcase for it & sometimes it is hit (as a bug through). The
option itself mostly serves as a reminder for implementors that they
should be careful with removes and especially temporary removes if they
perform any.
APT has 3 modes: no immediate configuration, all packages are configured
immediately and its default mode of configuring essentials and
pseudo-essentials immediately only. While this seems like a job of
different planers at first, it might be handy to have it as an option,
too, in case a planer (like apts internal one) supports different modes
where the introduction of individual planers would be counter intuitive.
For the order there is no inherent difference between delete or purge,
so we don't tell the planer about this and instead decide in apt if a
package should be purged or not which also allows us to not tell the
planer about rc-only purges as we can trivially do this on our on as
there is no need to plan such purges.
eipp: provide the internal planer as an external one
Testing the current implementation can benefit from being able to be
feed an EIPP request and produce a fully compliant response. It is also
a great test for EIPP in general.
We can trim generation time and size of the EIPP scenario considerable
if we we avoid telling the planers about "uninteresting" packages.
This is one of the simpler but already very effective reductions:
Do not tell planers about versions which are neither installed nor are
to be installed as they have no effect on the plan we don't need to tell
the planer about them. EDSP solvers need to know about all versions for
better choice and error messages, but planers really don't.
The very first step in introducing the "external installation planer
protocol" (short: EIPP) as part of my GSoC2016 project.
The description reads: APT-based tools like apt-get, aptitude, synaptic,
… work with the user to figure out how their system should look like
after they are done installing/removing packages and their dependencies.
The actual installation/removal of packages is done by dpkg with the
constrain that dependencies must be fulfilled at any point in time (e.g.
to run maintainer scripts).
Historically APT has a super micro-management approach to this task
which hasn't aged that well over the years mostly ignoring changes in
dpkg and growing into an unmaintainable mess hardly anyone can debug and
everyone fears to touch – especially as more and more requirements are
tacked onto it like handling cycles and triggers, dealing with
"important" packages first, package sources on removable media, touch
minimal groups to be able to interrupt the process if needed (e.g.
unattended-upgrades) which not only sky-rocket complexity but also can
be mutually exclusive as you e.g. can't have minimal groups and minimal
trigger executions at the same time.
Seen in #828011 if we fail to parse a header field like Last-Modified we
end up interpreting the data as response header for coming requests in
case we don't rotate to a new server in DNS rotation.
In 3bdff17c894d0c3d0f813d358fc45d7a263f3552 we did it for the datetime
parsing, but we use the same style in the parsing for pdiff (where the
size of the file is in the middle of the three fields) so imbueing here
as well is a good idea.
wu-ftpd sends the response without parens, whereas we expect
them.
I did not test the patch, but it should work. I added another
return true if Pos is still npos after the second find to make
sure we don't add npos to the string.
Thanks: Lukasz Stelmach for the initial patch Closes: #420940
Rewritten in 9febc2b238e1e322dce1f94ecbed46d595893b52 for c++ locales
usage and rewritten again in 1d742e01470bba27715a8191c50adde4b39c2f19 to
avoid a currently present stdlibc++6 bug in the std::get_time
implementation. The later implementation uses still stringstreams for
parsing, but forgot to explicitly reset the locale to something sane
(for parsing english dates that is), so date and especially the parsing
of a number is depending on the locale. Turns out, the French (among
others) format their numbers with space as thousand separator so for
some reason the stdlibc++6 thinks its a good idea to interpret the
entire datetime string as a single number instead of realizing that in
"25 Jun …" the later parts can't reasonably be part of that number even
through there are spaces there…
add insecure (and weak) allow-options for sources.list
Weak had no dedicated option before and Insecure and Downgrade were both
global options, which given the effect they all have on security is
rather bad. Setting them for individual repositories only isn't great
but at least slightly better and also more consistent with other
settings for repositories.
ensure filesize of deb is included in the hashes list
Filesize is a silly hash all by itself, but in combination with others
it can be a strong opponent, so ensuring that it is in the list of
hashes and hence checked by the normal course of action the acquire
process takes is a good thing.
add [weak] tag to hash errors to indicate insufficiency
For "Hash Sum mismatch" that info doesn't make a whole lot of
difference, but for the new insufficient info message an indicator that
while this hashes are there and even match, they aren't enough from a
security standpoint.
Downloading and saying "Hash Sum mismatch" isn't very friendly from a
user POV, so with this change we try to detect such cases early on and
report it, preferably before download even started.
forbid insecure repositories by default expect in apt-get
With this commit all APT-based clients default to refusing to work with
unsigned or otherwise insufficently secured repositories. In terms of
apt and apt-get this changes nothing, but it effects all tools using
libapt like aptitude, synaptic or packagekit.
The exception remains apt-get for stretch for now as this might break
too many scripts/usecases too quickly.
The documentation is updated and extended to reflect how to opt out or
in on this behaviour change.
Handling the extra check (and force requirement) for downgrades in
security in our AllowInsecureRepositories checker helps in having this
check everywhere instead of just in the most common place and requiring
a little extra force in such cases is always good.
handle weak-security repositories as unauthenticated
APT can be forced to deal with repositories which have no security
features whatsoever, so just giving up on repositories which "just" fail
our current criteria of good security features is the wrong incentive.
Of course, repositories are better of fixing their setup to provide the
minimum of security features, but sometimes this isn't possible:
Historic repositories for example which do not change (anymore).
That also fixes problem with repositories which are marked as trusted,
but are providing only weak security features which would fail the
parsing of the Release file.
run update post-invokes even on (partial) failures
Unsecure repositories result in error messages by default which causes
the acquire run to fail hard, but non-failing repositories are still
updated just like in the slightly less hard-failures which got this
behaviour in 35664152e47a1d4d712fd52e0f0a2dc8ed359d32.
Dominic Benson [Mon, 20 Jun 2016 12:47:46 +0000 (13:47 +0100)]
Check for cached hash entries to determine which (if any) hash types
need to be generated for the current file.
In 1.0.9, each hash type was handled by a separate method, each of
which checked the cache. It looks like when these code paths were
unified (in a311fb96b84757ef8628e6a754232614a53b7891) the cache
checks were not incorporated into the new method.
implement and document DIRECT for auto-detect-proxy
There is a subtile difference between an empty setting and "DIRECT" in
the configuration as the later overrides the generic settings while the
earlier does not. Also, non-zero exitcodes should really be reported as
an error rather than silently discarded.
Commands are probably better of always having output through as the
fall through to the generic proxy settings is likely not intended. As
documenting and implementing this more consistently is kind of a
regression through, it is split off into the next commit.
avoid std::get_time usage to sidestep libstdc++6 bug
As reported upstream in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71556
the implementation of std::get_time is currently not as accepting as
strptime is, especially in how hours should be formatted.
Just reverting 9febc2b238e1e322dce1f94ecbed46d595893b52 would be
possible, but then we would reopen the problems fixed by it, so instead
I opted here for a rewrite of the parsing logic which makes this method
a lot longer, but at least it provides the same benefits as the rewrite
in std::get_time was intended to give us and decouples us from the fix
of the issue in the standard library implementation of GCC.