https: Quote path in URL before passing it to curl
Curl requires URLs to be urlencoded, but we are handing it unencoded
URLs. This causes it to go completely nuts if there is a space in the
URI, producing requests like:
GET /a file HTTP/1.1
which servers then interpret as a GET request for "/a" with
HTTP version "file", or some other nonsense.
This works around the issue by encoding the path component of
the URL. I'm not sure if we should encode other parts of the URL
as well; encoding the path seems to do the trick for the actual
issue at hand.
A more correct fix is to avoid the dequoting and (re-)quoting
of URLs when a redirect occurs / a new request is sent. That's
been on the radar for probably a year or two now, but nobody
has bothered to implement it yet.
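For illustration (a minimal sketch, not the actual patch - the
function name is made up), percent-encoding the path component
roughly works like this:

    #include <cctype>
    #include <cstdio>
    #include <string>

    // Sketch: percent-encode everything in the path that is not an
    // unreserved character or the path separator itself.
    static std::string QuotePath(std::string const &path)
    {
       std::string quoted;
       for (unsigned char const c : path)
       {
          if (isalnum(c) != 0 || c == '/' || c == '-' || c == '.' ||
              c == '_' || c == '~')
             quoted.push_back(static_cast<char>(c));
          else
          {
             char hex[4];
             snprintf(hex, sizeof(hex), "%%%02X", c);
             quoted.append(hex);
          }
       }
       return quoted;
    }

With this, QuotePath("/a file") yields "/a%20file", which curl
accepts without mangling the request line.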
This is a follow-up to the previous issue where we did not check
whether getline() returned -1 due to an end of file or due to an
error like a failed memory allocation, treating both as end of file.
Here we ensure that we also handle buffered writes correctly by
flushing the files before checking for any errors in our error
stack.
Buffered writes themselves were introduced in 1.1.9, but the
function was never called with a buffered file from inside
apt until commit 46c4043d741cb2c1d54e7f5bfaa234f1b7580f6c
which was first released with apt 1.2.10. The function is
public, though, so fixing this is a good idea anyway.
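In sketch form the fix boils down to the following (Flush and
PendingError as in APT's FileFd/GlobalError interfaces of that era;
treat this as an illustration, not the exact patch):

    #include <apt-pkg/error.h>
    #include <apt-pkg/fileutl.h>

    // Sketch: flush the buffered writer first so that delayed write
    // errors land on the error stack before we inspect it.
    static bool DoneWithFile(FileFd &fd)
    {
       fd.Flush();                           // may queue an error
       return _error->PendingError() == false;
    }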
SECURITY UPDATE: gpgv: Check for errors when splitting files (CVE-2016-1252)
This fixes a security issue where signatures of the
InRelease files could be circumvented in a man-in-the-middle
attack, giving attackers the ability to serve any packages
they want to a system, in turn giving them root access.
It turns out that getline() does not only fail with EINVAL
as stated in the documentation - it can also fail when
allocating memory fails, returning -1 in both cases.
This fix not only adds a check that reading worked
correctly, it also implicitly checks that all writes
worked by reporting any other error that occurred inside
the loop and was logged by apt.
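The required check is essentially the classic getline(3) error idiom;
a minimal standalone sketch:

    #include <cstdio>
    #include <cstdlib>

    // Sketch: getline() returns -1 both at end of file and on error
    // (e.g. a failed allocation); ferror() tells the two apart.
    static bool CopyLines(FILE *in)
    {
       char *line = nullptr;
       size_t n = 0;
       while (getline(&line, &n, in) != -1)
       {
          /* ... write the line to the right output file ... */
       }
       bool const good = (ferror(in) == 0); // -1 really meant EOF
       free(line);
       return good;
    }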
In 105503b4b470c124bc0c271bd8a50e25ecbe9133 we got a warning implemented
for unreadable files which greatly improves the behavior of apt update
already as everything will work as long as we don't need the keys
included in these files. The behavior if they are needed is still
strange though, as update will fail claiming missing keys while a
manual test (which the user will likely perform as root) will be
successful.
Passing the new warning generated by apt-key through to apt is a bit
strange from an interface point of view, but basically duplicating the
warning code in multiple places doesn't feel right either. It does mean
we have no translation for the message though, as apt-key has no i18n yet.
It also means that if the user has a bunch of sources each of them will
generate a warning for each unreadable file which could result in quite
a few duplicated warnings, but "too many" is better than none.
apt-key: warn instead of fail on unreadable keyrings
apt-key has inconsistent behaviour if it can't read a keyring file:
Commands like 'list' skipped silently over such keyrings while 'verify'
failed hard, resulting in apt reporting confusing gpg errors (#834973).
As a first step we teach apt-key to be more consistent here, skipping
over unreadable keyrings in all commands but issuing a warning in the
process, which, as is usual for apt commands, is displayed at the end
of the run.
(cherry picked from commit 105503b4b470c124bc0c271bd8a50e25ecbe9133)
(removed the buffering of warnings in aptwarnings.log, as we do not
have a cleanup function where we can cat it)
travis: Pull in c++ standard library and g++ 5 from wily
The one in trusty does not support std::put_time(), causing the
compile to fail. This commit is specific to the 1.2 branch, as
newer branches already pull this in automatically.
Use C locale instead of C.UTF-8 for protocol strings
The C.UTF-8 locale is not portable, so we need to use C, otherwise
we crash on other systems. We can use std::locale::classic() for
that, which might also be a bit cheaper than using locale("C").
In 3bdff17c894d0c3d0f813d358fc45d7a263f3552 we did it for the datetime
parsing, but we use the same style in the parsing for pdiff (where the
size of the file is in the middle of the three fields), so imbuing here
as well is a good idea.
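For illustration, the imbue in question looks roughly like this (field
layout as in a pdiff Index line; names are made up for the sketch):

    #include <locale>
    #include <sstream>
    #include <string>

    // Sketch: parse "<hash> <size> <patchname>" with the classic "C"
    // locale so the user's locale cannot derail the number parsing.
    static bool ParsePdiffLine(std::string const &input,
                               std::string &hash,
                               unsigned long long &size,
                               std::string &name)
    {
       std::istringstream line(input);
       line.imbue(std::locale::classic());
       line >> hash >> size >> name;
       return line.fail() == false;
    }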
Rewritten in 9febc2b238e1e322dce1f94ecbed46d595893b52 for c++ locales
usage and rewritten again in 1d742e01470bba27715a8191c50adde4b39c2f19 to
avoid a currently present stdlibc++6 bug in the std::get_time
implementation. The latter implementation still uses stringstreams for
parsing, but forgot to explicitly reset the locale to something sane
(for parsing English dates, that is), so dates - and especially the
parsing of numbers - depend on the locale. Turns out the French (among
others) format their numbers with a space as thousands separator, so
for some reason stdlibc++6 thinks it's a good idea to interpret the
entire datetime string as a single number instead of realizing that in
"25 Jun …" the later parts can't reasonably be part of that number even
though there are spaces there…
avoid std::get_time usage to sidestep libstdc++6 bug
As reported upstream in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71556
the implementation of std::get_time is currently not as accepting as
strptime is, especially in how hours should be formatted.
Just reverting 9febc2b238e1e322dce1f94ecbed46d595893b52 would be
possible, but then we would reopen the problems fixed by it, so instead
I opted for a rewrite of the parsing logic. That makes this method a
lot longer, but it provides the same benefits the switch to
std::get_time was intended to give us and decouples us from the fix of
the issue in GCC's standard library implementation.
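For reference, the format the parser has to accept is what strptime(3)
expresses with the pattern below; the commit hand-rolls the equivalent
instead of relying on either library routine (timegm(3) is a GNU/BSD
extension):

    #include <cstring>
    #include <ctime>

    // Sketch: parse "Sat, 25 Jun 2016 01:10:14 GMT" the strptime way.
    static time_t ParseRFC1123(char const *str)
    {
       struct tm Tm;
       memset(&Tm, 0, sizeof(Tm));
       if (strptime(str, "%a, %d %b %Y %H:%M:%S GMT", &Tm) == nullptr)
          return static_cast<time_t>(-1);
       return timegm(&Tm);   // interpret the result as UTC
    }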
accept only the expected UTC timezones in date parsing
HTTP/1.1 hardcodes GMT (RFC 7231 §7.1.1.1) and what is good enough for the
internet must be good enough for us™ as we reuse the implementation
internally to parse (most) dates we encounter in various places like the
Release files with their Date and Valid-Until header fields.
Implementing a fully timezone aware parser just feels too hard for no
effective benefit as it would take 5+ years (= until LTS's are out of
fashion) until a repository could use non-UTC dates and expect it to
work. And that is not counting non-apt implementations, which may well
expect to encounter only UTC here too.
As a bonus, this eliminates the use of an instance of setlocale in
libapt.
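A sketch of the check; the exact token list here is an assumption
(RFC 822/1123 name GMT, UT and Z as the UTC designators):

    #include <string>

    // Sketch: only accept timezone tokens that mean UTC.
    static bool IsUTCTimezone(std::string const &tz)
    {
       return tz == "GMT" || tz == "UTC" || tz == "UT" || tz == "Z" ||
              tz == "+0000" || tz == "-0000";
    }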
don't leak FD in AutoProxyDetect command return parsing
The switch from popen() to Popen() accidentally made the proxy
autodetection code read the script's output on stderr as well,
not only on stdout.
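In plain POSIX terms the intended behavior is: capture only stdout and
close our copy of the write end. APT's Popen wraps this; the sketch
below is generic and the command string is purely illustrative:

    #include <unistd.h>

    // Sketch: run the auto-detect command, reading only its stdout.
    static int RunProxyDetect(char const *cmd)
    {
       int pipefd[2];
       if (pipe(pipefd) != 0)
          return -1;
       pid_t const child = fork();
       if (child == 0)
       {
          dup2(pipefd[1], STDOUT_FILENO); // stdout only, stderr untouched
          close(pipefd[0]);
          close(pipefd[1]);
          execl("/bin/sh", "sh", "-c", cmd, (char *)nullptr);
          _exit(127);
       }
       close(pipefd[1]); // don't leak our copy of the write end
       return pipefd[0]; // caller reads the proxy URL, then closes
    }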
VersionHash: Do not skip too long dependency lines
If the dependency line does not contain spaces in the repository
but does in the dpkg status file (because dpkg normalized the
dependency list), the dpkg line might be longer than the line
in the repository. If it now happens to be longer than 1024
characters, it would be skipped, causing the hashes to be
out of date.
Note that we have to bump the minor cache version again as
this changes the format slightly, and we might get mismatches
with an older src cache otherwise.
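A sketch of the idea (not the actual code): normalize into a growable
string instead of a fixed 1024-byte buffer, so no dependency line has
to be skipped for being too long:

    #include <cctype>
    #include <string>

    // Sketch: strip whitespace while copying, with no length limit.
    static std::string StripSpaces(char const *Start, char const *End)
    {
       std::string Normalized;
       for (char const *I = Start; I != End; ++I)
          if (isspace(static_cast<unsigned char>(*I)) == 0)
             Normalized.push_back(*I);
       return Normalized; // feed .data()/.size() into the version hash
    }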
Commit 3af3ac2f5ec007badeded46a94be2bd06b9917a2 (released in 1.3~pre1)
implements proper fallback for SRV, but that fallback actually works
too well, as the RFC defines that such an SRV record indicates that the
server doesn't provide this service, and apt should respect this.
The solution is hence to fail again as requested even if that isn't what
the user (and perhaps even the server admins) wanted. At least we will
print a message now explicitly mentioning SRV to point people in the
right direction.
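The relevant RFC 2782 rule in sketch form (the record struct is
invented for the illustration):

    #include <string>
    #include <vector>

    struct SrvRec { std::string target; };   // illustrative only

    // Sketch: RFC 2782 - a single SRV record whose target is the root
    // label "." decidedly means "this service is not offered here".
    static bool ServiceUnavailable(std::vector<SrvRec> const &records)
    {
       return records.size() == 1 && records[0].target == ".";
    }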
try not to call memcpy with length 0 in hash calculations
memcpy is marked as nonnull for its input, but ignores the input anyhow
if the declared length is zero. Our SHA2 implementations already check
for this; it was "just" MD5 and SHA1 missing the check, so we add it
there as well as along the callstack, as it is really pointless to do
all these method calls for "nothing".
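The guard itself is a one-liner; a sketch of the pattern (names
invented):

    #include <cstring>

    // Sketch: don't call memcpy at all for empty input, since its
    // pointer arguments are declared nonnull even when the length is 0.
    static void BufferAppend(unsigned char *dst, size_t &fill,
                             unsigned char const *src, size_t len)
    {
       if (len == 0)
          return;          // nothing to hash - skip the call entirely
       memcpy(dst + fill, src, len);   // caller guarantees capacity
       fill += len;
    }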
If the inner Base256ToNum() returned false, it did not set
Num to a new value, leaving Num uninitialized, which might have
caused the function to exit with a seemingly good result.
Also document why the Res = Num, if (Res != Num) magic is done.
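The magic being documented is a narrowing-overflow check; in sketch
form (the inner call is the wide unsigned long long variant):

    // Sketch: parse into the widest type, then verify the value
    // survives narrowing into the caller's smaller Res type.
    static bool Base256ToNum(char const *Str, unsigned long &Res,
                             unsigned int Len)
    {
       unsigned long long Num = 0;
       if (Base256ToNum(Str, Num, Len) == false) // wide inner variant
          return false;                          // Num now always set
       Res = Num;           // may narrow on 32-bit platforms
       return Res == Num;   // false if the narrowing truncated it
    }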
TagFile: Fix off-by-one errors in comment stripping
Adding 1 to the value of d->End - current makes restLength one byte
too long: passing it to memchr(current, ..., restLength) thus has
undefined behavior.
Also, reading the value of current has undefined behavior if
current >= d->End, not only for current > d->End:
Consider a string of length 1, that is d->End = d->Current + 1.
We can only read at d->Current + 0, but d->Current + 1 is beyond
the end of the string.
This probably caused several inexplicable build failures on hurd-i386
in the past, and just now caused a build failure on Ubuntu's amd64
builder.
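Spelled out, the two fixes look like this (member names as in the
description above; a sketch, not the literal patch):

    #include <cstring>

    // Sketch: length without the off-by-one, dereference only below End.
    static char const *FindNewline(char const *current, char const *End)
    {
       if (current >= End)                        // not just '>'
          return nullptr;
       size_t const restLength = End - current;   // no +1
       return static_cast<char const *>(
           memchr(current, '\n', restLength));
    }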
Fix segfault and out-of-bounds read in Binary fields
If a Binary field contains one or more spaces before a comma, the
code produced a segmentation fault, as it accidentally set the
pointer itself to 0 instead of the character it points to.
If the comma is at the beginning of the field, the code would
create a binStartNext that points one element before the start
of the string, which is undefined behavior.
We also need to check that we do not leave the string during the
replacement of spaces before commas: a string of the form " ,"
would otherwise run outside the bounds of the buffer.
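A bounds-checked version of the space stripping, as a standalone
sketch (only the first comma, for brevity):

    #include <cstring>

    // Sketch: drop the spaces in front of a comma without ever
    // stepping before the start of the buffer.
    static void StripSpacesBeforeComma(char *buffer)
    {
       char *comma = strchr(buffer, ',');
       if (comma == nullptr)
          return;
       char *p = comma;
       while (p > buffer && p[-1] == ' ') // p > buffer keeps us inside
          --p;
       memmove(p, comma, strlen(comma) + 1); // keep the trailing NUL
    }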
don't loop on pinning pkgs from absolute debs by regex
An absolute filename for a *.deb file starts with a /. A package with
the name of the file is inserted in the cache which is provided by the
"real" package for internal reasons. The pinning code detects a regex
based wildcard by having the regex start with /. That is no problem
as a / can not be included in a package name… except that our virtual
filename package can and does include one.
We fix this two ways actually: First, a regex is only being considered a
regex if it also ends with / (we don't support flags). That stops our
problem with the virtual filename packages already, but to be sure we
also do not enter the loop if matcher and package name are equal.
It has to be noted that the creation of pins for virtual packages like
the filename packages affected here is pointless, as only versions can
be pinned, but checking that a package is really purely virtual is too
costly compared to just creating an unused pin.
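The first half of the fix is a trivial predicate; sketched:

    #include <string>

    // Sketch: a pattern counts as a regex only if it is wrapped in
    // slashes on both sides (we don't support flags).
    static bool LooksLikeRegex(std::string const &pattern)
    {
       return pattern.length() > 1 && pattern.front() == '/' &&
              pattern.back() == '/';
    }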
Instead of erroring out when receiving a SIGINT, let the
child deal with it - we'll error out anyway if the child
exits with an error or due to the signal. Also ignore
SIGQUIT, as system() ignores it.
This basically fixes Bug #832593, but: we are running
the hooks via sh -c. Some shells exit with a signal
error even if the command they are executing catches
the signal and exits successfully. So far this has
been noticed on dash, which, unfortunately, is our
default shell.
Example:
$ cat trap.sh
trap 'echo int' INT; sleep 10; exit 0
$ if dash -c ./trap.sh; then echo OK: $?; else echo FAIL: $?; fi
^Cint
FAIL: 130
$ if mksh -c ./trap.sh; then echo OK: $?; else echo FAIL: $?; fi
^Cint
OK: 0
$ if bash -c ./trap.sh; then echo OK: $?; else echo FAIL: $?; fi
^Cint
OK: 0
In af81ab9030229b4ce6cbe28f0f0831d4896fda01 we implement by-hash as a
special compression type, which breaks this filesize setting as the code
is looking for a foobar.by-hash file then. Handling this case slightly
differently gets us the intended value. Note that this has no direct
effect as this value will be set in other ways too, and could only
affect progress reporting.
If a server closes a connection after sending us a file, that tends to
mean it is the type of server which always closes the connection - it
is therefore relatively pointless to try pipelining with it even if
that isn't a problem by itself: apt is just restarting the pipeline
each time after it got served one file and the connection is closed.
The problem starts if one or more proxies are between the server and apt
and they disagree about how the connection should be handled, as in the
bug reporter's case where the responses apt gets contain both Keep-Alive
and Proxy-Connection headers (which apt both ignores) indicating a
proxy is trying to keep the connection open, while the response also
contains "Connection: close" indicating the opposite, which apt
understands and respects as it is required to do.
We avoid stepping into this abyss by not performing pipelining anymore
if we got a response indicating that the connection will be closed
while the response was otherwise a success - error messages are sent by
some servers via this method as their pages tend to be created
dynamically and hence their size isn't known to them a priori.
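In sketch form the new rule is (names invented for illustration):

    // Sketch: a successful response carrying "Connection: close" means
    // the server will close after every file, so pipelining would just
    // be restarted over and over - switch it off.
    static void MaybeDisablePipeline(bool &PipelineAllowed,
                                     unsigned int status,
                                     bool connectionClose)
    {
       if (connectionClose == true && status >= 200 && status < 400)
          PipelineAllowed = false;
    }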
It seems completely pointless from a server POV to send empty header
fields, so most of them don't do it (simply proven by this limitation
existing since day one) - but it is technically allowed by the RFC as
the surrounding whitespace is optional, and Github seems to like sending
"X-Geo-Block-List:\r\n" since recently (bug reports in other http
clients indicate July), at least sometimes, as the reporter claims to
have seen it on https only even though it can happen with both.
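Parsing-wise this just means the value after the colon may be the empty
string; a sketch (assuming the trailing CRLF was already stripped):

    #include <string>

    // Sketch: split "Name: value" while accepting an empty value, as
    // in "X-Geo-Block-List:" - whitespace around the value is optional.
    static bool SplitHeader(std::string const &line,
                            std::string &name, std::string &value)
    {
       std::string::size_type const colon = line.find(':');
       if (colon == std::string::npos)
          return false;
       name = line.substr(0, colon);
       std::string::size_type const begin =
           line.find_first_not_of(" \t", colon + 1);
       value = (begin == std::string::npos) ? "" : line.substr(begin);
       return true;   // an empty value is perfectly fine
    }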
drop incorrect const attribute from DirectoryExists
Since its introduction in 2010 DirectoryExists was always marked with
this attribute, but for no real reason. Arguably a check for the
existence of a file is not modifying global state, so theoretically
this shouldn't be a problem. It is wrong from a logical point of view
though, as between two calls the directory could be created, so the
promise we made to the compiler that it could remove the second call
would be wrong - API-wise it is incorrect.
It's a bit mysterious that this is only observable on ppc64el and can be
fixed by reordering code ever so slightly, but in the end it's more our
fault for adding this attribute than the compiler's fault for doing
something silly based on the attribute.
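For illustration, the problem with the attribute is this (a sketch, not
the real prototype; the helpers are hypothetical):

    #include <sys/stat.h>
    #include <sys/types.h>

    // Sketch: with ((const)) the compiler may assume both calls below
    // yield the same value and elide the second one - wrong once the
    // directory has been created in between.
    __attribute__((const)) bool DirectoryExists(char const *path);
    void UseDirectory(char const *path);   // hypothetical helper

    void EnsureAndUse(char const *path)
    {
       if (DirectoryExists(path) == false)
          mkdir(path, 0755);
       if (DirectoryExists(path))   // may be "optimized" away
          UseDirectory(path);
    }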
pass --force-remove-essential to dpkg only if needed
APT (usually) knows which packages are essential, so we can avoid
passing this force flag to dpkg unconditionally; it is only needed if
the user has chosen a non-default essential handling which obscures
that information.
if the FileFd failed already following calls should fail, too
There is no point in trying to perform Write/Read on a FileFd which
already failed as they aren't going to work as expected, so we should
make sure that they fail early on and hard.
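The pattern is simply an early bail-out at the top of each operation
(names invented for the sketch):

    // Sketch: once one operation on the file failed, every later one
    // fails early and hard instead of pretending to work.
    class CheckedFile
    {
       bool Failed = false;
    public:
       bool Write(void const *From, unsigned long long Size)
       {
          if (Failed == true)
             return false;   // don't even try
          /* ... perform the write; set Failed = true on error ... */
          return true;
       }
    };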
If another file in the transaction fails and hence dooms the
transaction, we can end up in a situation in which a -patched file
(= rred writes the result of the patching to it) remains in the
partial/ directory.
The next apt call will perform the rred patching again and write its
result again to the -patched file, but instead of starting with an
empty file as intended it will overwrite the content previously in the
file. That has the same result if the new content happens to be longer
than the old content, but if it isn't, parts of the old content remain
in the file - and they will pass verification, as the new content
written to it matches the hashes. If the entire transaction passes,
the file is moved to the lists/ directory, where it might or might not
trigger errors depending on whether the old content which remained
forms a valid file together with the new content.
This has no real security implications as no untrusted data is involved:
The old content consists of a base file which passed verification and a
bunch of patches which all passed multiple verifications as well, so the
old content isn't controllable by an attacker and the new one isn't
either (as the new content alone passes verification). So the best an
attacker can do is letting the user run into the same issue as in the
report.
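The underlying remedy is the usual one: (re)open the -patched file
truncating whatever a doomed earlier transaction left behind. Sketched
with plain stdio (filename illustrative):

    #include <cstdio>

    // Sketch: "w" truncates, so stale bytes from a previous run can
    // never survive behind freshly written, shorter content.
    FILE *out = fopen("partial/Packages.ed-patched", "w");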
We read the entire input file we want to patch anyhow, so we can also
calculate the hash for that file and compare it with what we expected
it to be.
Note that this isn't really a security improvement as a) the file we
patch is trusted and b) if the input is incorrect the result will
hardly match, so this is just for failing slightly earlier with a more
relevant error message (although, in terms of rred, the error is
ignored and a complete download is attempted instead).
The flush call is a no-op in most FileFd implementations, so this isn't
as critical as it might sound, as the only non-trivial implementation
is in the buffered writer - which tends not to be used to buffer
another buffer…
keep trying with next if connection to a SRV host failed
Instead of only trying the first host we get via SRV, we try them all as
we are supposed to and if that isn't working we try to connect to the
host itself as if we hadn't seen any SRV records. This was already the
intent of the old code, but it failed to hide earlier problems from the
next call, which would then unconditionally fail, resulting in an
all-around failure to connect. With proper stacking we can also keep the
error messages of each call around (and in the order tried) so if the
entire connection fails we can report all the things we have tried while
we discard the entire stack if something works out in the end.
report all instead of first error up the acquire chain
If we don't give a specific error to report up, it is likely that all
errors currently in the error stack are equally important, so reporting
just one could turn out to be confusing, e.g. if name resolution failed
in an SRV record list.
don't change owner/perms/times through file:// symlinks
If we have files in partial/ from a previous invocation or similar,
those could be symlinks created by file:// sources. The code is
expecting only real files though, and happily changes owner,
modification times and permissions on the file the symlink points to -
files we usually have no business touching in this way.
Permissions of symlinks shouldn't be changed and changing the owner is
usually pointless too, so just to be sure we pick the easy way out:
use lchown, and check for symlinks before chmod/utimes.
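The syscall-level shape of the fix, as a standalone sketch:

    #include <sys/stat.h>
    #include <sys/time.h>
    #include <unistd.h>

    // Sketch: operate on the file itself, never on a symlink's target.
    static void FixupFile(char const *path, uid_t uid, gid_t gid,
                          mode_t mode, struct timeval const times[2])
    {
       struct stat st;
       if (lstat(path, &st) != 0)
          return;
       lchown(path, uid, gid);         // safe even on a symlink
       if (S_ISLNK(st.st_mode) == 0)   // skip chmod/utimes for symlinks
       {
          chmod(path, mode);
          utimes(path, times);
       }
    }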
use the right key for compressor configuration dump
The generated dump output is incorrect insofar as it uses the name as
the key for this compressor, but the two don't need to be equal, as is
the case if you force some of the built-in ones to be disabled as our
testing framework does at times.
This is hidden from the changelog as nobody will actually notice, while
describing it in a few words makes it sound like an important change…
As the volatile sources are parsed last they were sorted behind the
dpkg/status file and hence are treated as a downgrade, which isn't
really what you want to happen as from a user POV it's an upgrade.
If we have a (e.g. locally built) deb file installed and try to
install it again, apt complained about this being a downgrade, but it
isn't, as it is the very same version… apt was just confused into not
merging the versions together, which then looks like a downgrade.
The same size assumption is usually good, but given that volatile files
are parsed last (even after the status file) the base assumption no
longer holds; it is easy to adapt though, without actually changing
anything in practice.
protect only the latest same-source providers from autoremove
Traditionally all providers of something are protected, as apt can't
know which of them actually provides the functionality for the user;
this ensures that we don't propose the removal of used stuff, but of
course it also keeps stuff around which could be removed.
That can cause the collection of multiple old providers until the
provided package is itself no longer needed (e.g. out-of-tree kernel
modules). We combat this by marking providers only from the newest
source package version so that old providers built by older versions of
the same source package can be garbage collected.
more explicit MarkRequired algorithm code (part 2)
As with the previous commit, this shouldn't change behavior at all,
but besides being more explicit and perhaps faster it's also
considerably shorter (granted, mostly by if0-block elimination).
Piling everything into a single if statement always made my head
wobble, but it doesn't even have a benefit, as in the most common case
a package which isn't installed passes all of the old if and lands in
the non-existent else-part of the inner if. So besides a subjective
cleanup of what goes on, this implementation should also be a bit
faster.
factor out Pkg/DepIterator prettyprinters into own header
The old prettyprinters only have access to the struct they pretty-print,
which usually isn't enough, as for a package we also want to know a bit
of state information, like which version is the candidate.
We therefore need to pull the DepCache into context and hence use a
temporary struct which is printed instead of the iterator itself.
write auto-bits before calling dpkg & again after if needed
Writing first means that even in the event of a power-failure the
autobit is saved for future processing instead of "forgotten" so that
the package is treated as manually installed.
In some cases we have to re-run the writing after dpkg is done though,
as dpkg can let packages disappear, and in such cases apt will move
autobits around (or in that case non-autobits) which we need to store.
if reading of autobit state failed, let write fail
If we can't read the old file we can't just move forward, as that would
potentially discard old data (especially other fields). We let it fail
only after we are done writing the new file, so a user has the chance
to look into and merge the new data (which would otherwise be
discarded).
We deploy atomic renames for some files, but these renames also happen
if something about the file failed, which isn't really the point of the
exercise…
Fix buffer overflow in debListParser::VersionHash()
If a package file is formatted in a way that no space
follows a deprecated "<", we would reformat it to "<=" and
increase the length of the output by 1, which can overflow
the buffer. Under normal circumstances with "<=" this should
not be an issue.
Seen in #828011: if we fail to parse a header field like Last-Modified
we end up interpreting the data as a response header for upcoming
requests in case we don't rotate to a new server in DNS rotation.
wu-ftpd sends the response without parens, whereas we expect
them.
I did not test the patch, but it should work. I added another
return true if Pos is still npos after the second find to make
sure we don't add npos to the string.
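The added guard in sketch form (loosely after the description above,
not the literal patch):

    #include <string>

    // Sketch: bail out before npos can be used in string arithmetic.
    static bool ParseParens(std::string const &Msg)
    {
       std::string::size_type Pos = Msg.find('(');
       if (Pos == std::string::npos)
          return true;              // wu-ftpd answer without parens
       Pos = Msg.find(')', Pos);
       if (Pos == std::string::npos)
          return true;              // the newly added guard
       /* ... extract the text between the parens ... */
       return true;
    }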
ensure filesize of deb is included in the hashes list
Filesize is a silly hash all by itself, but in combination with others
it can be a strong opponent, so ensuring that it is in the list of
hashes and hence checked by the normal course of action the acquire
process takes is a good thing.
Reinstate caching of file hashes in apt-ftparchive
Check for cached hash entries to determine which (if any) hash types
need to be generated for the current file.
In 1.0.9, each hash type was handled by a separate method, each of
which checked the cache. It looks like when these code paths were
unified (in a311fb96b84757ef8628e6a754232614a53b7891) the cache
checks were not incorporated into the new method.
Commands are probably better off always having output though, as the
fall-through to the generic proxy settings is likely not intended. As
documenting and implementing this more consistently is kind of a
regression though, it is split off into the next commit.
apt-key: change to / before find to satisfy its CWD needs
First seen on hurd, but easily reproducible on all systems by removing
the 'execution' bit from the current working directory and watching some
tests (mostly the no-output expecting tests) fail due to find printing:
"find: Failed to restore initial working directory: …"
Samuel Thibault says in the bugreport:
| To do its work, find first records the $PWD, then goes to
| /etc/apt/trusted.gpg.d/ to find the files, and then goes back to $PWD.
|
| On Linux, getting $PWD from the 700 directory happens to work by luck
| (POSIX says that getcwd can return [EACCES]: Search permission was denied
| for the current directory, or read or search permission was denied for a
| directory above the current directory in the file hierarchy). And going
| back to $PWD fails, and thus find returns 1, but at least it emitted its
| output.
|
| On Hurd, getting $PWD from the 700 directory fails, and find thus aborts
| immediately, without emitting any output, and thus no keyring is found.
|
| So, to summarize, the issue is that since apt-get update runs find as a
| non-root user, running it from a 700 directory breaks find.
Solved as suggested by changing to '/' before running find, with some
extra paranoid care taken to ensure the paths we give to find really
are absolute paths first (they should be, but a TMPDIR=. or a similar
Dir::Etc::trustedparts setting could exist somewhere in the wild).
The commit also takes the opportunity to make these lines slightly less
error-ignoring and to have the two find calls use (mostly) the same
parameters.
Setting the C++ locale via std::locale::global(std::locale("")) - which
would otherwise default to the default C locale (aka: unaffected by
setlocale) - affects the formatting of numeric types in IO streams.
For output meant for humans that is perfectly sensible, but it breaks
the many text interfaces used and parsed by us and others, which do not
expect the numbers to be formatted.
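A quick demonstration of the breakage (run with, e.g.,
LANG=fr_FR.UTF-8):

    #include <iostream>
    #include <locale>

    int main()
    {
       std::locale::global(std::locale(""));  // the user's locale
       std::cout.imbue(std::locale());        // cout picks it up
       std::cout << 1234567 << std::endl;     // e.g. "1 234 567" there
    }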
This fixes Debian/apt#13 and the launchpad bug listed below,
but is far more advanced. I went through private-cmndline.cc
and looked at the supported options.
fail instead of segfault on unreadable config files
The report mentions "apt list --upgradable", but there are others which
have inconsistent behavior ranging from segfaulting to doing something
with the partial (and hence incomplete) data. We had a recent report
about sources.list (#818628), this one mentions prefences, the obvious
next step is conf files… so the testcase is adapted to check for all
three in file and directory versions and run a bunch of commands each
time which should all have more or less the same behavior in such a case
(aka error out).
respect user pinning in M-A:same version (un)screwing
Using Pkg.CandVersion() here is wrong as its implementation will return
a candidate based just on the default policy settings ignoring user
preferences and otherwise set candidates (aka: it sidesteps the
pkgDepCache).
This causes M-A:same libraries to be detected as screwed even though
they aren't, so that they end up being kept back.