|
| Author |
apt with index diff support
|
|
| Michael Vogt 2005-09-10, 8:52 pm |
| Dear Friends,
the idea to have some sort of incremental update support for the
archive index files (Packages,Sources) in the archive and in apt has
been around for some time now [1].
Anthony Towns analysed the problem in [2] and came up with the idea to
use ed-style diffs to solve the problem.
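
To illustrate what applying an ed-style diff involves, here is a minimal Python sketch. It is not the apt implementation; the function name `apply_ed_diff` and the sample data are made up for illustration. It assumes the diff uses only the `a`, `c` and `d` commands and lists them from the end of the file toward the start (as `diff --ed` emits them), so that earlier line numbers stay valid while editing.

```python
import re

# Hypothetical helper for illustration only; not taken from apt.
# Matches commands such as "5c", "3,4d" or "0a".
_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")


def apply_ed_diff(lines, diff_lines):
    """Apply an ed-style diff (a/c/d commands only) to a list of lines."""
    result = list(lines)
    i = 0
    while i < len(diff_lines):
        match = _COMMAND.match(diff_lines[i])
        if match is None:
            raise ValueError(f"unsupported ed command: {diff_lines[i]!r}")
        start = int(match.group(1))
        end = int(match.group(2) or start)
        op = match.group(3)
        i += 1
        new_text = []
        if op in "ac":
            # A text block follows, terminated by a line containing only "."
            while diff_lines[i] != ".":
                new_text.append(diff_lines[i])
                i += 1
            i += 1  # skip the terminating "."
        if op == "a":
            result[end:end] = new_text        # append after the addressed line
        elif op == "c":
            result[start - 1:end] = new_text  # replace lines start..end
        else:
            del result[start - 1:end]         # delete lines start..end
    return result


old_index = ["Package: a", "Version: 1", "", "Package: b", "Version: 2"]
diff = ["5c", "Version: 3", ".", "2c", "Version: 1.1", "."]
print(apply_ed_diff(old_index, diff))
```

Because the commands run from the bottom of the file upward, no line-number bookkeeping is needed between commands.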
Andreas Barth implemented the server side of the index diffs
generation and has a test-repository (with only the index-files) at
[3].
I'm happy to tell you that apt is able to use those index files
now. Robert Lemmen and I implemented the needed support.
This massively speeds up an "apt-get update". If you e.g. update daily
you will have to get only ~15kB-30kB worth of update information per
day (instead of 2.7MB for the complete Packages.bz2 as it is now).
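
As a rough back-of-the-envelope check of those figures, the following sketch computes the reduction factor. It assumes 1 MB = 1000 kB and takes the sizes from the paragraph above; the names are illustrative.

```python
# Sizes quoted in the mail, in kB (assuming 1 MB = 1000 kB).
FULL_INDEX_KB = 2700          # complete Packages.bz2
DAILY_DIFF_KB = (15, 30)      # range of daily diff downloads


def reduction_factor(full_kb, diff_kb):
    """How many times smaller the diff download is than the full index."""
    return full_kb / diff_kb


best = reduction_factor(FULL_INDEX_KB, DAILY_DIFF_KB[0])
worst = reduction_factor(FULL_INDEX_KB, DAILY_DIFF_KB[1])
print(f"daily update is roughly {worst:.0f}x to {best:.0f}x smaller")
```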
We believe that the code is now "good" enough for wider testing. I
have set up a repository for the patched version of apt with the pdiff
support. Just add this line to your sources.list file:
"deb http://people.debian.org/~mvo/apt/pdiffs/ /"
When fully supported by the archive, the diff support will be
completely transparent, no changes on your side necessary. At the
moment, however, the archive does not carry the diff files, so a little
hack is needed to test this. The diffs are on merkel.debian.org, so change your
sources.list to point there instead of your usual mirror:
"http://merkel.debian.org/~aba/debian/"
E.g. from:
deb http://ftp.de.debian.org/debian sid main
to
deb http://merkel.debian.org/~aba/debian sid main
There are two issues with using merkel:
1. There is no Release/Release.gpg file on merkel so you get "not
authenticated" warnings
2. merkel does not have the actual package files
To work around (2) and make it possible to still download packages
(even with merkel as the only entry in sources.list) a hack was added
to apt called: "APT::URL-Remap::". This allows one to remap a
URI. Example: apt-get install 3dchess -o
APT::URL-Remap::http://merkel.debian.org/~aba/debia...de.debian.org/d
+ebian/"
will ensure that archives pointing to merkel are actually fetched from
ftp.de.debian.org. Please note that this hack will be removed once
there are servers available with both index diffs and packages (it's
just to make testing easier).
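
The effect of such a remap can be pictured as a simple prefix substitution. The sketch below is a simplified model in Python, not apt's code; `remap_url`, the `REMAP` table and the example package path are hypothetical.

```python
def remap_url(url, remap_table):
    """Rewrite url if it starts with one of the prefixes in remap_table."""
    for old_prefix, new_prefix in remap_table.items():
        if url.startswith(old_prefix):
            return new_prefix + url[len(old_prefix):]
    return url  # no prefix matched, leave the URL untouched


# Mapping taken from the thread: index files on merkel, packages on a mirror.
REMAP = {
    "http://merkel.debian.org/~aba/debian/": "http://ftp.de.debian.org/debian/",
}

# The path below is a made-up example, not a real archive path.
print(remap_url("http://merkel.debian.org/~aba/debian/pool/main/example.deb", REMAP))
```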
To work around the authentication issue you can use the
"--allow-unauthenticated" switch in apt, this will suppress the
authentication checking. Again once we have full support for the diffs
in the archive the authentication will work (and be checked).
Known issues:
- the progress-bar is a bit jumpy
- the total transferred data information (at the end of the update)
is totally incorrect
Please report success/failure/problems directly to me. There are two
debug switches that should be used if you are having trouble:
-o Debug::pkgAcquire::Diffs=true
-o Debug::pkgAcquire::RRed=true
The source code of the patch is available in my:
michael.vogt@ubuntu.com--2005/apt--pdiff--0 [4]
baz repository.
Cheers,
Michael
[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=128818
[2] http://azure.humbug.org.au/~aj/blog/2003/12/02
[3] http://merkel.debian.org/~aba/debian/
[4] at http://people.ubuntu.com/~mvo/arch/ubuntu
--
Linux is not The Answer. Yes is the answer. Linux is The Question. - Neo
--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
| |
| Andreas Metzler 2005-09-10, 8:52 pm |
| Michael Vogt <mvo@debian.org> wrote:
[...]
> 2. merkel does not have the actual package files
> To work around (2) and make it possible to still download packages
> (even with merkel as the only entry in sources.list) a hack was added
> to apt called: "APT::URL-Remap::". This allows one to remap a
> URI. Example: apt-get install 3dchess -o
> APT::URL-Remap::http://merkel.debian.org/~aba/debia...de.debian.org/d
> +ebian/"
Having to specify this at the commandline is messy, is there a way to put
this in /etc/apt/apt.conf.d/? I've tried in vain using
APT::URL-Remap::http://merkel.debian.org/~aba/debian/ {"http://ftp.at.debian.org/debian";};
thanks, cu andreas
--
"See, I told you they'd listen to Reason," [SPOILER] Svfurlr fnlf,
fuhggvat qbja gur juveyvat tha.
Neal Stephenson in "Snow Crash"
| |
| Daniel Burrows 2005-09-10, 8:53 pm |
| | |
| Andreas Metzler 2005-09-11, 2:48 am |
| Daniel Burrows <dburrows@debian.org> wrote:
> On Saturday 10 September 2005 07:46 am, Andreas Metzler wrote:
> I would expect that removing the braces would do the right thing.
> APT::URL-Remap::http://merkel.debian.org/~aba/debian/
> "http://ftp.at.debian.org/debian";
Thanks,
it does not help though; I still get:
(SID)root@argenau:/# apt-cache show apt
E: Syntax error /etc/apt/apt.conf.d/80incremental:2: Extra junk at end of file
cu andreas
PS: I got that lots-of-curly-braces idea from
/etc/apt/apt.conf.d/70debconf.
--
"See, I told you they'd listen to Reason," [SPOILER] Svfurlr fnlf,
fuhggvat qbja gur juveyvat tha.
Neal Stephenson in "Snow Crash"
| |
| Michal Politowski 2005-09-11, 7:48 am |
| On Sun, 11 Sep 2005 09:18:57 +0200, Andreas Metzler wrote:
> Daniel Burrows <dburrows@debian.org> wrote:
>
>
>
>
> Thanks,
> it does not help though; I still get:
> (SID)root@argenau:/# apt-cache show apt
> E: Syntax error /etc/apt/apt.conf.d/80incremental:2: Extra junk at end of file
APT { URL-Remap::http://merkel.debian.org/~aba/debian/ "http://ftp.at.debian.org/debian"; };
at least parses (should the method of scoping make any difference, or is it a bug?),
but seems not to have any effect.
> PS: I got that lots-of-curly-braces idea from
> /etc/apt/apt.conf.d/70debconf.
DPkg::Pre-Install-Pkgs wants a list, this is the reason for braces, I think.
--
Michał Politowski
Talking has been known to lead to communication if practiced carelessly.
| |
| Michael Vogt 2005-09-11, 5:51 pm |
| Hi Andreas,
thanks for your feedback.
On Sat, Sep 10, 2005 at 04:46:09PM +0200, Andreas Metzler wrote:
> Michael Vogt <mvo@debian.org> wrote:
> [...]
[..]
> Having to specify this at the commandline is messy, is there a way to put
> this in /etc/apt/apt.conf.d/? I've tried in vain using
>
> APT::URL-Remap::http://merkel.debian.org/~aba/debian/ {"http://ftp.at.debian.org/debian";};
Please add (in e.g. /etc/apt/apt.conf.d/50remap):
APT::URL-Remap::"http://merkel.debian.org/~aba/debian/" "http://ftp.de.debian.org/debian/";
Note the quotes around the uri, if they are omitted apt will consider
the // as the start of a comment.
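
Why the quotes matter can be shown with a simplified model of comment handling. The Python sketch below is not apt's configuration parser; `strip_line_comment` is a hypothetical helper that only treats `//` outside double quotes as the start of a comment.

```python
def strip_line_comment(line):
    """Drop everything from the first unquoted '//' to the end of the line."""
    in_quotes = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and line.startswith("//", i):
            return line[:i].rstrip()
    return line


unquoted = ('APT::URL-Remap::http://merkel.debian.org/~aba/debian/ '
            '"http://ftp.at.debian.org/debian";')
quoted = ('APT::URL-Remap::"http://merkel.debian.org/~aba/debian/" '
          '"http://ftp.de.debian.org/debian/";')

print(strip_line_comment(unquoted))  # the URL is cut off at "http:"
print(strip_line_comment(quoted))    # the line survives unchanged
```

In this model the unquoted form loses everything after `http:`, which would explain the syntax errors reported earlier in the thread.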
Cheers,
Michael
--
Linux is not The Answer. Yes is the answer. Linux is The Question. - Neo
| |
| Andreas Metzler 2005-09-12, 6:16 pm |
| On 2005-09-11 Michael Vogt <mvo@debian.org> wrote:
> On Sat, Sep 10, 2005 at 04:46:09PM +0200, Andreas Metzler wrote:
[...]
> Please add (in e.g. /etc/apt/apt.conf.d/50remap):
> APT::URL-Remap::"http://merkel.debian.org/~aba/debian/" "http://ftp.de.debian.org/debian/";
[...]
Thank you, this works nicely.
| Get:8 2005-09-11-2032.14.pdiff [8309B]
| Fetched 20.5MB in 7s (2619kB/s)
cu and- 80K downstream -reas
| |
| Florian Weimer 2005-09-12, 6:16 pm |
| * Michael Vogt:
> Andreas Barth implemented the server side of the index diffs
> generation and has a test-repository (with only the index-files) at
> [3].
By the way, the secure-testing project on alioth contains a
moderately-tested pure-Python implementation of the client side
(without ed/red dependencies). It's not a speed daemon, but it
significantly cuts down network traffic.
Thanks a lot for providing the necessary infrastructure.
| |
| Michael Vogt 2005-09-12, 6:16 pm |
| On Mon, Sep 12, 2005 at 10:11:34PM +0200, Florian Weimer wrote:
> * Michael Vogt:
>
> By the way, the secure-testing project on alioth contains a
> moderately-tested pure-Python implementation of the client side
> (without ed/red dependencies). It's not a speed daemon, but it
> significantly cuts down network traffic.
Just to avoid confusion (and because it wasn't mentioned in the
original mail), the apt implementation has its own rred-method (in
methods/rred.cc) and does not need an external ed/red.
Cheers,
Michael
--
Linux is not The Answer. Yes is the answer. Linux is The Question. - Neo
| |
| Andreas Metzler 2005-09-18, 7:48 am |
| Michael Vogt <mvo@debian.org> wrote:
[...]
> We believe that the code is now "good" enough for wider testing. I
> have setup a repository for the patched version of apt with the pdiff
> support. Just add this line to your sources.list file:
> "deb http://people.debian.org/~mvo/apt/pdiffs/ /"
[...]
> Example: apt-get install 3dchess -o
> APT::URL-Remap::http://merkel.debian.org/~aba/debia...de.debian.org/d
> +ebian/"
The remapping feature seems not to work with deb-src entries.
----------------
(SID)ametzler@argenau:/tmp$ apt-get -o APT::URL-Remap::http://merkel.debian.org/~aba/debia...ian.org/debian/ source efax
Reading package lists... Done
Building dependency tree... Done
Need to get 114kB of source archives.
Err http://merkel.debian.org sid/main efax 1:0.9a-15 (dsc)
404 Not Found
[...]
Failed to fetch http://merkel.debian.org/~aba/debia...fax_0.9a-15.dsc 404 Not Found
[...]
E: Failed to fetch some archives.
----------------
"apt-get update"ing of deb-src entries seems to work well, though.
Just the actual fetching of source packages is broken.
cu andreas
--
"See, I told you they'd listen to Reason," [SPOILER] Svfurlr fnlf,
fuhggvat qbja gur juveyvat tha.
Neal Stephenson in "Snow Crash"
| |
| Miles Bader 2005-09-20, 2:50 am |
| Wow, it works really well!! I used to dread doing an "update" because
of my slow dialup but now it's as fast as I could wish for (of course
downloading the packages is still slow, but at least I can now quickly
check to see if there are any interesting changes or not).
Thank you very much for this work... Hopefully it will make it into the
main distro quickly.
-miles
--
"Suppose He doesn't give a shit? Suppose there is a God but He
just doesn't give a shit?" [George Carlin]
| |
| Michael Vogt 2005-09-20, 6:06 pm |
| On Sun, Sep 18, 2005 at 01:33:29PM +0200, Andreas Metzler wrote:
> Michael Vogt <mvo@debian.org> wrote:
[..]
> The remapping feature seems not to work with deb-src entries.
>
> ----------------
> (SID)ametzler@argenau:/tmp$ apt-get -o APT::URL-Remap::http://merkel.debian.org/~aba/debia...ian.org/debian/ source efax
> Reading package lists... Done
> Building dependency tree... Done
> Need to get 114kB of source archives.
> Err http://merkel.debian.org sid/main efax 1:0.9a-15 (dsc)
> 404 Not Found
> [...]
> Failed to fetch http://merkel.debian.org/~aba/debia...fax_0.9a-15.dsc 404 Not Found
> [...]
> E: Failed to fetch some archives.
It's a known issue that URL-Remap only works for binary packages.
It shouldn't be hard to extend it for source packages as well. But I
wonder if it is worth the effort. Once the index diffs are supported
on the normal debian server the URL-Remap code will go away again.
Cheers,
Michael
--
Linux is not The Answer. Yes is the answer. Linux is The Question. - Neo
| |
| Andreas Barth 2005-09-20, 6:06 pm |
| * Michael Vogt (mvo@debian.org) [050910 15:57]:
> I'm happy to tell you that apt is able to use those index files
> now. Robert Lemmen and I implemented the needed support.
You might be interested that secure-testing.debian.net now has "native"
support for index diffs (even though only very short, and still in test
mode).
Cheers,
Andi
| |
| Florian Weimer 2005-10-29, 5:52 pm |
| * Michael Vogt:
> When fully supported by the archive, the diff support will be
> completely transparent, no changes on your side necessary.
When downloading the Index file, APT does not send instructions to
bypass the proxy cache:
T 212.9.189.177:51459 -> 212.9.189.169:9999 [AP]
GET /debian/dists/experimental/main/source/Sources.diff/Index HTTP/1.1..Hos
t: proxy.enyo.de:9999..Connection: keep-alive..User-Agent: Debian APT-HTTP/
1.3....GET /debian/dists/unstable/main/binary-i386/Packages.bz2 HTTP/1.1..H
ost: proxy.enyo.de:9999..Connection: keep-alive..Range: bytes=246835-..If-R
ange: Fri, 28 Oct 2005 19:18:20 GMT..User-Agent: Debian APT-HTTP/1.3....
(This is with apt 0.6.42.1exp1.)
This means that apt-proxy (and probably other proxies as well) returns
a stale copy. Sure, this could be fixed on the proxy side, but I
think to actually achieve the transparency you are aiming at, you have
to implement a workaround in APT.
(Apart from that, I can only repeat what I already wrote a couple of
weeks ago: Nice work.)
| |
| Adam Heath 2005-10-29, 5:52 pm |
| On Sat, 29 Oct 2005, Florian Weimer wrote:
> * Michael Vogt:
>
The index support is broken for file URLs. It works the first time (i.e.,
after removing all the files in /var/lib/apt/lists), but then fails with this error:
E: Could not open file /var/lib/apt/lists/_mirror_debian_dists_unstable_main_binary-i386_Packages.IndexDiff - open (2 No such file or directory)
| |
| Goswin von Brederlow 2005-10-30, 7:48 am |
| Florian Weimer <fw@deneb.enyo.de> writes:
> * Michael Vogt:
>
>
> When downloading the Index file, APT does not send instructions to
> bypass the proxy cache:
>
> T 212.9.189.177:51459 -> 212.9.189.169:9999 [AP]
> GET /debian/dists/experimental/main/source/Sources.diff/Index HTTP/1.1..Hos
> t: proxy.enyo.de:9999..Connection: keep-alive..User-Agent: Debian APT-HTTP/
> 1.3....GET /debian/dists/unstable/main/binary-i386/Packages.bz2 HTTP/1.1..H
> ost: proxy.enyo.de:9999..Connection: keep-alive..Range: bytes=246835-..If-R
> ange: Fri, 28 Oct 2005 19:18:20 GMT..User-Agent: Debian APT-HTTP/1.3....
>
> (This is with apt 0.6.42.1exp1.)
>
> This means that apt-proxy (and probably other proxies as well) returns
> a stale copy. Sure, this could be fixed on the proxy side, but I
> think to actually achieve the transparency you are aiming at, you have
> to implement a workaround in APT.
Doesn't the 'If-Range: Fri, 28 Oct 2005 19:18:20 GMT' mean the proxy
should query upstream? Also what is the time to live on the reply? And
shouldn't the proxy always query for an update if the TTL is gone?
I don't want to download the diff every time. Not when I update a
32-node cluster over one proxy within minutes. So just plainly
bypassing the proxy seems wrong.
> (Apart from that, I can only repeat what I already wrote a couple of
> weeks ago: Nice work.)
MfG
Goswin
PS: Is the index file listed in Release?
| |
| Andreas Metzler 2005-10-30, 7:48 am |
| Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> wrote:
[...]
> PS: Is the index file listed in Release?
Quoting http://ftp.at.debian.org/debian/dists/sid/Release
[...]
41c3cd8cf26c2eba317c37dfe9d0d3bf 2761295 main/binary-alpha/Packages.bz2
18680c3a635a695e01f69166daef4148 2177 main/binary-alpha/Packages.diff/Index
3af8fbafe3e4538520989de964289712 83 main/binary-alpha/Release
[...]
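
Each line in the excerpt above carries a checksum, a size in bytes and a path. A minimal Python sketch for looking up whether Packages.diff/Index is listed might look like this; `parse_checksum_lines` is a hypothetical helper, and the excerpt is copied from the lines quoted above rather than fetched from the archive.

```python
# Three lines quoted from the Release file in this mail.
RELEASE_EXCERPT = """\
41c3cd8cf26c2eba317c37dfe9d0d3bf 2761295 main/binary-alpha/Packages.bz2
18680c3a635a695e01f69166daef4148 2177 main/binary-alpha/Packages.diff/Index
3af8fbafe3e4538520989de964289712 83 main/binary-alpha/Release
"""


def parse_checksum_lines(text):
    """Map each listed path to its (checksum, size-in-bytes) pair."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "checksum size path"
        digest, size, path = parts
        entries[path] = (digest, int(size))
    return entries


entries = parse_checksum_lines(RELEASE_EXCERPT)
print("main/binary-alpha/Packages.diff/Index" in entries)
print(entries["main/binary-alpha/Packages.diff/Index"])
```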
cu andreas
--
The 'Galactic Cleaning' policy undertaken by Emperor Zhark is a personal
vision of the emperor's, and its inclusion in this work does not constitute
tacit approval by the author or the publisher for any such projects,
howsoever undertaken. (c) Jasper Fforde
| |
| Florian Weimer 2005-10-30, 7:48 am |
| * Goswin von Brederlow:
> Doesn't the 'If-Range: Fri, 28 Oct 2005 19:18:20 GMT' mean the proxy
> should query upstream? Also what is the time to live on the reply? And
> shouldn't the proxy always query for an update if the TTL is gone?
Indeed, I should have looked at the ngrep output more closely. It
turns out that this is clearly apt-proxy's fault. The patch below should
fix this (it's only lightly tested, though).
diff -rN -u old-apt-proxy-1.9.32/apt_proxy/apt_proxy.py new-apt-proxy-1.9.32/apt_proxy/apt_proxy.py
--- old-apt-proxy-1.9.32/apt_proxy/apt_proxy.py 2005-10-30 11:04:52.000000000 +0100
+++ new-apt-proxy-1.9.32/apt_proxy/apt_proxy.py 2005-10-30 11:04:52.000000000 +0100
@@ -89,8 +89,8 @@
FileType(re.compile(r"\.txt$"), "application/plain-text", 1),
FileType(re.compile(r"\.html$"), "application/text-html", 1),
- FileType(re.compile(r"/(Packages|Release(\.gpg)?|Sources|Contents-.*)"
- r"(\.(gz|bz2))?$"),
+ FileType(re.compile(r"/(Packages|Release(\.gpg)?|Sources|Index"
+ r"|(?:Contents|Translation)-.*)(\.(gz|bz2))?$"),
"text/plain", 1),
FileType(re.compile(r"\.rpm$"), "application/rpm", 0),
diff -rN -u old-apt-proxy-1.9.32/debian/changelog new-apt-proxy-1.9.32/debian/changelog
--- old-apt-proxy-1.9.32/debian/changelog 2005-10-30 11:04:52.000000000 +0100
+++ new-apt-proxy-1.9.32/debian/changelog 2005-10-30 11:04:52.000000000 +0100
@@ -1,3 +1,10 @@
+apt-proxy (1.9.32.1) unstable; urgency=low
+
+ * Non-maintainer upload
+ * Map type of "Index" and "Translation-*" files to text/plain.
+
+ -- Florian Weimer <fw@deneb.enyo.de> Sun, 30 Oct 2005 11:04:48 +0100
+
apt-proxy (1.9.32) unstable; urgency=low
[ Chris Halls ]
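
The effect of the pattern change can be checked in isolation. The sketch below copies the two regular expressions from the patch and tries them against the request paths from the ngrep output earlier in the thread. Using `re.search` is an assumption; how apt-proxy itself applies the pattern is not shown in the patch.

```python
import re

# Pattern before the patch (copied from the removed lines).
OLD_PATTERN = re.compile(r"/(Packages|Release(\.gpg)?|Sources|Contents-.*)"
                         r"(\.(gz|bz2))?$")

# Pattern after the patch (copied from the added lines).
NEW_PATTERN = re.compile(r"/(Packages|Release(\.gpg)?|Sources|Index"
                         r"|(?:Contents|Translation)-.*)(\.(gz|bz2))?$")

# Request paths seen in the ngrep capture.
INDEX_PATH = "/debian/dists/experimental/main/source/Sources.diff/Index"
PACKAGES_PATH = "/debian/dists/unstable/main/binary-i386/Packages.bz2"

for name, pattern in (("old", OLD_PATTERN), ("new", NEW_PATTERN)):
    print(name,
          "Index:", bool(pattern.search(INDEX_PATH)),
          "Packages.bz2:", bool(pattern.search(PACKAGES_PATH)))
```

Under that assumption, only the new pattern matches the Sources.diff/Index path, while Packages.bz2 matches both.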
|
|
|
|
|