Linux – Running a Gentoo distfiles caching mirror on Debian

debian gentoo http-proxy linux

I have a variety of Linux hosts on my office LAN. I run apt-cacher-ng on one box to cache package downloads for all of the Debian and Ubuntu machines on the network. We have a few Gentoo users, and I would like to cache their distfiles downloads as well.

I am already running an rsync mirror for Gentoo, and that has proven to be easy to set up and reliable.

What I would like is something like http-replicator, but that is actually maintained and has a Debian Squeeze package available. I've looked at Squid, but it was just too much; I would like something simpler. I also looked at Polipo, which seemed to be on the right track but suffered from a fatal flaw:

All of the distfiles on the Gentoo mirrors are the same, but if you attempt to download the same file from a different source mirror, Polipo thinks it is a different file, resulting in a cache miss. http-replicator didn't suffer from this issue. Since I don't administer all of the Gentoo boxes, I can't guarantee consistent mirror selection; most people just pick their mirrors with mirrorselect anyway.

So I'm looking for something that:

  1. Is pretty easy to set up and doesn't require too much fiddling or complicated cache-expiry setups
  2. Can act as a transparent HTTP proxy
  3. Will deliver the same local file, even if it is being "downloaded" from a different server
  4. Doesn't require mirroring the entire collection of Gentoo distfiles

Is this too much to ask?

Best Answer

You can use apt-cacher-ng for this easily. Add a remap rule like the following to its configuration file (acng.conf):

Remap-gentoo: file:gentoo_mirrors http://distfiles.gentoo.org/ /gentoo ; file:backends_gentoo # Gentoo Archives

  • In the file gentoo_mirrors, put all of the mirrors you want to capture.
  • In the file backends_gentoo, put the backup mirror you want to use for fetching.
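
As a minimal sketch of the second file, assuming you want to fetch from the distfiles.gentoo.org address that already appears in the remap rule, backends_gentoo could contain a single URL. The file is written to the current directory here; where apt-cacher-ng expects to find it (typically next to acng.conf, for example /etc/apt-cacher-ng on Debian) is an assumption you should check against your installation.

```shell
#!/bin/sh
# Assumption: one backend URL per line, using the same host that appears
# in the remap rule above. Swap in a closer mirror if you prefer.
printf '%s\n' 'http://distfiles.gentoo.org/' > backends_gentoo

# Show what was written
cat backends_gentoo
```

With the rule in place, a Gentoo client could probably point at the cache by setting something like GENTOO_MIRRORS="http://cachehost:3142/gentoo" in make.conf. Here cachehost is a hypothetical hostname, 3142 is apt-cacher-ng's default port, and /gentoo is the path named in the remap rule.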

Here's a script to create gentoo_mirrors:

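#!/bin/sh
# This fetches the live Gentoo mirrors list
# robbat2@gentoo.org - 2013/Dec/03
# Note: \n and \+ in the sed expressions are GNU sed extensions.
OUTFILE=gentoo_mirrors
URL=http://www.gentoo.org/main/en/mirrors3.xml
# --save-headers prepends the HTTP response headers to the output.
# The sed expressions, in order:
#   1. turn lines starting with a capital letter (the headers) into comments
#   2. turn each <mirrorgroup> line into a comment preceded by a blank line
#   3. turn each <name> line into a comment
#   4. for <uri> lines with protocol="http", strip the tags and keep the URL
wget --save-headers -q "$URL" -O - \
| sed -n \
-e '/^[A-Z]/{s,^,#,g;p}' \
-e '/<mirrorgroup/{s,^,\n#,g;p}' \
-e '/<name/{s,^,#,g;p}' \
-e '/<uri/{/protocol="http"/{s/.*<uri[^>]\+>//g;s/<\/uri>//g;p}}' \
>"$OUTFILE"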
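
To see what the sed filter in the script above keeps, here is a small sketch that runs the same expressions over a made-up XML fragment instead of a live download. The sample mirror name, the mirror.example.org URLs, the filter_mirrors function name and the gentoo_mirrors.sample file name are all hypothetical; only the sed expressions come from the script. It assumes GNU sed.

```shell
#!/bin/sh
# Same sed expressions as the script above, wrapped in a function so they
# can be fed sample input from stdin. Requires GNU sed (\n and \+).
filter_mirrors() {
    sed -n \
        -e '/^[A-Z]/{s,^,#,g;p}' \
        -e '/<mirrorgroup/{s,^,\n#,g;p}' \
        -e '/<name/{s,^,#,g;p}' \
        -e '/<uri/{/protocol="http"/{s/.*<uri[^>]\+>//g;s/<\/uri>//g;p}}'
}

# Hypothetical sample input shaped like a mirrors list entry.
filter_mirrors > gentoo_mirrors.sample <<'EOF'
<mirrorgroup region="North America" country="CA">
<mirror>
<name>Example Mirror</name>
<uri protocol="http" ipv4="y" ipv6="n" partial="n">http://mirror.example.org/gentoo/</uri>
<uri protocol="ftp" ipv4="y" ipv6="n" partial="n">ftp://mirror.example.org/gentoo/</uri>
</mirror>
</mirrorgroup>
EOF

# Print only the non-comment, non-blank lines, i.e. the captured mirror URLs.
grep -v -e '^#' -e '^$' gentoo_mirrors.sample
```

Only the http URI survives as a plain line; the group and name lines are kept as # comments, and the ftp URI is dropped.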

Source: I'm a senior Gentoo developer, and run the Gentoo infrastructure. I have submitted a variant on the above to the upstream apt-cacher-ng author.