CentOS – Squid and caching of dnf/yum downloads

centos fedora squid

Sorry if this is a newbie question. I will describe the situation first; the Squid question comes after.

Current Fedora/CentOS installations ship repository configuration files in /etc/yum.repos.d that contain a metalink like this:

metalink=https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=$basearch

This metalink makes yum/dnf pick a "random" mirror: the server returns a list of mirrors chosen according to the client's geographic region, in varying order.
The list is also used to switch to the next-best mirror when a download is slow.

Because of Docker builds I see a lot of downloads, which is why I am considering a Squid proxy that all machines must use. But this "random" strategy of yum/dnf worries me. I understand that Fedora/CentOS want to distribute the load across these free repositories, so I do not want to undermine that strategy.

Can Squid somehow detect that the client is just using "another Fedora/CentOS repo URL" and cache accordingly? The metalink list itself seems to be pretty stable (only the order changes between requests; the list itself seems to stay the same).

Intention: do not store 1000 copies of the same file just because each copy came from a different server.

How would I do that with Squid?

EDIT: Does anybody have experience using http://wiki.squid-cache.org/Features/StoreID for caching dnf/yum downloads?

Best Answer

Answering my own question: I found out that Squid supports handling this kind of problem with the storeid_file_rewrite script. The only tricky part is getting a valid list of URLs that represent the same repository. It seems to work fine so far.

I added the following to squid.conf:

store_id_program /usr/lib64/squid/storeid_file_rewrite /etc/squid/fedora.db
store_id_access allow localnet
store_id_access deny all
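
To check the rules without going through Squid, the helper can be fed a URL by hand. This is a minimal sketch, assuming the helper path and rules file from the squid.conf lines above; the function name check_store_id is made up for illustration. For a URL that matches a rule, the helper is expected to answer with a line starting with OK store-id=, and with ERR otherwise.

```shell
# Hypothetical convenience wrapper: send one URL to the StoreID helper,
# one URL per line on stdin, the same way Squid talks to it.
# Usage: check_store_id URL [HELPER] [RULES_FILE]
check_store_id() {
    url="$1"
    # Defaults are taken from the squid.conf lines above; adjust as needed.
    helper="${2:-/usr/lib64/squid/storeid_file_rewrite}"
    db="${3:-/etc/squid/fedora.db}"
    printf '%s\n' "$url" | "$helper" "$db"
}
```

If two different mirror URLs for the same package print the same store-id, Squid should treat them as one cache object.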

Getting the content for fedora.db (caching Fedora 25 at this point in time) takes some trickery. First fetch the URLs from the mirrorlist:

basearch="x86_64"
releasever=25
mirrorlist="https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=$basearch"
curl -s "$mirrorlist" >tmp.db

You need to convert each "url" entry in tmp.db into the format explained at http://wiki.squid-cache.org/Features/StoreID/DB. This could possibly be automated (any volunteers?).
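
One way the conversion might be automated is sketched below. It is not a tested production script. It assumes that every plain-http entry in the metalink is a `<url ...>` element whose text ends in `<arch>/os/repodata/repomd.xml`, and that mirror URLs contain no regex metacharacters other than `.` and `/`. The function name metalink_to_storeid is made up for illustration. https and rsync entries are skipped.

```shell
# Read metalink XML on stdin, print StoreID rules on stdout.
# Usage: metalink_to_storeid ARCH RELEASE   (e.g. x86_64 25)
metalink_to_storeid() {
    arch="$1"
    release="$2"
    # Keep only <url ...>http://...</url> elements.
    grep -o '<url [^>]*>http://[^<]*</url>' |
        # Strip the XML tags, leaving the bare URL.
        sed -e 's/<[^>]*>//g' |
        # Drop the "<arch>/os/repodata/repomd.xml" tail to get the mirror prefix.
        sed -n -e "s#${arch}/os/repodata/repomd\.xml\$##p" |
        # Escape dots and slashes so the prefix is matched literally.
        sed -e 's#[./]#\\&#g' |
        while IFS= read -r prefix; do
            # Pattern and Store ID are separated by a tab.
            printf '^%s(%s\\/[a-zA-Z0-9\\-\\_\\.\\/]+rpm)$\thttp://repo.mirrors.squid.internal/fedora/%s/$1\n' \
                "$prefix" "$arch" "$release"
        done
}
```

With the variables from the snippet above, a call could look like: curl -s "$mirrorlist" | metalink_to_storeid "$basearch" "$releasever" >fedora.db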

Then you get something like this as fedora.db, the file referenced in squid.conf above (each line is a regex pattern and a Store ID, separated by a tab):

^http:\/\/ftp\.halifax\.rwth-aachen\.de\/fedora\/linux\/releases\/25\/Everything\/(x86_64\/[a-zA-Z0-9\-\_\.\/]+rpm)$    http://repo.mirrors.squid.internal/fedora/25/$1
^http:\/\/mirror2\.hs-esslingen\.de\/fedora\/linux\/releases\/25\/Everything\/(x86_64\/[a-zA-Z0-9\-\_\.\/]+rpm)$        http://repo.mirrors.squid.internal/fedora/25/$1
^http:\/\/fedora\.tu-chemnitz\.de\/pub\/linux\/fedora\/linux\/releases\/25\/Everything\/(x86_64\/[a-zA-Z0-9\-\_\.\/]+rpm)$      http://repo.mirrors.squid.internal/fedora/25/$1

... and many more

EDIT: Alternatively, a more dangerous but possibly sufficient approach is broader pattern matching like this:

\/fedora\/linux\/releases\/([0-9]+)\/Everything/x86_64\/(.*)$   http://repo.mirrors.squid.internal/fedora/releases/$1/$2
\/fedora\/linux\/updates\/([0-9]+)\/x86_64\/(.*)$       http://repo.mirrors.squid.internal/fedora/updates/$1/$2
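
To see why the broad patterns are more dangerous, it helps to try them on a few URLs. The sketch below emulates only the "releases" rule with sed; it does not run the real helper, and the function name broad_store_id is made up for illustration. Because the pattern is not anchored to a host and ends in (.*), any host serving that path maps to the same Store ID, and so do non-rpm files such as repodata.

```shell
# Emulate the broad "releases" rule: print the Store ID a URL would map to,
# or nothing if the rule does not match. Slashes are left unescaped because
# sed uses '#' as the delimiter here; the leading ^.* swallows the host part
# so that only the Store ID is printed.
broad_store_id() {
    printf '%s\n' "$1" | sed -E -n \
        's#^.*/fedora/linux/releases/([0-9]+)/Everything/x86_64/(.*)$#http://repo.mirrors.squid.internal/fedora/releases/\1/\2#p'
}
```

Two mirrors from the list above map to the same ID for the same package, which is the intended effect. A URL on an unrelated host with the same path, or a repomd.xml request, is rewritten as well, which the stricter per-mirror rules avoid.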


Related Solutions

Squid not caching

Your config is missing these lines:

acl myhosts src 192.168.0.0/255.255.0.0   # your internal network/netmask
http_access allow myhosts

EDIT1:

Your web server is not your cache_peer. Please remove this line from your config file. For interoperability between caches Squid uses a different protocol (ICP), which Apache does not speak.

Fedora 14 Repository List

You have a very strange situation. It's really quite bizarre that any vendor would base a "network switch" on Fedora; with its very short 13-month lifecycle it would be very difficult to support. Unless, of course, they made it impossible to install or update anything, which they seem to have done. And that opens you up to security holes...

So if you want to get to a point where you can install software on it, this is what I would recommend.

First, the official Fedora archive site is https://archive.fedoraproject.org/. There you will find the old repositories you are looking for.

What I would do is download the fedora-release RPM from Fedora 14 and install it. Then you will have files in /etc/yum.repos.d whose baseurls you will have to update manually to point to the appropriate directories on archive.fedoraproject.org (and remove any mirrorlist or metalink that might have been in there). This should get you to the point where you can install software and apply whatever updates were available for F14.
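
The manual edit of the repo files could be sketched roughly as follows. The sketch assumes the stock files contain a commented-out `#baseurl=http://download.fedoraproject.org/pub/fedora/...` line next to the mirrorlist line, and that the archived tree lives under /pub/archive/fedora/ on the archive host. Both are assumptions to verify against the actual files and the archive site. The function name use_archive_baseurl is made up for illustration.

```shell
# Read a .repo file on stdin and print a version that uses the archive.
use_archive_baseurl() {
    # 1) and 2) comment out mirrorlist/metalink lines;
    # 3) enable the baseurl line and point it at the archive tree.
    sed -e 's|^mirrorlist=|#&|' \
        -e 's|^metalink=|#&|' \
        -e 's|^#baseurl=http://download\.fedoraproject\.org/pub/fedora/|baseurl=https://archive.fedoraproject.org/pub/archive/fedora/|'
}
```

For example, use_archive_baseurl </etc/yum.repos.d/fedora.repo >fedora.repo.new writes a candidate file that can be reviewed before it replaces the original.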

Also keep in mind that after you do this, the vendor is probably going to tell you to get stuffed...if they're even still in business. Which seems doubtful, if they were making silly business decisions like this.

I wouldn't even think about upgrading it past F14 unless you are prepared to sacrifice whatever functionality the vendor provided with this device. Binary compatibility with their custom software cannot be guaranteed if you upgrade to F15 or higher.
