Prefetch and cache HTTP requests on Squid or any proxy server

Is there a proxy server (preferably Squid) or similar software that lets me queue, say, 100 URLs up front from a list or an API and cache their HTML transparently, fetching them in parallel?

Then, when I later request one of those URLs, it should return the cached version quickly.

Best Answer

With any caching proxy you like, you can script the cache warm-up requests; the proxy will take care of caching the responses according to its policy. Make sure to allocate enough cache storage and to keep objects fresh long enough that they are still cached when you request them later.
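For Squid, the storage and retention settings mentioned above might look roughly like the fragment below. The directives are standard squid.conf directives, but the path, sizes, and lifetimes are placeholder assumptions to adapt, and responses that carry their own Cache-Control or Expires headers are still governed by those headers.

```
# Hypothetical values: about 10 GB of on-disk cache in /var/spool/squid
cache_dir ufs /var/spool/squid 10000 16 256
# Allow objects up to 10 MB to be stored
maximum_object_size 10 MB
# With no explicit expiry: fresh for at least 1 day, at most 7 days (minutes)
refresh_pattern . 1440 50% 10080
```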

A script along the lines of the following example will do just fine for fetching a list of URLs through the proxy:

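#!/bin/bash
# Send wget's HTTP requests through the caching proxy.
export http_proxy=http://proxy.exemple.net:3128/
# Read one URL per line; the proxy caches each response, wget deletes its own copy.
# -r also follows links found in each page; drop it to fetch only the listed URLs.
while IFS= read -r my_url; do
  wget -r -nd --delete-after "$my_url"
done < one_url_per_line_file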
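
The loop above fetches the URLs one at a time. Since the question asks for parallel requests, here is a minimal sketch of one way to do that with xargs. The function name warm_cache, its arguments, and the default of 10 parallel jobs are assumptions made for illustration. It also assumes the URLs contain no whitespace or quote characters, and that your xargs supports -P (GNU and BSD versions do).

```shell
#!/bin/bash
# Sketch: warm a caching proxy with several fetches running in parallel.
# warm_cache and its parameters are hypothetical names, not part of any tool.
warm_cache() {
  local url_file=$1 proxy=$2 jobs=${3:-10}
  # -n 1 passes one URL to each wget call; -P runs up to $jobs calls at once.
  # -q silences wget; -O /dev/null discards the body once the proxy has seen it.
  http_proxy=$proxy xargs -n 1 -P "$jobs" wget -q -O /dev/null < "$url_file"
}
```

For example, warm_cache one_url_per_line_file http://proxy.exemple.net:3128/ 20 would run up to 20 fetches at a time. Only plain http URLs go through http_proxy this way; https requests are normally tunneled through the proxy and not cached unless the proxy is set up to intercept TLS.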