I have an old Joomla! site that I would like to convert to a static set of html pages (since it's not being updated anymore and I don't want the overhead of having a MySQL db running for something that is never updated).
Is there a command-line tool that could basically crawl and download the entire forward-facing website?
Best Answer
I just made static pages from an old Joomla! site with this command:
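(Reconstructed from the options discussed below; `http://my.domain.com/subsite/` is the placeholder address used later in this answer.)

```
wget --mirror --page-requisites --convert-links --adjust-extension http://my.domain.com/subsite/
```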
Its short version is:
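(Same command with the short flags: `-m` = `--mirror`, `-p` = `--page-requisites`, `-k` = `--convert-links`, `-E` = `--adjust-extension`.)

```
wget -m -p -k -E http://my.domain.com/subsite/
```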
This saves pages with the `.html` extension and will get (almost) all the `css`, `js`, and image files the pages need.

But I wanted my static mirror to have the same links as the original, so the file names couldn't have the `.html` extension, which made me remove the `-E` option.
Then I found that the `-p` option (and `-k`) doesn't work the same way if you don't use `-E`. But using `-E` and `-p` is still the best way to get most of the `page-requisites`, so I did a first fetch with it, deleted all the `.html` files, and then fetched everything all over again without `-E`.
As option `-k` without `-E` also doesn't convert all links, I had to make some substitutions. The complete list of commands used is:
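(Roughly reconstructed from the steps above; the address is a placeholder, and the `sed` substitution stands in for whatever absolute links `-k` left unconverted.)

```
# First fetch: -E together with -p grabs the most page requisites.
wget -m -p -k -E http://my.domain.com/subsite/

# Delete the .html copies so the next fetch can keep the original names.
find my.domain.com -name '*.html' -type f -delete

# Second fetch: the same mirror, without -E, preserves the original links.
wget -m -p -k http://my.domain.com/subsite/

# -k without -E leaves some absolute links behind; rewrite them manually
# (this pattern is an example, not necessarily the exact substitution used).
grep -rIl 'http://my.domain.com/subsite' my.domain.com |
    xargs sed -i 's|http://my\.domain\.com/subsite|/subsite|g'
```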
As I was mirroring a site under a path in my domain, I ended up with this file:
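(Assuming the `subsite` path used throughout this answer, it would have been the saved home page.)

```
my.domain.com/subsite/index.html
```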
That `index.html` was deleted by my second command, which is OK. When I ran the second `wget`, it created a file with the same name as the directory and an `index.php` inside it, like:
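(Presumably along these lines; `wget` appends `.1` because a `subsite` directory already exists.)

```
my.domain.com/subsite.1
my.domain.com/subsite/index.php
```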
...and converted (at least some) home links to `subsite.1`. If all home links were the same, only one of those two files would be needed. And `index.php` is the best choice, as it is automatically served when a client asks for `http://my.domain.com/subsite`.

To solve that I ran:
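(A plausible shape for the fix, under the same naming assumptions: point every `subsite.1` link at `subsite/index.php`, then drop the duplicate.)

```
# Rewrite links that point at the duplicate home page...
grep -rIl 'subsite\.1' my.domain.com |
    xargs sed -i 's|subsite\.1|subsite/index.php|g'

# ...then remove the now-unreferenced duplicate file.
rm my.domain.com/subsite.1
```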
In the end, using a web developer tool (Firebug), I found that some files included by JavaScript or by CSS were still missing. I got them one by one.
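For each missing file, `wget -x` (`--force-directories`) recreates the URL's path under the current directory, so a one-off fetch lands in the right place in the mirror; the asset path below is only an illustration:

```
wget --force-directories http://my.domain.com/subsite/templates/template_name/images/background.png
```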