I'm trying to create a static mirror of a php application (an old php Gallery installation, specifically). The app produces URLs such as:
view_album.php?set_albumName=MyAlbum
wget downloads these directly to files named the same, complete with question marks. To avoid breaking inbound links, I'd like to keep those names. But how do I serve them? I'm running into two problems:
- Webservers (correctly) attempt to find "view_album.php" and pass it the query args, rather than finding a file with a question mark in its name. How do I tell a webserver to look for files with a question mark in them? Renaming the files isn't desirable, as it would break inbound links. I can't tell the inbound linkers to %-encode their URLs.
- The file names don't end in .html, so most webservers won't send an HTML content-type header. What configuration parameters should I look for to force a 'text/html' content type for all files in a directory or matching a certain pattern?
I'm using lighttpd ultimately, but if you know what sort of configuration might get the desired results with apache/nginx I'd love to hear that too.
Best Answer
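You can stop wget from putting question marks in the file names with --restrict-file-names=ascii,windows. In windows mode, wget uses "@" instead of "?" to separate the query portion in local file names, so this resolves your issue right in wget, without needing fancy server configs.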
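A minimal sketch of how that option might be used, assuming a full re-mirror is acceptable. The URL and both function names are hypothetical; --mirror and --convert-links are real wget options but are an assumption about how the original download was run. The helper only illustrates the "?" to "@" substitution, not everything wget escapes in windows mode.

```shell
#!/bin/sh
# Hypothetical address; substitute the real gallery URL.
GALLERY_URL="http://example.com/gallery/"

# Re-run the mirror with restricted file names (assumed flags besides
# --restrict-file-names: --mirror and --convert-links).
mirror_gallery() {
    wget --mirror --convert-links \
         --restrict-file-names=ascii,windows "$1"
}

# Illustration only: shows the '?' -> '@' substitution for one name.
# wget's windows mode escapes other characters as well.
windows_mode_name() {
    printf '%s\n' "$1" | tr '?' '@'
}

windows_mode_name 'view_album.php?set_albumName=MyAlbum'
# To actually mirror, uncomment:
# mirror_gallery "$GALLERY_URL"
```

With --convert-links, wget rewrites the links inside the downloaded pages to point at the local file names, so the mirror should stay internally consistent even though the names on disk differ from the original URLs.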