How to serve a wget --mirror'ed directory of files with question marks in them

lighttpd wget

I'm trying to create a static mirror of a php application (an old php Gallery installation, specifically). The app produces URLs such as:

view_album.php?set_albumName=MyAlbum

wget downloads these directly to files named the same, complete with question marks. In order to not break inbound links, I'd like to keep those names. But how do I serve them? I'm running into two problems:

  1. Webservers (correctly) attempt to find "view_album.php" and pass it the query args, rather than finding a file with a question mark in its name. How do I tell a webserver to look for files with a question mark in them? Renaming the files isn't desirable, as it would break inbound links. I can't tell the inbound linkers to %-encode their URLs.

  2. The file names don't end in ".html", so most webservers won't send an HTML content-type header. What configuration parameters should I look for to force a 'text/html' content-type for all files in a directory or matching a certain pattern?

I'm using lighttpd ultimately, but if you know what sort of configuration might get the desired results with apache/nginx I'd love to hear that too.

Best Answer

wget downloads these directly to files named the same, complete with question marks.

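You can disable that behavior with --restrict-file-names=ascii,windows. In windows mode wget writes "@" in place of the "?" that separates the query string, so the files are saved without question marks in their names. This resolves your issue right in wget, without needing fancy server configs.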
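
As a rough sketch of how that could look, assuming a POSIX shell: mirror_site wraps the wget call, and local_name is a hypothetical helper (not part of wget) that predicts the saved name for a URL with a query string. The example.com URL is a placeholder, not the real Gallery address.

```shell
#!/bin/sh
# Mirror a site, storing the "?" query separator in local file names as "@".
# The URL is supplied by the caller; nothing here is specific to Gallery.
mirror_site() {
  wget --mirror --restrict-file-names=ascii,windows "$1"
}

# Hypothetical helper: predict the local name that wget's "windows" mode
# gives a URL path with a query string. Only the first "?" is modelled.
local_name() {
  printf '%s\n' "$1" | sed 's/?/@/'
}

local_name 'view_album.php?set_albumName=MyAlbum'
# Usage (placeholder URL): mirror_site 'http://example.com/gallery/'
```

The last line should print view_album.php@set_albumName=MyAlbum. Note that the saved names then contain "@", so a request for the original "?" URL still has to be mapped to the "@" name somewhere.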
