How to configure UTF-8 Content-Type header in Apache

apache-2.4 encoding http-headers

My website has pages and other content with UTF-8 encoding. For HTML, setting the encoding in a meta tag is no problem. However, I also have raw text files with UTF-8 encoding that aren't displayed correctly, for example with characters appearing as ×. I've considered adding a byte-order mark at the start of such files, but I'd prefer not to since it isn't always well supported. I followed the instructions in this other question, but it had no effect. This is the HTTP response header:

HTTP/1.1 200 OK
Date: Sat, 12 Aug 2017 15:41:04 GMT
Server: Apache/2.4.10 (Debian)
Last-Modified: Wed, 09 Aug 2017 19:24:33 GMT
ETag: "c04c-5565707a34966"
Accept-Ranges: bytes
Content-Length: 49228
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive

I was hoping to see Content-Type: text/plain; charset=utf-8. How can I get reliable UTF-8 encoding for these URIs?

Best Answer

Content-Type is not sent for 304 Not Modified responses because such a response has no content body.

Look at a 200 response instead and you should see the header. Use Ctrl + F5 to force a refresh, which yields a 200 response rather than revalidating the cached copy with a 304 response.

You have since updated your question to include a 200 response. I would expect a 200 response always to carry a Content-Type: text/plain header or equivalent (even if the character set is not included), but your example has none, so I am not sure it shows all the details.

Regardless, the correct way to set this is to add the following to your Apache config:

# Set a default charset so it does not need to be set per page.
AddDefaultCharset utf-8
# For CSS, JS, etc.
AddCharset utf-8 .htm .html .js .css

The first (AddDefaultCharset) sets the charset parameter for responses whose content type is text/plain or text/html.
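If the default should only apply to part of the site, the same directive can be placed inside a Directory section instead of at server level. This is a minimal sketch only; the path /var/www/html/notes is hypothetical and not taken from the question.

```apache
# Hypothetical directory holding the raw text files; adjust to the real path.
<Directory "/var/www/html/notes">
    # Adds "; charset=utf-8" to text/plain and text/html responses served
    # from this directory that do not already declare a charset.
    AddDefaultCharset utf-8
</Directory>
```

AddDefaultCharset is also accepted in a virtual host or in .htaccess; the .htaccess form needs AllowOverride FileInfo to be in effect for that directory.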

The second (AddCharset) requires mod_mime and sets the charset for other types based on file extension. JavaScript files are sent with a content type of application/javascript and CSS files with text/css, so they are not picked up by the AddDefaultCharset setting. The .htm and .html files don't really need to be listed, as they are covered by the default, but there is no harm in being explicit.
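For the raw text files in the question, the extension could be listed explicitly as well. The sketch below assumes those files end in .txt, which the question does not actually state, and wraps the directive so it is skipped if mod_mime is not loaded.

```apache
# Assumption: the raw text files end in .txt (the question does not say).
<IfModule mod_mime.c>
    # Map these extensions to UTF-8 so the charset is appended to their Content-Type.
    AddCharset utf-8 .txt .js .css
</IfModule>
```

After reloading Apache and forcing a refresh, a .txt file that is served as text/plain should then come back with Content-Type: text/plain; charset=utf-8.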