Is this a proper Content-Type header?

apache-2.2 http-headers

I have a pretty good understanding of the Content-Type header for most cases. I understand that for the following four examples, you would normally follow the MIME-type with charset=your-charset-here.

Content-Type "text/plain; charset=utf-8"
Content-Type "text/html; charset=utf-8"
Content-Type "text/javascript; charset=utf-8"
Content-Type "text/xml; charset=utf-8"

… and with images, no charset:

Content-Type "image/gif"
Content-Type "image/x-icon"
etc.

But what about these two? Should they or shouldn't they include the charset?

Content-Type "application/x-javascript"
Content-Type "application/xml"

I realize it's okay if they don't include the charset, but I would like to include it, if it's possible. They are just text-based files, after all.

Best Answer

Content-Type "text/xml; charset=utf-8"

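This is redundant. XML already carries its encoding inline in the <?xml?> declaration, and if the XML declaration is omitted the parser assumes UTF-8 anyway. Note that the RFCs treat a charset parameter in the Content-Type header as authoritative when it is present, so a header that disagrees with the declaration can cause the document to be misread.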

I would normally leave the charset out for XML. Given that XML has its own perfectly good inline character encoding mechanism, the charset parameter is unneeded and can only get in the way: a server-wide default can accidentally declare the wrong encoding for files that specify none and are treated as UTF-8 everywhere else.

The one time you do need a charset parameter for XML is when you're serving a non-ASCII-compatible character set, usually UTF-16, where otherwise the parser wouldn't get as far as reading <?xml. But it's pretty rare you'd ever want to do that. UTF-16 isn't a great file storage/over-the-wire format.
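To see why a non-ASCII-compatible encoding is the awkward case, here is a small Python sketch. It encodes the same declaration as UTF-8 and as UTF-16 and checks whether the bytes begin with the ASCII sequence `<?xml`. The sample document and the helper name `starts_with_ascii_xml_decl` are made up for illustration. A real XML parser does more than this simple prefix check.

```python
# Hypothetical sample document; only the encoding name varies.
DOC = '<?xml version="1.0" encoding="{enc}"?><r/>'

utf8_bytes = DOC.format(enc="UTF-8").encode("utf-8")
# Python's "utf-16" codec prepends a byte order mark and then uses
# two bytes per character for this text.
utf16_bytes = DOC.format(enc="UTF-16").encode("utf-16")


def starts_with_ascii_xml_decl(data: bytes) -> bool:
    """Return True if the raw bytes begin with the ASCII text '<?xml'."""
    return data.startswith(b"<?xml")


if __name__ == "__main__":
    print(starts_with_ascii_xml_decl(utf8_bytes))
    print(starts_with_ascii_xml_decl(utf16_bytes))
    print(utf16_bytes[:8])  # byte order mark followed by 2-byte characters
```

A consumer that only looks for an ASCII `<?xml` prefix finds it in the UTF-8 bytes but not in the UTF-16 bytes, which is the situation where an explicit charset parameter helps.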

Content-Type "application/xml"

The application/xml media type is specified by RFC 3023, which explicitly defines a charset parameter for it. So you can use charset if you want (though as per the above, I generally don't want to).
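If you want to check what a header value actually declares, a minimal Python sketch along these lines splits it into the media type and the optional charset parameter. It relies on the standard library's `email.message.Message`. The helper name `split_content_type` is an assumption for this example, not something from Apache or the RFCs.

```python
from email.message import Message
from typing import Optional, Tuple


def split_content_type(value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_type() lowercases the type/subtype pair;
    # get_content_charset() returns the lowercased charset, or None.
    return msg.get_content_type(), msg.get_content_charset()


if __name__ == "__main__":
    for header in (
        "application/xml; charset=utf-8",
        "application/xml",
        "image/gif",
    ):
        print(header, "->", split_content_type(header))
```

A missing charset shows up as `None`, which makes it easy to tell "no charset sent" apart from "charset sent but wrong".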

Content-Type "application/x-javascript"

This is an unofficial type, so there is no specification to say whether a charset parameter exists or what it might do. This type should probably be avoided in favour of text/javascript (traditional) or application/javascript (defined by RFC 4329).

In practice, setting a charset in the Content-Type header of your JavaScript resources isn't necessarily enough, as IE completely ignores it.

Summary of the precedence (highest to lowest) given to scripting character set mechanisms:

  • IE: <script charset> attribute, charset of parent page

  • Opera: charset of script file, charset of parent page

  • Mozilla, Webkit: charset of script file, <script charset> attribute, charset of parent page
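The precedence list above can be modelled as a small lookup, sketched here in Python. The table contents come straight from the list. The names `PRECEDENCE` and `effective_script_charset` are hypothetical, and "charset of script file" is assumed to mean the charset the script is served with.

```python
from typing import Optional

# Highest to lowest precedence per browser, as listed above.
# "file" = charset of the script file, "attr" = <script charset> attribute,
# "page" = charset of the parent page.
PRECEDENCE = {
    "ie": ("attr", "page"),
    "opera": ("file", "page"),
    "mozilla": ("file", "attr", "page"),
    "webkit": ("file", "attr", "page"),
}


def effective_script_charset(
    browser: str,
    *,
    file_charset: Optional[str] = None,
    attr_charset: Optional[str] = None,
    page_charset: Optional[str] = None,
) -> Optional[str]:
    """Return the first charset found by walking the browser's precedence."""
    sources = {"file": file_charset, "attr": attr_charset, "page": page_charset}
    for name in PRECEDENCE[browser]:
        if sources[name]:
            return sources[name]
    return None


if __name__ == "__main__":
    # Script served as UTF-8, no <script charset>, parent page in ISO-8859-1.
    for browser in PRECEDENCE:
        print(browser, effective_script_charset(
            browser, file_charset="utf-8", page_charset="iso-8859-1"))
```

In that scenario the model makes IE fall back to the parent page's charset while the other browsers use the script's own, which is why the header alone isn't enough.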