I'm porting an ISAPI (page producers) application from Delphi 7 to Delphi 2009. The pages are based on HTML files encoded in UTF-8.
Everything works except when OnHTMLTag fires and I replace a transparent tag with a value that contains special characters such as accented letters (áé…). Those characters are replaced in the output with a � character.
What's wrong?
Best Answer
As part of your debugging procedure, you should go find out exactly what byte value(s) the browser receives for the question-mark character.
As you should know, Delphi 2009's string type is Unicode, whereas all previous versions were ANSI. Delphi 7 introduced the Utf8String type, but Delphi 2009 made that type special. If you're not using that type for holding strings that are encoded as UTF-8, then you should start doing so. Values held in Utf8String variables will be converted to UnicodeString values automatically when you assign one to the other.

If you're storing your UTF-8-encoded strings in ordinary AnsiString variables, then they will be converted to Unicode using the default system code page if you assign them to a UnicodeString. That's not what you want.

If you're assigning UTF-8-encoded literals to variables of type string, stop that. That type expects its values to be encoded as UTF-16, just like WideString always has.

If you are loading your files into a
TStrings descendant with LoadFromFile, then you need to start using that method's second parameter, which tells it what encoding to use. UTF-8-encoded files should use TEncoding.UTF8. Without that parameter, the method looks for a byte-order mark and, if it finds none, falls back to TEncoding.Default, which is the system ANSI code page.
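
To make the advice above concrete, here is a minimal sketch, assuming Delphi 2009 or later. LoadUtf8File and the variable names are hypothetical and do not come from your code; the sketch only shows the file being loaded with an explicit encoding and the conversions between string and Utf8String.

```delphi
program Utf8Sketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper (not from the question): reads a UTF-8 file into a
// UnicodeString by naming the encoding explicitly.
function LoadUtf8File(const FileName: string): string;
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    Lines.LoadFromFile(FileName, TEncoding.UTF8);
    Result := Lines.Text;
  finally
    Lines.Free;
  end;
end;

var
  Wide: string;      // UnicodeString (UTF-16) in Delphi 2009
  Utf8: UTF8String;  // byte string tagged as UTF-8
begin
  Wide := #$00E1#$00E9;      // the two accented letters, written as code points
  Utf8 := UTF8String(Wide);  // converted from UTF-16 to UTF-8
  Writeln(Length(Wide));     // counts UTF-16 code units
  Writeln(Length(Utf8));     // counts bytes
  Wide := string(Utf8);      // converted back to UTF-16 without code-page guessing
end.
```

If the replacement text you build in OnHTMLTag comes from a file or a byte buffer, something like LoadUtf8File, or a Utf8String variable for the intermediate value, keeps the accented characters intact until they reach the string parameter.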