How to Write a Generalized String Reverse Function for All Localizations and String Types

algorithms localization strings unicode

I was just watching the Jon Skeet (with Tony the Pony) presentation from Dev-Days.

Although "write a string reverse function" is coding interview 101 – I'm not sure that it's actually possible to write a general string reverse function, certainly not one that works in all localisations and all string types.

Apart from detecting whether the input string is ASCII, UTF-8, UTF-16 (fixed or variable length), etc.,
there is the 'apply accent to the previous character' combining code point (U+0301) that Jon highlighted.
Then there are ligatures that may or may not be displayed, or that may be encoded as two characters.
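
To make the combining-accent problem concrete, here is a minimal Python sketch. The function names `naive_reverse` and `reverse_keeping_marks` are made up for illustration, and the second one is only a rough approximation: it keeps combining marks attached to the character before them, but it is not full grapheme-cluster segmentation, so emoji sequences, Hangul jamo, and similar cases would need more work.

```python
import unicodedata


def naive_reverse(text: str) -> str:
    """Reverse code point by code point; combining marks get detached."""
    return text[::-1]


def reverse_keeping_marks(text: str) -> str:
    """Reverse while keeping each base character with the marks that follow it."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks
        # such as U+0301 (COMBINING ACUTE ACCENT).
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch  # attach the mark to the preceding base character
        else:
            clusters.append(ch)  # start a new cluster
    return "".join(reversed(clusters))


word = "Les Mise\u0301rables"  # 'e' followed by U+0301
print(naive_reverse(word))          # the accent now follows the 'r'
print(reverse_keeping_marks(word))  # the accent stays on the 'e'
```

With the naive version the accent ends up on the "r" instead of the "e", which is the failure Jon demonstrated. Ligatures are not addressed by this sketch at all.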

Seems that "reverse a string" is actually one of the harder computer science tasks!

Best Answer

Yes. If we get a string, we can definitely reverse it character by character.

The problem, as Jon points out, is whether the reversal makes sense and whether it conforms to the rules of the language, culture, character set, and encoding. The water gets murkier the deeper you go.

If you are doing any type of string manipulation in C#, use the invariant culture when writing and reading strings; that way you can manipulate them safely. Otherwise, prepare for the Turkish support call.

ToUpper() looks so innocent, but it is an epic fail waiting to happen.
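
For anyone who has not hit the Turkish problem yet, here is a rough illustration in Python. Python's own `str.upper()` is not locale-sensitive, so the sketch hand-codes the Turkish dotted/dotless "i" rule to mimic what a culture-sensitive uppercase does under a Turkish culture. `upper_for_culture` and `is_file_scheme` are hypothetical helpers invented for this example, not real library functions.

```python
# Hypothetical emulation of culture-sensitive uppercasing, for illustration only.
# In Turkish, 'i' uppercases to U+0130 (capital I with dot above) and
# U+0131 (dotless i) uppercases to plain 'I'.
TURKISH_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})


def upper_for_culture(text: str, culture: str = "") -> str:
    """Uppercase text; culture 'tr' applies the Turkish i rule first.

    An empty culture stands in for the invariant culture.
    """
    if culture == "tr":
        text = text.translate(TURKISH_UPPER)
    return text.upper()


def is_file_scheme(scheme: str, culture: str = "") -> bool:
    """Typical 'innocent' check: uppercase, then compare to a constant."""
    return upper_for_culture(scheme, culture) == "FILE"


print(is_file_scheme("file"))        # invariant-style casing: matches
print(is_file_scheme("file", "tr"))  # Turkish casing: no longer matches
```

The same comparison that passes on the developer's machine fails on a machine set to Turkish, which is why the invariant culture is the safer default for strings that are not shown to the user.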