C++ Strings – Compiling for String and Wstring

I'm creating a library. I want to use it in multiple projects, which may use either multi-byte or Unicode strings (std::string or std::wstring). I've adopted the old MS method of conditional compilation:

#include <sstream>
#include <string>

namespace my_namespace {
#ifdef UNICODE
    typedef std::wstring String;
    typedef std::wstringstream StringStream;
    #define Str(s) L##s
#else
    typedef std::string String;
    typedef std::stringstream StringStream;
    #define Str(s) s
#endif
}

(The Str macro is for string literals. VC++ marks wide strings with L. Example: L"this is a wide string";)

Are there better ways to accomplish this?

Best Answer

The old Microsoft technique

The good old Microsoft technique has served millions of applications, so it is definitely a valuable and proven approach.

Three remarks:

  • Microsoft uses this conditional compilation not only for the few core elements (TCHAR, TEXT, ...), but also for a lot of other string-related functions (see the example in the MSDN article), so that everything works consistently.

  • You have to be careful about combining macros with namespaces. For example, Str() looks like a normal function, but it is a macro: it is defined globally, is not limited to your namespace, and must be used without a namespace prefix. I'd suggest using capitals to make this explicit.

  • If you are starting a new code base now, I'd suggest adopting Meyers' recommendation to prefer alias declarations (using) over typedef.
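To illustrate the last two remarks, here is a minimal sketch. The names MY_STR, other and greeting are made up for the example. It shows that a macro defined inside a namespace block is still visible outside it, and uses alias declarations instead of typedef:

```cpp
#include <cassert>
#include <string>

namespace my_namespace {
#ifdef UNICODE
    using String = std::wstring;      // alias declaration instead of typedef
    #define MY_STR(s) L##s            // hypothetical upper-case name for the literal macro
#else
    using String = std::string;
    #define MY_STR(s) s
#endif
}

namespace other {
    // The type needs the my_namespace:: prefix, the macro does not:
    // the preprocessor knows nothing about namespaces.
    inline my_namespace::String greeting() {
        return MY_STR("hello");
    }
}
```

Writing my_namespace::MY_STR("hello") would not compile, because the macro is expanded before the compiler looks at namespaces. That is why an upper-case, clearly prefixed macro name is the safer choice.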

Less redundant variant

Since std::string/std::wstring, std::stringstream/std::wstringstream, etc. are just the char/wchar_t specializations of basic_string<X>/basic_stringstream<X>, I'd define the types once, based on the underlying character type that you want:

#include <sstream>
#include <string>

namespace mine {
#ifdef UNICODE
    using Char = wchar_t; 
    #define Str(s) L##s
#else
    using Char = char; 
    #define Str(s) s
#endif
    using String = std::basic_string<Char>;
    using StringStream = std::basic_stringstream<Char>;
    // ...  a lot more but only once
}
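As a usage sketch, code written against these aliases compiles unchanged in both configurations. The helper labelled below is hypothetical and only there for illustration. The definitions from above are repeated so that the snippet stands on its own:

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace mine {
#ifdef UNICODE
    using Char = wchar_t;
    #define Str(s) L##s
#else
    using Char = char;
    #define Str(s) s
#endif
    using String = std::basic_string<Char>;
    using StringStream = std::basic_stringstream<Char>;

    // Hypothetical helper: builds "<label>=<value>" in whichever
    // character type the build selected.
    inline String labelled(const String& label, int value) {
        StringStream out;
        out << label << Str("=") << value;
        return out.str();
    }
}
```

A call such as mine::labelled(Str("answer"), 42) yields a std::string in a multi-byte build and a std::wstring in a UNICODE build, without touching the calling code.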

If needed, you could then easily switch Char to char32_t (making String equivalent to std::u32string) if you wanted to work with full 32-bit Unicode across all platforms. Currently wchar_t on Windows is 16 bits and uses UTF-16 encoding, whereas on Linux it is 32 bits and uses UTF-32.
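The practical difference between 16-bit and 32-bit code units can be sketched as follows. The function names are invented for the example, and std::u16string stands in for a 16-bit wchar_t so that the result does not depend on the platform:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// U+1F600 lies outside the Basic Multilingual Plane, so UTF-16 needs
// a surrogate pair for it while UTF-32 needs a single code unit.
inline std::size_t utf16_units() {
    return std::u16string(u"\U0001F600").size();
}

inline std::size_t utf32_units() {
    return std::u32string(U"\U0001F600").size();
}
```

Here utf16_units() returns 2 and utf32_units() returns 1. With a 32-bit character type, size() and indexing work per code point, which is not the case for UTF-16.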

Conditional compilation

In theory, you could decide at runtime whether to use Unicode or not. But to achieve this, you'd need to create all objects through an abstract factory, which seems very painful and complex, not to mention the code bloat of having every string function twice.

Another approach could be to use templates to define the types at compile time. But ultimately you'd still need to rely on some macro, defined in your build scripts, to automate building all the versions. Since you'd rely on macros in the end anyway, why not simplify the approach and use them for what they are meant to do?
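For comparison, a template-based variant could look roughly like the sketch below. All names (pick, MINE_LIT, bracket) are hypothetical. Note that it still needs a macro for the literals and a build-level switch for the default character type:

```cpp
#include <cassert>
#include <string>

namespace mine {
    // Overloads selecting the literal that matches the character type.
    inline const char* pick(char, const char* narrow, const wchar_t*) { return narrow; }
    inline const wchar_t* pick(wchar_t, const char*, const wchar_t* wide) { return wide; }

    // A macro is still required to spell the literal in both forms.
    #define MINE_LIT(CharT, s) ::mine::pick(CharT{}, s, L##s)

    // Written once, instantiated for whichever character type is used.
    template <typename CharT>
    std::basic_string<CharT> bracket(const std::basic_string<CharT>& text) {
        return MINE_LIT(CharT, "[") + text + MINE_LIT(CharT, "]");
    }

    // The build configuration still picks the library-wide default.
#ifdef UNICODE
    using Char = wchar_t;
#else
    using Char = char;
#endif
    using String = std::basic_string<Char>;
}
```

Both mine::bracket(std::string("x")) and mine::bracket(std::wstring(L"x")) compile from the same source. The price is that every literal has to go through the macro with an explicit character type, which is arguably no simpler than the plain conditional compilation.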