C++ – Conveniently Declaring Compile-Time Strings in C++

cc++11metaprogrammingstringuser-defined-literals

Being able to create and manipulate strings during compile-time in C++ has several useful applications. Although it is possible to create compile-time strings in C++, the process is very cumbersome, as the string needs to be declared as a variadic sequence of characters, e.g.

using str = sequence<'H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd', '!'>;

Operations such as string concatenation, substring extraction, and many others, can easily be implemented as operations on sequences of characters. Is it possible to declare compile-time strings more conveniently? If not, is there a proposal in the works that would allow for convenient declaration of compile-time strings?

Why Existing Approaches Fail

Ideally, we would like to be able to declare compile-time strings as follows:

// Approach 1
using str1 = sequence<"Hello, world!">;

or, using user-defined literals,

// Approach 2
constexpr auto str2 = "Hello, world!"_s;

where decltype(str2) would have a constexpr constructor. A messier version of approach 1 is possible to implement, taking advantage of the fact that you can do the following:

template <unsigned Size, const char Array[Size]>
struct foo;

However, the array would need to have external linkage, so to get approach 1 to work, we would have to write something like this:

/* Implementation of array to sequence goes here. */

constexpr const char str[] = "Hello, world!";

int main()
{
    using s = string<13, str>;
    return 0;
}

Needless to say, this is very inconvenient. Approach 2 is actually not possible to implement. If we were to declare a (constexpr) literal operator, then how would we specify the return type? Since we need the operator to return a variadic sequence of characters, so we would need to use the const char* parameter to specify the return type:

constexpr auto
operator"" _s(const char* s, size_t n) -> /* Some metafunction using `s` */

This results in a compile error, because s is not a constexpr. Trying to work around this by doing the following does not help much.

template <char... Ts>
constexpr sequence<Ts...> operator"" _s() { return {}; }

The standard dictates that this specific literal operator form is reserved for integer and floating-point types. While 123_s would work, abc_s would not. What if we ditch user-defined literals altogether, and just use a regular constexpr function?

template <unsigned Size>
constexpr auto
string(const char (&array)[Size]) -> /* Some metafunction using `array` */

As before, we run into the problem that the array, now a parameter to the constexpr function, is itself no longer a constexpr type.

I believe it should be possible to define a C preprocessor macro that takes a string and the size of the string as arguments, and returns a sequence consisting of the characters in the string (using BOOST_PP_FOR, stringification, array subscripts, and the like). However, I do not have the time (or enough interest) to implement such a macro =)

Best Answer

I haven't seen anything to match the elegance of Scott Schurr's str_const presented at C++ Now 2012. It does require constexpr though.

Here's how you can use it, and what it can do:

int
main()
{
    constexpr str_const my_string = "Hello, world!";
    static_assert(my_string.size() == 13, "");
    static_assert(my_string[4] == 'o', "");
    constexpr str_const my_other_string = my_string;
    static_assert(my_string == my_other_string, "");
    constexpr str_const world(my_string, 7, 5);
    static_assert(world == "world", "");
//  constexpr char x = world[5]; // Does not compile because index is out of range!
}

It doesn't get much cooler than compile-time range checking!

Both the use, and the implementation, is free of macros. And there is no artificial limit on string size. I'd post the implementation here, but I'm respecting Scott's implicit copyright. The implementation is on a single slide of his presentation linked to above.