How to generate “language-safe” UUIDs


I've always wanted to use randomly generated strings as IDs for my resources, so that I could have shorter URLs like this: /user/4jz0k1

But I never did, because I was worried that random string generation would produce actual words, e.g. /user/f*cker. That causes two problems: it might confuse or even offend users, and it could hurt SEO.

Then I thought all I had to do was set up a fixed pattern, such as adding a number every two letters. I was very happy with my 'generate_safe_uuid' method, but then I realized it was only better for SEO and worse for users: digits read as look-alike letters, so the pattern actually increased the share of IDs that look like words, e.g. /user/g4yd1ck5
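
For illustration, a fixed pattern like the one described above could look roughly like this sketch. The class name `PatternedId`, the two alphabets, and the "digit in every third position" rule are assumptions, not the original 'generate_safe_uuid' code.

```csharp
using System;
using System.Text;

public static class PatternedId
{
    private const string Letters = "abcdefghijklmnopqrstuvwxyz";
    private const string Digits = "0123456789";

    // Hypothetical helper: every third character is a digit, the rest are letters.
    public static string Generate(int length, Random rng)
    {
        var sb = new StringBuilder(length);
        for (int i = 0; i < length; i++)
        {
            // Positions 2, 5, 8, ... draw from the digits; all others from the letters.
            string alphabet = (i % 3 == 2) ? Digits : Letters;
            sb.Append(alphabet[rng.Next(alphabet.Length)]);
        }
        return sb.ToString();
    }
}
```

Calling `PatternedId.Generate(8, new Random())` yields IDs shaped letter, letter, digit, and so on. Because digits such as 4, 1 and 5 resemble "a", "i" and "s", a pattern like this does not stop readable words from appearing.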

Now I'm thinking I could create a method 'replace_numbers_with_letters' and check the result against a dictionary or something, to make sure it hasn't formed any words.
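
A rough sketch of that idea: map each digit to the letter it most resembles, then look for blocked words anywhere in the result. The class name `WordCheck`, the digit-to-letter map, and the tiny placeholder blocklist are all assumptions for illustration; a real list would be much longer and specific to each language.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class WordCheck
{
    // Assumed look-alike map (leetspeak style); extend as needed.
    private static readonly Dictionary<char, char> LookAlikes = new Dictionary<char, char>
    {
        { '0', 'o' }, { '1', 'i' }, { '3', 'e' }, { '4', 'a' }, { '5', 's' }, { '7', 't' }
    };

    // Placeholder blocklist; a real one would be loaded from a word list.
    private static readonly string[] Blocked = { "bad", "evil" };

    // Lowercases the ID and swaps each mapped digit for its look-alike letter.
    public static string ReplaceNumbersWithLetters(string id)
    {
        var sb = new StringBuilder(id.Length);
        foreach (char c in id.ToLowerInvariant())
        {
            char mapped;
            sb.Append(LookAlikes.TryGetValue(c, out mapped) ? mapped : c);
        }
        return sb.ToString();
    }

    // True when no blocked word appears as a substring of the normalized ID.
    public static bool LooksSafe(string id)
    {
        string normalized = ReplaceNumbersWithLetters(id);
        return !Blocked.Any(word => normalized.Contains(word));
    }
}
```

With this placeholder list, `WordCheck.LooksSafe("b4dx91")` returns false because the ID normalizes to "badx9i", while `WordCheck.LooksSafe("4jz0k1")` returns true. A generator would simply retry until the check passes. Substring matching will also reject some harmless IDs, which is probably an acceptable cost for random strings.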

Any other ideas?

PS: As I write this, I've also realized that checking for words in more than one language (e.g. English, French, Spanish, etc.) would be a mess, and I'm starting to love numbers-only IDs again.


A couple of tips that will lower the chances of inadvertently creating meaningful words:

  • Add some non-alpha, non-numerical characters to the mix, such as "-", "!" or "_".
  • Compose your UUIDs by accumulating sequences of characters (rather than single characters) that are unlikely to occur in real words, such as "zx" or "aa".

This is some C# sample code (using .NET 4):

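// Requires: using System; using System.Collections.Generic; using System.Text;
private string MakeRandomString()
{
    // Building blocks for the ID. The entries shown are only illustrative.
    var bits = new List<string>
    {
        "a", "b", "c", //keep going with letters.
        "0", "1", "2", //keep going with numbers.
        "-", "!", "_", //add some more non-alpha, non-numeric characters.
        "zx", "aa"     //add some more odd combinations to the mix.
    };

    StringBuilder sb = new StringBuilder();
    Random r = new Random();
    // Append 8 randomly chosen pieces; two-character pieces make the result longer than 8.
    for (int i = 0; i < 8; i++)
    {
        sb.Append(bits[r.Next(bits.Count)]);
    }

    return sb.ToString();
}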

This doesn't guarantee that you won't offend anyone, but I agree with @DeadMG that you cannot aim so high.

