ASCII vs UTF-8 – Advantages of Choosing ASCII Encoding

ascii, character encoding, utf-8

All characters in ASCII can be encoded using UTF-8 without an increase in storage (both require one byte per character).

UTF-8 has the added benefit of supporting characters beyond the ASCII range. If that's the case, why would we ever choose ASCII encoding over UTF-8?

Is there a use case where we would choose ASCII instead of UTF-8?
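The storage claim is easy to check. The following is a minimal sketch in Python (the sample string and variable names are only illustrative): ASCII-only text produces the same bytes under both encodings, while a non-ASCII character needs more than one byte in UTF-8.

```python
# Illustrative sample text; any ASCII-only string behaves the same way.
text = "Hello, ASCII!"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# For ASCII-only text the two encodings yield identical byte sequences,
# one byte per character.
same_bytes = ascii_bytes == utf8_bytes
byte_count = len(utf8_bytes)

# A character outside the ASCII range takes more than one byte in UTF-8.
non_ascii_len = len("é".encode("utf-8"))

print(same_bytes, byte_count, non_ascii_len)
```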

Best Answer

In some cases it can speed up access to individual characters. Consider the string str = 'ABC', encoded once in UTF-8 and once in ASCII (assuming the language, compiler, or database knows which encoding is in use).

To access the third character (C) of this string with the array-access operator found in many programming languages, you would write something like c = str[2].

If the string is ASCII encoded, all we need to do is fetch the third byte of the string.

If, however, the string is UTF-8 encoded, we must first check how many bytes the first character occupies (a UTF-8 character takes one to four bytes), then perform the same check on the second character, and only then can we access the third character. The longer the string and the further into it we index, the bigger the difference in performance.
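The scan described above can be sketched in Python, working on raw bytes. The helper names `utf8_seq_len`, `utf8_char_at`, and `ascii_char_at` are hypothetical and written for illustration only, and the sketch assumes the input is valid UTF-8. Real string implementations may avoid this cost in other ways, for example by storing text in a fixed-width form internally.

```python
def utf8_seq_len(lead: int) -> int:
    """Number of bytes in a UTF-8 sequence, judged from its first byte."""
    if lead < 0x80:            # 0xxxxxxx: one byte (the ASCII range)
        return 1
    if lead >> 5 == 0b110:     # 110xxxxx: two bytes
        return 2
    if lead >> 4 == 0b1110:    # 1110xxxx: three bytes
        return 3
    if lead >> 3 == 0b11110:   # 11110xxx: four bytes
        return 4
    raise ValueError("not a valid UTF-8 lead byte")


def utf8_char_at(data: bytes, index: int) -> str:
    """Find the character at `index` by walking from the start: O(index)."""
    pos = 0
    for _ in range(index):
        pos += utf8_seq_len(data[pos])   # skip one whole character
    end = pos + utf8_seq_len(data[pos])
    return data[pos:end].decode("utf-8")


def ascii_char_at(data: bytes, index: int) -> str:
    """In ASCII, character i is simply byte i: O(1)."""
    return chr(data[index])


print(ascii_char_at("ABC".encode("ascii"), 2))
print(utf8_char_at("ABC".encode("utf-8"), 2))
print(utf8_char_at("a€b😀c".encode("utf-8"), 3))
```

Both lookups return the same character for 'ABC'; the difference is that the UTF-8 version has to inspect every preceding character before it knows where the requested one starts.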

This is an issue in some database engines, for example: to find the beginning of a column stored after a UTF-8 encoded VARCHAR, the database needs to know not only how many characters the VARCHAR field holds, but also how many bytes each of them uses.
