History of Why Bytes Are Eight Bits

Tags: bit, byte, hardware, history

What were the historical forces at work, and the tradeoffs made, in deciding to use groups of eight bits as the fundamental unit?

There were machines, once upon a time, using other word sizes, but today for non-eight-bitness you must look to museum pieces, specialized chips for embedded applications, and DSPs. How did the byte evolve out of the chaos and creativity of the early days of computer design?

I can imagine that too few bits would make it hard to handle enough data to make computing practical, while too many would have led to expensive hardware. Were other influences in play? Why did these forces balance out to eight bits?

(BTW, if I could time travel, I'd go back to when the "byte" was declared to be 8 bits, and convince everyone to make it 12 bits, bribing them with some early 21st Century trinkets.)

Best Answer

A lot of really early work was done with 5-bit Baudot codes, but those quickly became quite limiting (only 32 possible characters, so basically only upper-case letters and a few punctuation marks, but not enough "space" for digits).

From there, quite a few machines went to 6-bit characters. This was still pretty inadequate though -- if you wanted upper- and lower-case (English) letters and digits, that left only two more characters for punctuation, so most still had only one case of letters in a character set.

ASCII defined a 7-bit character set. That was "good enough" for a lot of uses for a long time, and has formed the basis of most newer character sets as well (ISO 646, ISO 8859, Unicode, ISO 10646, etc.)

Binary computers motivate designers to make sizes powers of two. Since the "standard" character set required 7 bits anyway, it wasn't much of a stretch to add one more bit to get a power of two (and by then, storage was becoming cheap enough that "wasting" a bit for most characters was more acceptable as well).

Since then, character sets have moved to 16 and 32 bits, but most mainstream computers are largely based on the original IBM PC. Then again, enough of the market is sufficiently satisfied with 8-bit characters that even if the PC hadn't come to its current level of dominance, I'm not sure everybody would do everything with larger characters anyway.

I should also add that the market has changed quite a bit. In the current market, the character size is defined less by the hardware than the software. Windows, Java, etc., moved to 16-bit characters long ago.

Now, the hindrance in supporting 16- or 32-bit characters is only minimally from the difficulties inherent in 16- or 32-bit characters themselves, and largely from the difficulty of supporting i18n in general. In ASCII (for example) detecting whether a letter is upper or lower case, or converting between the two, is incredibly trivial. In full Unicode/ISO 10646, it's basically indescribably complex (to the point that the standards don't even try -- they give tables, not descriptions). Then you add in the fact that for some languages/character sets, even the basic idea of upper/lower case doesn't apply. Then you add in the fact that even displaying characters in some of those is much more complex still.
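To make the ASCII half of that comparison concrete, here is a minimal C sketch (not part of the original answer, just an illustration): in ASCII, the upper- and lower-case letters differ only in a single bit (0x20), so detection is a range check and conversion is one bit operation. Full Unicode case mapping has no such formula; it depends on the tables the standard supplies.

```c
#include <stdio.h>

/* ASCII-only case handling: upper- and lower-case letters differ
 * only in bit 0x20, so detection is a range check and conversion
 * flips a single bit.  Nothing this simple exists for full Unicode. */

static int is_ascii_upper(char c) { return c >= 'A' && c <= 'Z'; }
static int is_ascii_lower(char c) { return c >= 'a' && c <= 'z'; }

static char ascii_to_lower(char c) { return is_ascii_upper(c) ? (char)(c | 0x20)  : c; }
static char ascii_to_upper(char c) { return is_ascii_lower(c) ? (char)(c & ~0x20) : c; }

int main(void)
{
    /* Prints "g G" */
    printf("%c %c\n", ascii_to_lower('G'), ascii_to_upper('g'));
    return 0;
}
```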

That's all sufficiently complex that the vast majority of software doesn't even try. The situation is slowly improving, but slowly is the operative word.
