Electronic – About Code Density and its definition


I have a conceptual question: what does "high" code density mean, and why is it so important?

Best Answer

Code density refers loosely to how many microprocessor instructions it takes to perform a requested action, and how much space each instruction takes up. Generally speaking, the less space an instruction takes and the more work per instruction that a microprocessor can do, the more dense its code is.

I notice that you've tagged your question with the 'arm' tag; I can illustrate code density using ARM instructions.

Let's say you want to copy a block of data from one place in memory to another. Conceptually, your high level code would look something like this:

void memcpy(void *dest, void *source, int count_bytes)
{
    char *s, *d;

    s = source; d = dest;
    while(count_bytes--) { *d++ = *s++; }
}

Now a simple compiler for a simple microprocessor may convert this to something like the following:

mov r0, count_bytes
mov r1, s
mov r2, d
loop: ldrb r3, [r1]
strb r3, [r2]
mov r3, #1
add r1, r1, r3
add r2, r2, r3
sub r0, r0, r3
cmp r0, #0
bne loop

(my ARM is a little rusty, but you get the idea)

Now this would be a very simple compiler and a very simple microprocessor, but you can see from the example that we're looking at 8 instructions per iteration of the loop (7 if we move the '1' to another register and move the load outside the loop). That's not really dense at all. Code density also affects performance; if your loops are longer because the code is not dense, you might need more instruction cache to hold the loop. More cache means a more expensive processor, but then again complex instruction decoding means more transistors to decipher the requested instruction, so it's a classic engineering problem.
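With the same caveat that this is sketch syntax rather than exact ARM assembly, the seven-instruction variant mentioned above — the constant hoisted into its own register before the loop — might look like:

```asm
mov  r0, count_bytes
mov  r1, s
mov  r2, d
mov  r4, #1          @ hoisted: load the constant once, outside the loop
loop: ldrb r3, [r1]
strb r3, [r2]
add  r1, r1, r4
add  r2, r2, r4
sub  r0, r0, r4
cmp  r0, #0
bne  loop
```

The loop body is now ldrb, strb, two adds, sub, cmp, bne — seven instructions — but it took an extra register to get there.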

ARM's pretty nice in this respect. Every instruction can be conditional, loads and stores can automatically increment or decrement the address register they use, and most instructions can optionally update the processor flags. On ARM, and with a moderately useful compiler, the same loop may look something like this:

mov r0, count_bytes
mov r1, s
mov r2, d
loop: ldrb r3, [r1], #1
strb r3, [r2], #1
subs r0, r0, #1
bne loop

As you can see, the main loop is now 4 instructions. The code is more dense because each instruction in the main loop does more. This generally means that you can do more with a given amount of memory, because less of it is used to describe how to perform the work.

Native ARM code often drew the complaint that it wasn't especially dense, for two main reasons. First, 32 bits is an awfully "long" instruction, so a lot of bits are wasted on simpler operations. Second, every instruction is exactly 32 bits long, without exception, which means most 32-bit literal values cannot be loaded into a register in a single instruction. If I wanted to load "0x12345678" into r0, how would I encode an instruction that not only contains 0x12345678 but also describes "load literal to r0"? There are no bits left over to encode the actual operation. The ARM load-literal pseudo-instruction is an interesting little beast, and an ARM assembler must be a little smarter than most: it has to "catch" these loads, store the value itself in the object file, and emit an indirect load of that address into the requested register.
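To make that concrete, here is roughly what the assembler does with such a load — a sketch in GNU-style ARM syntax, with the pool label invented for illustration:

```asm
    ldr r0, =0x12345678     @ what you write: a pseudo-instruction

    @ what the assembler actually emits:
    ldr r0, .Lpool          @ a PC-relative load from a nearby "literal pool"
    @ ...
.Lpool:
    .word 0x12345678        @ the 32-bit constant, stored alongside the code
```

So loading an arbitrary constant costs both an instruction and a word of data placed close enough to the code for the PC-relative load to reach it.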

Anyway, to answer these complaints, ARM came up with Thumb mode. Instead of 32 bits per instruction, almost all instructions are now 16 bits long (the long branch-and-link is encoded as a pair of 16-bit halves). There were a few sacrifices in Thumb mode, but by and large they were easy to make, because shortening the instructions alone got you something like a 40% improvement in code density.
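As a sketch of what one of those sacrifices looks like (assuming the original 16-bit Thumb instruction set, which dropped the post-incrementing addressing modes), the copy loop grows back to six instructions because the pointer bumps become explicit — yet it still shrinks in bytes, roughly 6 × 2 = 12 versus the ARM loop's 4 × 4 = 16:

```asm
loop: ldrb r3, [r1]     @ no post-increment addressing in Thumb...
    adds r1, r1, #1     @ ...so each pointer is advanced with its own add
    strb r3, [r2]
    adds r2, r2, #1
    subs r0, r0, #1     @ data ops update the flags, so no separate cmp
    bne  loop
```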