LEA or ADD instruction

assembly, x86

When I'm handwriting assembly, I generally choose the form

lea eax, [eax+4]

over the form:

add eax, 4

I have heard that lea is a "0-clock" instruction (like NOP), while add isn't. However, when I look at compiler-produced assembly, I often see the latter form used instead of the former. I'm smart enough to trust the compiler, so can anyone shed some light on which one is better? Which one is faster? Why is the compiler choosing the latter form over the former?

Best Answer

One significant difference between LEA and ADD on x86 CPUs is the execution unit that actually performs the instruction. Modern x86 CPUs are superscalar and have multiple execution units that operate in parallel, with the pipeline feeding them somewhat like round-robin (barring stalls). On CPUs that implement it this way, LEA is processed by (one of) the unit(s) dealing with addressing, which happens at an early stage in the pipeline, while ADD goes to the ALU(s) (arithmetic/logic units) late in the pipeline. That means such a superscalar x86 CPU can concurrently execute a LEA and an arithmetic/logical instruction.

The fact that LEA goes through the address generation logic instead of the arithmetic units is also the reason why it used to be called "zero-clock": it appears to take no time to execute because address generation has already happened by the time it would be (or is) executed.

It's not free, since address generation is a step in the execution pipeline, but it's got no execution overhead. And it doesn't occupy a slot in the ALU pipeline(s).

Edit: To clarify, LEA is not free. Even on CPUs that do not implement it via the arithmetic unit it takes time to execute due to instruction decode / dispatch / retire and/or other pipeline stages that all instructions go through. The time taken to do LEA just occurs in a different stage of the pipeline for CPUs that implement it via address generation.

Related Solutions

What’s the purpose of the LEA instruction

As others have pointed out, LEA (load effective address) is often used as a "trick" to do certain computations, but that's not its primary purpose. The x86 instruction set was designed to support high-level languages like Pascal and C, where arrays—especially arrays of ints or small structs—are common. Consider, for example, a struct representing (x, y) coordinates:

struct Point
{
     int xcoord;
     int ycoord;
};

Now imagine a statement like:

int y = points[i].ycoord;

where points[] is an array of Point. Assuming the base of the array is already in EBX, and variable i is in EAX, and xcoord and ycoord are each 32 bits (so ycoord is at offset 4 bytes in the struct), this statement can be compiled to:

MOV EDX, [EBX + 8*EAX + 4]    ; right side is "effective address"

which will land y in EDX. The scale factor of 8 is because each Point is 8 bytes in size. Now consider the same expression used with the "address of" operator &:

int *p = &points[i].ycoord;

In this case, you don't want the value of ycoord, but its address. That's where LEA (load effective address) comes in. Instead of a MOV, the compiler can generate

LEA ESI, [EBX + 8*EAX + 4]

which will load the address in ESI.
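To make the address arithmetic concrete, here is a minimal C sketch of the same idea. It assumes int is 32 bits with no padding, so that sizeof(struct Point) is 8 and ycoord sits at offset 4, as in the answer. The helper names ycoord_address and ycoord_demo are made up for illustration; they compute base + 8*i + 4 by hand and compare the result with what the & operator gives.

```c
#include <assert.h>
#include <stddef.h>

struct Point
{
    int xcoord;
    int ycoord;
};

/* Hypothetical helper: computes base + sizeof(Point)*i + offset of ycoord
   by hand, mirroring the effective address [EBX + 8*EAX + 4]
   (assuming 32-bit int, so the scale is 8 and the offset is 4). */
static int *ycoord_address(struct Point *points, size_t i)
{
    unsigned char *base = (unsigned char *)points;
    return (int *)(base + sizeof(struct Point) * i
                        + offsetof(struct Point, ycoord));
}

/* Hypothetical demo: the hand-computed address matches &points[i].ycoord,
   and dereferencing it yields the value a MOV from that address would load. */
static int ycoord_demo(void)
{
    struct Point points[3] = { {1, 10}, {2, 20}, {3, 30} };
    int *p = ycoord_address(points, 2);   /* the address, as LEA would give */
    assert(p == &points[2].ycoord);
    return *p;                            /* the value, as MOV would give */
}
```

Taking the address corresponds to the LEA form, and dereferencing it corresponds to the MOV form. Which instructions a compiler actually emits depends on the compiler and its settings.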

The difference between MOV and LEA

LEA means Load Effective Address
MOV means Move (here: load the value)

In short, LEA loads a pointer to the item you're addressing whereas MOV loads the actual value at that address.

The purpose of LEA is to allow one to perform a non-trivial address calculation and store the result for later use:

LEA ax, [BP+SI+5] ; Compute address of value

MOV ax, [BP+SI+5] ; Load value at that address

Where there are just constants involved, MOV (through the assembler's constant calculations) can sometimes appear to overlap with the simplest cases of LEA usage. LEA is useful if you have a multi-part calculation with multiple base addresses, etc.
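As a rough illustration of such a multi-part calculation, the C expression below has the shape base + index*scale + displacement, which is what one effective-address expression can encode (something like lea eax, [edi + esi*4 + 8]). The function names are hypothetical, and whether a given compiler actually emits a single LEA for this is an assumption, not a guarantee.

```c
#include <assert.h>

/* Hypothetical example: base + index*4 + 8 fits the
   base + index*scale + displacement form of an x86 effective address,
   so a compiler may (but need not) compute it with one LEA. */
int scaled_sum(int base, int index)
{
    return base + index * 4 + 8;
}

/* Small self-check of the arithmetic: 10 + 3*4 + 8 = 30. */
void scaled_sum_check(void)
{
    assert(scaled_sum(10, 3) == 30);
}
```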
