To answer those numbered questions:
1) subl $24,%esp
means esp = esp - 24
GNU AS uses AT&T syntax, which is the opposite of Intel syntax. AT&T has the destination on the right, Intel has the destination on the left. Also AT&T is explicit about the size of the arguments. Intel tries to deduce it or forces you to be explicit.
The stack grows down in memory. The memory at and above esp holds the stack contents; addresses lower than esp are unused stack space. esp points to the last thing pushed onto the stack.
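If it helps, here is a minimal C sketch of that idea. It is only a model: the Machine struct, the 64-byte array and the helper names are made up for illustration, with a plain integer standing in for esp.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a small byte array stands in for stack memory and
   a plain integer stands in for %esp (as an offset into that array). */
enum { STACK_SIZE = 64 };

typedef struct {
    uint8_t  mem[STACK_SIZE];
    uint32_t esp;              /* offset of the most recently pushed item */
} Machine;

static void machine_init(Machine *m) {
    memset(m->mem, 0, sizeof m->mem);
    m->esp = STACK_SIZE;       /* empty stack: esp sits just past the high end */
}

/* pushl: move esp down by 4, then store the value at the new esp. */
static void push32(Machine *m, uint32_t value) {
    assert(m->esp >= 4);
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &value, 4);
}

/* subl $n,%esp: reserve n bytes below the current top without writing. */
static void reserve(Machine *m, uint32_t n) {
    assert(m->esp >= n);
    m->esp -= n;
}

/* Read the 32-bit value esp currently points at. */
static uint32_t top32(const Machine *m) {
    uint32_t v;
    memcpy(&v, &m->mem[m->esp], 4);
    return v;
}
```

After machine_init, a push32 moves esp from 64 to 60, and reserve(&m, 24) then moves it to 36, which is all that subl $24,%esp does.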
2) x86 instruction encoding mostly allows the following:
movl rm,r ' move a value from a register or memory to a register
movl r,rm ' move a value from a register to a register or memory
movl imm,rm ' move an immediate value to a register or memory
There is no memory-to-memory instruction format. (Strictly speaking, you can do memory-to-memory operations with movs, or with push mem followed by pop mem, but neither takes two memory operands in the same instruction.)
"Immediate" means the value is encoded right into the instruction. For example, to store 15 at the address in ebx:
movl $15,(%ebx)
15 is an "immediate" value.
The parentheses make it use the register as a pointer to memory.
3) movl 8(%ebp),%eax
means,
- take the value of ebp
- add 8 to it (does not modify ebp though),
- use it as an address (the parentheses),
- read the 32-bit value from that address,
- and store the value in eax
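The steps above can be sketched in C. This is a model under assumptions, not real register access: load32 is a hypothetical helper, memory is a byte array, and ebp is just an offset into it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of "movl disp(%base),%dst": memory is a byte array
   and the base register is a plain integer holding an offset into it. */
static uint32_t load32(const uint8_t *mem, size_t mem_size,
                       uint32_t base, int32_t disp) {
    uint32_t addr = base + (uint32_t)disp;     /* base + displacement; base itself is unchanged */
    uint32_t value;
    assert(mem_size >= 4 && addr <= mem_size - 4);
    memcpy(&value, mem + addr, sizeof value);  /* read the 32-bit value at that address */
    return value;                              /* the caller stores this in "eax" */
}
```

With ebp holding 16, load32(mem, sizeof mem, ebp, 8) reads the four bytes at offset 24 and leaves ebp at 16.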
esp is the stack pointer.
In 32-bit mode, each push and pop on the stack is 4 bytes wide, and most variables take up 4 bytes anyway. So you could say 8(%ebp) means: starting at the address held in ebp, give me the value 2 ints (4 x 2 = 8 bytes) further up the stack.
Typically, 32-bit code uses ebp to point to the beginning of the local variables in a function. In 16-bit x86 code, there was no way to use the stack pointer as a pointer (hard to believe, right?), so people copied sp to bp and used bp as the local frame pointer. This became completely unnecessary when 32-bit mode came out (80386), which did have a way to use the stack pointer directly. Unfortunately, ebp makes debugging easier, so we ended up continuing to use ebp in 32-bit code (it's trivially easy to make a stack dump if ebp is being used).
Thankfully, amd64 gave us a new ABI which does not require rbp as a frame pointer. 64-bit code typically uses rsp to access local variables, leaving rbp available to hold a variable.
4) Explained above
5) leave is an old instruction that simply does movl %ebp,%esp followed by popl %ebp, and saves a few code bytes. What it actually does is undo the changes to the stack and restore the caller's ebp. The called function must preserve ebp in the x86 ABI.
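To see that leave really undoes the function prologue, here is a small C model. The Cpu struct and helper names are assumptions for illustration, and the prologue is assumed to be the usual pushl %ebp ; movl %esp,%ebp ; subl $N,%esp sequence.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_SIZE = 128 };

/* Hypothetical model: registers are offsets into a byte array. */
typedef struct {
    uint8_t  mem[STACK_SIZE];
    uint32_t esp, ebp;
} Cpu;

static void push32(Cpu *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;
    memcpy(c->mem + c->esp, &v, 4);
}

static uint32_t pop32(Cpu *c) {
    uint32_t v;
    assert(c->esp <= STACK_SIZE - 4);
    memcpy(&v, c->mem + c->esp, 4);
    c->esp += 4;
    return v;
}

/* pushl %ebp ; movl %esp,%ebp ; subl $locals,%esp */
static void prologue(Cpu *c, uint32_t locals) {
    push32(c, c->ebp);     /* save the caller's frame pointer */
    c->ebp = c->esp;       /* ebp now marks this function's frame */
    assert(c->esp >= locals);
    c->esp -= locals;      /* make room for local variables */
}

/* leave == movl %ebp,%esp ; popl %ebp */
static void leave(Cpu *c) {
    c->esp = c->ebp;       /* discard the locals in one step */
    c->ebp = pop32(c);     /* restore the caller's frame pointer */
}
```

Starting from esp = 100 and ebp = 120, prologue(&c, 24) leaves ebp = 96 and esp = 72, and leave(&c) puts both back to 100 and 120.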
On entry to the function, the compiler did subl $24,%esp to make room for local variables and sometimes for temporary storage that it didn't have enough registers to hold.
The best way to "imagine" the stack frame in your mind is to see it as a structure sitting on the stack. The first members of the imaginary structure are the most recently "pushed" values. So when you push to a stack, imagine inserting a new member at the beginning of the structure, while none of the other members moved. When you "pop" from the stack, you get the value of the first member of the imaginary struct, and that (first) line of the structure disappears from existence.
Stack frame manipulation is mostly just moving the stack pointer to make more or less room in that imaginary struct we call the stack frame. Subtracting from the stack pointer just puts multiple imaginary members at the start of the struct in one step. Adding to the stack pointer makes the first so many members disappear.
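One way to write that imaginary struct down in C, for a hypothetical function with 24 bytes of locals and a single int argument. The Frame type and its field names are assumptions for illustration, not something the compiler actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit frame, lowest address (most recently reserved) first. */
typedef struct {
    uint8_t  locals[24];    /* reserved by subl $24,%esp; esp points here */
    uint32_t saved_ebp;     /* pushed by the prologue; ebp points here */
    uint32_t return_addr;   /* pushed by the caller's call instruction */
    uint32_t arg0;          /* first argument, pushed by the caller */
} Frame;

/* The locals sit below the saved ebp, so they come first in the struct. */
static_assert(offsetof(Frame, saved_ebp) == 24, "locals come first");

/* Distance from where ebp points to the first argument. */
static size_t arg0_offset_from_ebp(void) {
    return offsetof(Frame, arg0) - offsetof(Frame, saved_ebp);
}
```

In this layout arg0_offset_from_ebp() is 8, which matches the 8(%ebp) operand discussed earlier.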
The end of the code you posted is not typical. That jmp is typically a ret. The compiler was clever about it and did a "tail call optimization", meaning it just cleans up what it did to the stack and jumps to f. When f(2) returns, it will actually return straight to the caller (not back to the code you posted).
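In C terms, the shape that allows this looks roughly like the sketch below. The body of f is a stand-in I made up (the real f isn't shown), and whether the compiler actually emits jmp depends on the optimization level.

```c
#include <assert.h>

/* Hypothetical stand-in: the real body of f is not shown in the question. */
static int f(int x) { return x + 1; }

/* Calling f is the very last thing g does and its result is returned
   unchanged, so an optimizing compiler may tear down g's frame and emit
   "jmp f" instead of "call f" followed by "ret". */
static int g(void) { return f(2); }

/* Either way the observable result is the same. */
static void g_self_check(void) { assert(g() == 3); }
```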
GS is a segment register; its use in Linux can be read up on here (it's basically used for per-thread data).
mov %gs:0x14,%eax
xor %gs:0x14,%eax
This code is used to validate that the stack hasn't been corrupted, using a canary value stored at GS+0x14; see this.
gcc -fstack-protector-strong is on by default in many modern distros; you can use gcc -fno-stack-protector to not add those checks. (On x86, thread-local storage is cheap so GCC keeps the randomized canary value there, making it somewhat harder to leak.)
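The xor check boils down to something like this C sketch. It is only a model: canary_intact and its parameters are made-up names, with tls_canary standing in for the word at %gs:0x14 and stack_copy for the copy the function saved in its own frame on entry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the epilogue check. */
static int canary_intact(uint32_t tls_canary, uint32_t stack_copy) {
    uint32_t eax = stack_copy;   /* reload the copy saved on the stack */
    eax ^= tls_canary;           /* xor %gs:0x14,%eax */
    return eax == 0;             /* zero means the two values still match */
}
```

If the result is nonzero, something overwrote the saved copy, and GCC-generated code calls __stack_chk_fail instead of returning.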
Best Answer
enter is avoided in practice as it performs quite poorly - see the answers at "enter" vs "push ebp; mov ebp, esp; sub esp, imm" and "leave" vs "mov esp, ebp; pop ebp". There are a bunch of x86 instructions that are obsolete but are still supported for backwards compatibility reasons - enter is one of those. (leave is OK though, and compilers are happy to emit it.)
Implementing nested functions in full generality as in Python is actually a considerably more interesting problem than simply selecting a few frame management instructions - search for 'closure conversion' and 'upwards/downwards funarg problem' and you'll find many interesting discussions.
Note that the x86 was originally designed as a Pascal machine, which is why there are instructions to support nested functions (enter, leave), the pascal calling convention in which the callee pops a known number of arguments from the stack (ret K), bounds checking (bound), and so on. Many of these operations are now obsolete.