Do any languages / compilers utilize the x86 ENTER instruction with a nonzero nesting level

assembly x86

Those familiar with x86 assembly programming are very used to the typical function prologue / epilogue:

push ebp ; Save old frame pointer.
mov  ebp, esp ; Point frame pointer to top-of-stack.
sub  esp, [size of local variables]
...
mov  esp, ebp ; Restore stack pointer, removing stack space for locals.
pop  ebp ; Restore old frame pointer.
ret

This same sequence of code can also be implemented with the ENTER and LEAVE instructions:

enter [size of local variables], 0
...
leave
ret

The ENTER instruction's second operand is the nesting level, which allows multiple parent frames to be accessed from the called function.

This is not used in C because standard C has no nested functions; local variables have only the scope of the function they're declared in. This construct does not exist in standard C (GCC accepts it as an extension, and sometimes I wish it were standard):

void func_a(void)
{
    int a1 = 7;

    void func_b(void)
    {
        printf("a1 = %d\n", a1);  /* a1 inherited from func_a() */
    }

    func_b();
}

Python however does have nested functions which behave this way:

def func_a():
    a1 = 7
    def func_b():
        print('a1 = %d' % a1)     # a1 inherited from func_a()
    func_b()

Of course Python code isn't translated directly to x86 machine code, and thus would be unable (unlikely?) to take advantage of this instruction.

Are there any languages which compile to x86 and provide nested functions? Are there compilers which will emit an ENTER instruction with a nonzero second operand?

Intel invested a nonzero amount of time/money into that nesting level operand, and basically I'm just curious if anyone uses it 🙂

Best Answer

enter is avoided in practice as it performs quite poorly - see the answers at "enter" vs "push ebp; mov ebp, esp; sub esp, imm" and "leave" vs "mov esp, ebp; pop ebp". There are a bunch of x86 instructions that are obsolete but are still supported for backwards compatibility reasons - enter is one of those. (leave is OK though, and compilers are happy to emit it.)

Implementing nested functions in full generality as in Python is actually a considerably more interesting problem than simply selecting a few frame management instructions - search for 'closure conversion' and 'upwards/downwards funarg problem' and you'll find many interesting discussions.

Note that the x86 was originally designed as a Pascal machine, which is why there are instructions to support nested functions (enter, leave), the pascal calling convention in which the callee pops a known number of arguments from the stack (ret K), bounds checking (bound), and so on. Many of these operations are now obsolete.
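To make the nesting-level operand from the question concrete, here is a rough Python model of what enter and leave do to the stack. It is only a sketch: it follows the pseudocode in Intel's instruction reference for 32-bit operands as I understand it, and the Machine class, its dict-based memory, and the starting esp of 0x1000 are all made up for illustration.

```python
class Machine:
    """Toy 32-bit stack machine (hypothetical); memory maps address -> word."""

    def __init__(self, esp=0x1000, ebp=0):
        self.mem = {}
        self.esp = esp
        self.ebp = ebp

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def enter(self, size, level):
        level %= 32                       # only the low 5 bits of the operand count
        self.push(self.ebp)               # save the caller's frame pointer
        frame_temp = self.esp
        if level > 0:
            for _ in range(1, level):     # copy level-1 pointers from the caller's display
                self.ebp -= 4
                self.push(self.mem[self.ebp])
            self.push(frame_temp)         # append a pointer to the new frame itself
        self.ebp = frame_temp
        self.esp -= size                  # reserve space for locals

    def leave(self):
        self.esp = self.ebp               # discard locals and the display copy
        self.ebp = self.pop()             # restore the caller's frame pointer


m = Machine()
m.enter(8, 1)            # outer function at lexical level 1
outer_frame = m.ebp
m.enter(4, 2)            # nested function at lexical level 2
print(m.mem[m.ebp - 4] == outer_frame)   # the enclosing frame is reachable at [ebp-4]
```

With a nonzero level, the nested function finds the frame pointers of its lexically enclosing functions at fixed offsets below its own ebp, which is how a Pascal-style compiler could reach a parent's locals.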

Related Solutions

What does subl do here

To answer those numbered questions:

1) subl $24,%esp

means esp = esp - 24

GNU AS uses AT&T syntax, which reverses the operand order of Intel syntax. AT&T has the destination on the right, Intel has the destination on the left. Also, AT&T is explicit about the size of the operands (the l suffix in subl means a 32-bit operation). Intel tries to deduce it or forces you to be explicit.

The stack grows down in memory, the memory at and after esp is the stack contents, addresses lower than esp are unused stack space. esp points to the last thing pushed onto the stack.

2) x86 instruction encoding mostly allows the following:

movl rm,r   ' move value from register or memory to a register
movl r,rm   ' move a value from a register to a register or memory
movl imm,rm ' Move immediate value.

There is no memory-to-memory instruction format. (Strictly speaking, you can do memory-to-memory operations with movs or with push mem followed by pop mem, but neither takes two explicit memory operands in the same instruction.)

"Immediate" means the value is encoded right into the instruction. For example, to store 15 at the address in ebx:

movl $15,(%ebx)

15 is an "immediate" value.

The parentheses make it use the register as a pointer to memory.

3) movl 8(%ebp),%eax

means,

take the value of ebp
add 8 to it (does not modify ebp though),
use it as an address (the parentheses),
read the 32-bit value from that address,
and store the value in eax
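The steps above can be sketched in Python. This is only an illustration under made-up assumptions: the load32 helper, the 32-byte memory, and the sample addresses are hypothetical, and struct is used to do the little-endian 32-bit read.

```python
import struct


def load32(memory, base, displacement):
    """Return the 32-bit little-endian value stored at base + displacement."""
    address = (base + displacement) & 0xFFFFFFFF   # address arithmetic wraps at 32 bits
    return struct.unpack_from("<I", memory, address)[0]


memory = bytearray(32)                           # hypothetical flat memory
struct.pack_into("<I", memory, 16, 0xDEADBEEF)   # plant a value at address 16

ebp = 8
eax = load32(memory, ebp, 8)   # models: movl 8(%ebp),%eax
print(hex(eax), ebp)           # ebp itself is unchanged by the load
```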

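esp is the stack pointer, and ebp is the frame pointer here. In 32-bit mode, each push and pop on the stack is 4 bytes wide, and most variables take up 4 bytes anyway. With the usual prologue, 0(%ebp) holds the caller's saved ebp and 4(%ebp) holds the return address, so 8(%ebp) means: starting at the frame pointer, give me the value 2 ints (4 x 2 = 8) into the stack, which is typically the function's first argument.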

Typically, 32-bit code uses ebp to point to the beginning of the local variables in a function. In 16-bit x86 code, there was no way to use the stack pointer as a base register in a memory operand (hard to believe, right?). So what people did was copy sp to bp and use bp as the local frame pointer. This became unnecessary when 32-bit mode came out (80386), because it did have a way to address memory through the stack pointer directly. Unfortunately, ebp makes debugging easier, so we ended up continuing to use ebp in 32-bit code (it's trivially easy to make a stack dump if ebp is being used).

Thankfully, amd64 gave us a new ABI which does not require rbp as a frame pointer; 64-bit code typically uses rsp to access local variables, and rbp is available to hold a variable.

4) Explained above

5) leave is an old instruction that simply does movl %ebp,%esp and popl %ebp and saves a few code bytes. What it actually does is undo the changes to the stack and restore the caller's ebp. The called function must preserve ebp in the x86 ABI.

On entry to the function, the compiler did subl $24,%esp to make room for local variables and sometimes temp storage that it didn't have enough registers to hold.

The best way to "imagine" the stack frame in your mind is to see it as a structure sitting on the stack. The first members of the imaginary structure are the most recently "pushed" values. So when you push to a stack, imagine inserting a new member at the beginning of the structure, while none of the other members moved. When you "pop" from the stack, you get the value of the first member of the imaginary struct, and that (first) line of the structure disappears from existence.

Stack frame manipulation is mostly just moving the stack pointer to make more or less room in that imaginary struct we call the stack frame. Subtracting from the stack pointer just puts multiple imaginary members at the start of the struct in one step. Adding to the stack pointer makes the first so many members disappear.
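One way to play with that mental model is a plain Python list whose index 0 is the top of the stack. This is only a toy: each element stands for one 4-byte slot, and the helper names (push, pop, sub_esp, add_esp) and the slot labels are made up for the illustration.

```python
def push(stack, value):
    stack.insert(0, value)            # a new first member; nothing else moves


def pop(stack):
    return stack.pop(0)               # the first member disappears


def sub_esp(stack, slots):
    stack[:0] = [None] * slots        # several uninitialized members at once


def add_esp(stack, slots):
    del stack[:slots]                 # the first so many members disappear


stack = ["return address", "arg"]     # index 0 is the top of the stack
push(stack, "saved ebp")              # pushl %ebp
sub_esp(stack, 24 // 4)               # subl $24,%esp reserves six 4-byte slots
print(len(stack), stack[6])           # "saved ebp" now sits below the six new slots
add_esp(stack, 24 // 4)               # undo the reservation
print(pop(stack))                     # popl %ebp gets "saved ebp" back
```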

The end of the code you posted is not typical. That jmp is typically a ret. The compiler was clever about it and did a "tail call optimization", meaning it just cleans up what it did to the stack and jumps to f. When f(2) returns, it will actually return straight to the caller (not back to the code you posted).

Linux – %gs in Assembly

GS is a segment register; on Linux it's basically used for per-thread data.

mov    %gs:0x14,%eax
xor    %gs:0x14,%eax

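This code is used to validate that the stack hasn't exploded or been corrupted, using a canary value stored at GS+0x14. The mov loads the canary on function entry so a copy can be saved on the stack, and the xor compares the saved copy against the original before the function returns.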

gcc -fstack-protector-strong is on by default in many modern distros; you can use gcc -fno-stack-protector to not add those checks. (On x86, thread-local storage is cheap so GCC keeps the randomized canary value there, making it somewhat harder to leak.)
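The idea behind the check can be modeled in a few lines of Python. This is a loose sketch, not what GCC emits: the frame is a dict rather than real stack memory, and run_protected, well_behaved, overflowing, and StackCheckFailed are hypothetical names (the exception stands in for the abort that __stack_chk_fail triggers).

```python
import secrets


class StackCheckFailed(Exception):
    """Stands in for the abort triggered by __stack_chk_fail."""


TLS_CANARY = secrets.randbits(32)     # models the per-thread value at %gs:0x14


def run_protected(body, tls_canary=TLS_CANARY):
    # Prologue: copy the canary into the frame, next to the local buffer.
    frame = {"buffer": bytearray(8), "canary": tls_canary}
    body(frame)
    # Epilogue: xor the saved copy with the original; nonzero means corruption.
    if frame["canary"] ^ tls_canary != 0:
        raise StackCheckFailed("stack smashing detected")
    return "returned normally"


def well_behaved(frame):
    frame["buffer"][:4] = b"safe"     # stays inside the buffer


def overflowing(frame):
    frame["canary"] ^= 0x41414141     # models a write that runs past the buffer


print(run_protected(well_behaved))
```

Calling run_protected(overflowing) raises StackCheckFailed, because the saved copy no longer matches the thread-local value.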
