Why does gcc push %rbx at the beginning of main

assemblygcc

The latest version of gcc is producing assembly that doesn't make sense to me. I compiled the code using no optimization; but, some parts of this code don't make sense, even with no optimization.

Here is the C source:

  #include <stdio.h>

   int main()
   {
     int a = 1324;
     int b = 5657;
     int difference = 9876;
     int printf_answer = 2221;

     difference = a - b;

     printf_answer = printf("%d + %d = %d\n", a, b, difference);

     return difference;
   }

It produces this assembly:

    .file   "exampleIML-1b.c"
    .section    .rodata
.LC0:
    .string "%d + %d = %d\n"
    .text
.globl main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    pushq   %rbx
    subq    $24, %rsp
    movl    $1324, -32(%rbp)
    movl    $5657, -28(%rbp)
    movl    $9876, -24(%rbp)
    movl    $2221, -20(%rbp)
    movl    -28(%rbp), %eax
    movl    -32(%rbp), %edx
    movl    %edx, %ecx
    subl    %eax, %ecx
    movl    %ecx, %eax
    movl    %eax, -24(%rbp)
    movl    $.LC0, %eax
    movl    -24(%rbp), %ecx
    movl    -28(%rbp), %edx
    movl    -32(%rbp), %ebx
    .cfi_offset 3, -24
    movl    %ebx, %esi
    movq    %rax, %rdi
    movl    $0, %eax
    call    printf
    movl    %eax, -20(%rbp)
    movl    -24(%rbp), %eax
    addq    $24, %rsp
    popq    %rbx
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (GNU) 4.4.6 20120305 (Red Hat 4.4.6-4)"
    .section    .note.GNU-stack,"",@progbits

Several things don't make sense:

(1) Why are we pushing %rbx? What is in %rbx that needs to be saved?

(2) Why are we moving %edx to %ecx before subtracting? What doesn't it just do sub %eax, %edx?

(3) Similarly, why the move from %ecx back to %eax before storing the value?

(4) The compiler is putting the variable a in memory location -32(%rbp). Unless I'm adding wrong, isn't -32(%rbp) equal to the stack pointer? Shouldn't all local variables be stored at values less than the current stack pointer?

I'm using this version of gcc:

[eos17:~/Courses/CS451/IntelMachineLanguage]$ gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure –prefix=/usr –mandir=/usr/share/man –infodir=/usr/share/info –with-bugurl=http://bugzilla.redhat.com/bugzilla –enable-bootstrap –enable-shared –enable-threads=posix –enable-checking=release –with-system-zlib –enable-__cxa_atexit –disable-libunwind-exceptions –enable-gnu-unique-object –enable-languages=c,c++,objc,obj-c++,java,fortran,ada –enable-java-awt=gtk –disable-dssi –with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre –enable-libgcj-multifile –enable-java-maintainer-mode –with-ecj-jar=/usr/share/java/eclipse-ecj.jar –disable-libjava-multilib –with-ppl –with-cloog –with-tune=generic –with-arch_32=i686 –build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC)

Best Answer

GCC dictates how the stack is used. Contract between caller and callee on x86:

    * after call instruction:
          o %eip points at first instruction of function
          o %esp+4 points at first argument
          o %esp points at return address 
    * after ret instruction:
          o %eip contains return address
          o %esp points at arguments pushed by caller
          o called function may have trashed arguments
          o %eax contains return value (or trash if function is void)
          o %ecx, %edx may be trashed
          o %ebp, %ebx, %esi, %edi must contain contents from time of call 
    * Terminology:
          o %eax, %ecx, %edx are "caller save" registers
          o %ebp, %ebx, %esi, %edi are "callee save" registers

The main function is like any other function in this context. gcc decided to use ebx for intermediate calculations, so it preserves its value.

Linking to static libraries

$ g++ -c b.cpp -o b.o
$ ar cr libb.a b.o
$ g++ -c d.cpp -o d.o
$ ar cr libd.a d.o

$ g++ -L. -ld -lb a.cpp # wrong order
$ g++ -L. -lb -ld a.cpp # wrong order
$ g++ a.cpp -L. -ld -lb # wrong order
$ g++ a.cpp -L. -lb -ld # right order

The linker searches from left to right, and notes unresolved symbols as it goes. If a library resolves the symbol, it takes the object files of that library to resolve the symbol (b.o out of libb.a in this case).

Dependencies of static libraries against each other work the same - the library that needs symbols must be first, then the library that resolves the symbol.

If a static library depends on another library, but the other library again depends on the former library, there is a cycle. You can resolve this by enclosing the cyclically dependent libraries by -( and -), such as -( -la -lb -) (you may need to escape the parens, such as -$ and -$). The linker then searches those enclosed lib multiple times to ensure cycling dependencies are resolved. Alternatively, you can specify the libraries multiple times, so each is before one another: -la -lb -la.

Linking to dynamic libraries

$ export LD_LIBRARY_PATH=. # not needed if libs go to /usr/lib etc
$ g++ -fpic -shared d.cpp -o libd.so
$ g++ -fpic -shared b.cpp -L. -ld -o libb.so # specifies its dependency!

$ g++ -L. -lb a.cpp # wrong order (works on some distributions)
$ g++ -Wl,--as-needed -L. -lb a.cpp # wrong order
$ g++ -Wl,--as-needed a.cpp -L. -lb # right order

It's the same here - the libraries must follow the object files of the program. The difference here compared with static libraries is that you need not care about the dependencies of the libraries against each other, because dynamic libraries sort out their dependencies themselves.

Some recent distributions apparently default to using the --as-needed linker flag, which enforces that the program's object files come before the dynamic libraries. If that flag is passed, the linker will not link to libraries that are not actually needed by the executable (and it detects this from left to right). My recent archlinux distribution doesn't use this flag by default, so it didn't give an error for not following the correct order.

It is not correct to omit the dependency of b.so against d.so when creating the former. You will be required to specify the library when linking a then, but a doesn't really need the integer b itself, so it should not be made to care about b's own dependencies.

Here is an example of the implications if you miss specifying the dependencies for libb.so

$ export LD_LIBRARY_PATH=. # not needed if libs go to /usr/lib etc
$ g++ -fpic -shared d.cpp -o libd.so
$ g++ -fpic -shared b.cpp -o libb.so # wrong (but links)

$ g++ -L. -lb a.cpp # wrong, as above
$ g++ -Wl,--as-needed -L. -lb a.cpp # wrong, as above
$ g++ a.cpp -L. -lb # wrong, missing libd.so
$ g++ a.cpp -L. -ld -lb # wrong order (works on some distributions)
$ g++ -Wl,--as-needed a.cpp -L. -ld -lb # wrong order (like static libs)
$ g++ -Wl,--as-needed a.cpp -L. -lb -ld # "right"

If you now look into what dependencies the binary has, you note the binary itself depends also on libd, not just libb as it should. The binary will need to be relinked if libb later depends on another library, if you do it this way. And if someone else loads libb using dlopen at runtime (think of loading plugins dynamically), the call will fail as well. So the "right" really should be a wrong as well.

C++ – the difference between g++ and gcc

gcc and g++ are compiler-drivers of the GNU Compiler Collection (which was once upon a time just the GNU C Compiler).

Even though they automatically determine which backends (cc1 cc1plus ...) to call depending on the file-type, unless overridden with -x language, they have some differences.

The probably most important difference in their defaults is which libraries they link against automatically.

According to GCC's online documentation link options and how g++ is invoked, g++ is equivalent to gcc -xc++ -lstdc++ -shared-libgcc (the 1st is a compiler option, the 2nd two are linker options). This can be checked by running both with the -v option (it displays the backend toolchain commands being run).

Best Answer

Related Solutions

Why does the order in which libraries are linked sometimes cause errors in GCC

Linking to static libraries

Linking to dynamic libraries

C++ – the difference between g++ and gcc

Related Topic