Why does GCC use multiplication by a strange number in implementing integer division?

assembly c gcc integer-division x86-64

I've been reading about div and mul assembly operations, and I decided to see them in action by writing a simple program in C:

File division.c

#include <stdlib.h>
#include <stdio.h>

int main()
{
    size_t i = 9;
    size_t j = i / 5;
    printf("%zu\n",j);
    return 0;
}

And then generating assembly language code with:

gcc -S division.c -O0 -masm=intel

But the generated division.s file doesn't contain any div instructions! Instead, it does some kind of black magic with bit shifting and magic numbers. Here's the code snippet that computes i/5:

mov     rax, QWORD PTR [rbp-16]   ; Move i (=9) to RAX
movabs  rdx, -3689348814741910323 ; Move some magic number to RDX (?)
mul     rdx                       ; Multiply 9 by magic number
mov     rax, rdx                  ; Take only the upper 64 bits of the result
shr     rax, 2                    ; Shift these bits 2 places to the right (?)
mov     QWORD PTR [rbp-8], rax    ; Magically, RAX contains 9/5=1 now, 
                                  ; so we can assign it to j

What's going on here? Why doesn't GCC use div at all? How does it generate this magic number and why does everything work?

Best Answer

Integer division is one of the slowest arithmetic operations you can perform on a modern processor, with latency of up to dozens of cycles and poor throughput. (For x86, see Agner Fog's instruction tables and microarchitecture guide.)

If you know the divisor ahead of time, you can avoid the division by replacing it with a set of other operations (multiplications, additions, and shifts) which have the equivalent effect. Even if several operations are needed, it's often still a heck of a lot faster than the integer division itself.
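As a rough illustration, the instruction sequence from the question can be mirrored in C. This is only a sketch: the function names are made up for this example, and `unsigned __int128` is a GCC/Clang extension (an assumption about the compiler) used to reach the upper 64 bits of the product, which is what `mul` leaves in RDX.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the mul/shr sequence from the question (hypothetical name).
   Assumes a compiler that provides unsigned __int128 (GCC or Clang). */
uint64_t div5_by_multiply(uint64_t n)
{
    /* -3689348814741910323 reinterpreted as an unsigned 64-bit value */
    const uint64_t magic = UINT64_C(0xCCCCCCCCCCCCCCCD);

    /* mul rdx: full 128-bit product of n and the magic number */
    unsigned __int128 product = (unsigned __int128)n * magic;

    /* mov rax, rdx: keep only the upper 64 bits */
    uint64_t high = (uint64_t)(product >> 64);

    /* shr rax, 2: shift right by two more places */
    return high >> 2;
}

/* Hypothetical helper: compare against the ordinary / operator. */
void check_div5(uint64_t n)
{
    assert(div5_by_multiply(n) == n / 5);
}
```

Calling `div5_by_multiply(9)` gives 1, matching the 9/5 from the question; `check_div5` can be used to spot-check other inputs.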

Implementing the C / operator this way, instead of with a multi-instruction sequence involving div, is just GCC's default way of doing division by constants. It doesn't require optimizing across operations and doesn't change anything even for debugging, which is why it shows up at -O0. (Using -Os for small code size does get GCC to use div, though.) Using a multiplicative inverse instead of division is like using lea instead of mul and add.

As a result, you only tend to see div or idiv in the output if the divisor isn't known at compile-time.

For information on how the compiler generates these sequences, as well as code to let you generate them for yourself (almost certainly unnecessary unless you're working with a braindead compiler), see libdivide.
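To make the magic number from the question less mysterious, here is a minimal sketch of one way to arrive at it for the divisor 5. It is not a claim about GCC's actual algorithm; the function names are hypothetical and `unsigned __int128` (a GCC/Clang extension) is assumed. The constant equals ceil(2^66 / 5), and since 5 times that value is exactly 2^66 + 1, multiplying n by it and shifting right by 66 bits yields n/5 plus an error of n / (5 * 2^66). For any 64-bit n that error is below 1/20, too small to push the result past the next integer, so the truncated result equals n / 5.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ceil(2^66 / 5). Since 5 does not divide 2^66,
   the ceiling is floor + 1. Assumes unsigned __int128 (GCC/Clang). */
uint64_t magic_for_div5(void)
{
    unsigned __int128 two_pow_66 = (unsigned __int128)1 << 66;
    return (uint64_t)(two_pow_66 / 5 + 1);
}

/* Multiply by the magic number and shift right by 66 bits in total
   (64 bits for "take the high half" plus the extra shift by 2). */
uint64_t div5_with_magic(uint64_t magic, uint64_t n)
{
    return (uint64_t)(((unsigned __int128)n * magic) >> 66);
}

/* Hypothetical self-check over a few sample inputs. */
void check_magic_for_div5(void)
{
    static const uint64_t samples[] = {
        0, 1, 4, 5, 9, 10, 1000003, UINT64_MAX - 1, UINT64_MAX
    };
    uint64_t magic = magic_for_div5();

    /* Same bit pattern as -3689348814741910323 in the question */
    assert(magic == UINT64_C(0xCCCCCCCCCCCCCCCD));

    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(div5_with_magic(magic, samples[i]) == samples[i] / 5);
}
```

Other divisors generally need a different constant and shift count, and some need an extra correction step, which is the kind of case analysis libdivide handles.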

Related Solutions

Why does the order in which libraries are linked sometimes cause errors in GCC

Linking to static libraries

$ g++ -c b.cpp -o b.o
$ ar cr libb.a b.o
$ g++ -c d.cpp -o d.o
$ ar cr libd.a d.o

$ g++ -L. -ld -lb a.cpp # wrong order
$ g++ -L. -lb -ld a.cpp # wrong order
$ g++ a.cpp -L. -ld -lb # wrong order
$ g++ a.cpp -L. -lb -ld # right order

The linker searches from left to right, and notes unresolved symbols as it goes. If a library resolves the symbol, it takes the object files of that library to resolve the symbol (b.o out of libb.a in this case).

Dependencies of static libraries against each other work the same - the library that needs symbols must be first, then the library that resolves the symbol.

If a static library depends on another library, and that other library in turn depends on the first one, there is a cycle. You can resolve this by enclosing the cyclically dependent libraries in -( and -), such as -( -la -lb -) (you may need to escape the parentheses, as in -\( and -\)); the long forms of these linker options are --start-group and --end-group. The linker then searches the enclosed libraries repeatedly until the cyclic dependencies are resolved. Alternatively, you can specify the libraries multiple times, so that each appears before the other: -la -lb -la.

Linking to dynamic libraries

$ export LD_LIBRARY_PATH=. # not needed if libs go to /usr/lib etc
$ g++ -fpic -shared d.cpp -o libd.so
$ g++ -fpic -shared b.cpp -L. -ld -o libb.so # specifies its dependency!

$ g++ -L. -lb a.cpp # wrong order (works on some distributions)
$ g++ -Wl,--as-needed -L. -lb a.cpp # wrong order
$ g++ -Wl,--as-needed a.cpp -L. -lb # right order

It's the same here - the libraries must follow the object files of the program. The difference here compared with static libraries is that you need not care about the dependencies of the libraries against each other, because dynamic libraries sort out their dependencies themselves.

Some recent distributions apparently default to using the --as-needed linker flag, which enforces that the program's object files come before the dynamic libraries. If that flag is passed, the linker will not link to libraries that are not actually needed by the executable (and it detects this from left to right). My recent Arch Linux installation doesn't use this flag by default, so it didn't give an error for not following the correct order.

It is not correct to omit the dependency of libb.so on libd.so when creating the former. If you do, you will be required to specify libd when linking a, but a doesn't really need the integer b itself (the symbol that libd provides), so it should not be made to care about libb's own dependencies.

Here is an example of what happens if you fail to specify the dependencies for libb.so:

$ export LD_LIBRARY_PATH=. # not needed if libs go to /usr/lib etc
$ g++ -fpic -shared d.cpp -o libd.so
$ g++ -fpic -shared b.cpp -o libb.so # wrong (but links)

$ g++ -L. -lb a.cpp # wrong, as above
$ g++ -Wl,--as-needed -L. -lb a.cpp # wrong, as above
$ g++ a.cpp -L. -lb # wrong, missing libd.so
$ g++ a.cpp -L. -ld -lb # wrong order (works on some distributions)
$ g++ -Wl,--as-needed a.cpp -L. -ld -lb # wrong order (like static libs)
$ g++ -Wl,--as-needed a.cpp -L. -lb -ld # "right"

If you now look at which dependencies the binary has, you will see that the binary itself also depends on libd, not just on libb as it should. Done this way, the binary will need to be relinked if libb later depends on another library. And if someone else loads libb using dlopen at runtime (think of loading plugins dynamically), the call will fail as well. So the "right" really should be a wrong as well.

Ruby – Why is division in Ruby returning an integer instead of decimal value

It’s doing integer division. You can use to_f to force things into floating-point mode:

9.to_f / 5  #=> 1.8
9 / 5.to_f  #=> 1.8

This also works if your values are variables instead of literals. Converting one value to a float is sufficient to coerce the whole expression to floating point arithmetic.
