Nearly all CPUs have a single instruction that computes the remainder of an integer division — that is, the modulus. For instance, consider this program:
int main()
{
int i = 10;
return i % 3;
}
If I compile this on my Intel OS X machine using g++ -S, the result will be some boilerplate and this:
movl $3, %eax
movl $0, -4(%rbp)
movl $10, -8(%rbp)
movl -8(%rbp), %ecx
movl %eax, -12(%rbp) ## 4-byte Spill
movl %ecx, %eax
cltd
movl -12(%rbp), %ecx ## 4-byte Reload
idivl %ecx
movl %edx, %eax
The actual modulus happens with this instruction: idivl %ecx. When this instruction executes, the quotient is placed in %eax and the remainder is placed in %edx.
Since the % operation really amounts to this one instruction, it is only going to take a few clock cycles, the bulk of which is actually spent moving the data into the right registers. Also note that, on Intel at least, the same operation finds both the quotient and the remainder, so in reference to your comment, / and % take exactly the same time. It is the same operation; the only thing that changes is which register the compiler reads the result from.
With any CPU made in the last couple of decades, you can assume that any basic mathematical operation (including things that look like library functions, such as sqrt or cos) is actually a single machine instruction and generally takes only a few clock cycles at most.
[UPDATE]
As people have noted in the comments, to see something approaching a correct timing, you need to remove the output from the timed section like this:
int i;
auto start = Clock::now();
i = 20 % 3;
auto end = Clock::now();
cout << i << endl;
But even that is likely not accurate as the actual granularity of the timings may exceed what you are trying to time. Instead, you might want to do this:
int i;
int x = 0;
auto start = Clock::now();
for (i = 0; i < 1000000; i++)
    x = i % 3;
auto end = Clock::now();
cout << x << endl;
Then divide your timing by 1,000,000. (This will be slightly high as it includes the time taken for an assignment, a comparison and an increment.) On my machine, this gives a time of 5 nanoseconds.
Best Answer
A slightly more correct statement would be that compilers wouldn't set aside data memory for const objects of integer type: they trade data memory for program memory. There is no difference between the two under a von Neumann architecture, but on other architectures, such as Harvard, the distinction is rather important.
To fully understand what's going on, you need to recall how assembly language loads data for processing. There are two fundamental ways to load data: read it from a specific memory location (the so-called direct addressing mode), or set a constant specified as part of the instruction itself (the so-called immediate addressing mode). When the compiler sees a const int x = 5 declaration followed by int a = x + x, it has two options: either set aside a memory location for x and read it back each time x is referenced, or generate an immediate load instruction of the value 5 each time x is referenced.
In the first case you will see a read from x into the accumulator register, an addition of the value at the location of x to the accumulator, and a store into the location of a. In the second case you will see a load immediate of five, an add immediate of five, followed by a store into the location of a. Some compilers may figure out that you are adding a constant to itself, optimize a = x + x into a = 10, and generate a single instruction that stores ten at the location of a.