I have no idea what these are actually called, but I see them all the time. The Python implementation is something like x += 5 as a shorthand notation for x = x + 5.
But why is this considered good practice? I've run across it in nearly every book or programming tutorial I've read for Python, C, R, and so on. I get that it's convenient, saving three keystrokes including spaces. But they always seem to trip me up when I'm reading code and, at least to my mind, make it less readable, not more.
Am I missing some clear and obvious reason these are used all over the place?
Best Answer
It's not shorthand.
The += symbol appeared in the C language in the 1970s and, in keeping with the C idea of a "smart assembler", corresponds to a clearly different machine instruction and addressing mode.
Things like "i=i+1", "i+=1" and "++i", although they produce the same effect at an abstract level, correspond at a low level to different ways of working of the processor.
In particular, consider those three expressions, assuming that the i variable resides at the memory address stored in a CPU register (let's name it D; think of it as a "pointer to int") and that the ALU of the processor takes a parameter and returns its result in an "accumulator" (let's call it A; think of it as an int).
With these constraints (very common in all microprocessors from that period), each of the three will most likely be translated into a different instruction sequence.
The first way of doing it is suboptimal, but it is more general when operating with variables instead of constants (ADD A, B or ADD A, (D+x)) or when translating more complex expressions (they all boil down to pushing the low-priority operation onto a stack, calling the high-priority one, popping, and repeating until all the arguments have been eliminated).
The second is more typical of a "state machine": we are no longer "evaluating an expression" but "operating on a value". We still use the ALU, but we avoid moving values around, since the result is allowed to replace the parameter. This kind of instruction cannot be used where a more complicated expression is required: i = 3*i + i-2 cannot be operated in place, since i is required more than once.
The third, even simpler, does not even consider the idea of "addition", but uses more "primitive" (in the computational sense) circuitry for a counter. The instruction is shorter, loads faster and executes immediately, since the combinatorial network required to retrofit a register to make it a counter is smaller, and hence faster, than that of a full adder.
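The three translations described above can be imitated with a toy accumulator machine. This is only a sketch: the opcode names (LOAD_A, ADD_A, STORE_A, ADD_MEM, INC_MEM) and the run function are hypothetical, invented here for illustration, and do not correspond to any real instruction set or to the answer's original listing.

```python
# Toy accumulator machine. The opcode names are hypothetical, chosen only
# to mirror the three translations discussed above.

def run(program, memory, d):
    """Run (opcode, operand) pairs; d is the address held in register D."""
    a = 0  # accumulator A
    for opcode, operand in program:
        if opcode == "LOAD_A":       # A <- memory[D]
            a = memory[d]
        elif opcode == "ADD_A":      # A <- A + operand
            a += operand
        elif opcode == "STORE_A":    # memory[D] <- A
            memory[d] = a
        elif opcode == "ADD_MEM":    # memory[D] <- memory[D] + operand
            memory[d] += operand
        elif opcode == "INC_MEM":    # memory[D] <- memory[D] + 1
            memory[d] += 1
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

# i = i + 1: load into the accumulator, add, store back
ASSIGN_SUM = [("LOAD_A", None), ("ADD_A", 1), ("STORE_A", None)]
# i += 1: add a constant directly to the memory cell
ADD_IN_PLACE = [("ADD_MEM", 1)]
# ++i: tick the memory cell as a counter
INCREMENT = [("INC_MEM", None)]

programs = (ASSIGN_SUM, ADD_IN_PLACE, INCREMENT)
results = [run(p, [41], 0)[0] for p in programs]
lengths = [len(p) for p in programs]
print(results, lengths)  # [42, 42, 42] [3, 1, 1]
```

All three programs leave the same value in memory, but the first needs three instructions and a round trip through the accumulator, while the other two touch the memory cell directly.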
With contemporary compilers (referring to C, for now) and compiler optimization enabled, the correspondence can be swapped based on convenience, but there is still a conceptual difference in the semantics.
x += 5 means: find the place identified by x, and add 5 to it.
But x = x + 5 means: evaluate x + 5 (find the place identified by x, copy its value into the accumulator, add 5 to the accumulator), then store the result in x (find the place identified by x again, and copy the accumulator into it).
Of course, optimization can elide the copies by applying the addition to &x instead of to the accumulator, thus making the optimized code coincide with the x += 5 one.
But this can be done only if "finding x" has no side effects; otherwise *x() = *x() + 5 and *x() += 5 are semantically different, since the side effects of x() (assuming x() is a function doing weird things and returning an int*) will be produced twice or once, respectively.
The equivalence between x = x + y and x += y is hence due to the particular case where += and = are applied to a direct l-value.
To move to Python: it inherited the syntax from C, but since there is no translation / optimization before the execution in interpreted languages, things are not necessarily so intimately related (since there is one less parsing step). However, an interpreter can refer to different execution routines for the three types of expression, taking advantage of different machine code depending on how the expression is formed and on the evaluation context.
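The "produced twice or once" point can also be observed in Python when the assignment target is a subscript rather than a plain name. The following is a minimal sketch; find_x and lookups are hypothetical names standing in for "finding x" with a visible side effect.

```python
lookups = []  # records every call to find_x()

def find_x():
    """Hypothetical stand-in for an l-value whose lookup has a side effect."""
    lookups.append("find_x")
    return 0  # always refers to slot 0

data = [10]

data[find_x()] += 5                  # the target is located once
calls_augmented = len(lookups)

lookups.clear()
data[find_x()] = data[find_x()] + 5  # located once to read, once to write
calls_plain = len(lookups)

print(calls_augmented, calls_plain, data)  # 1 2 [20]
```

With a plain name such as x there is no lookup side effect to observe, which is why the two forms look interchangeable in the simple case.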
For those who like more detail...
Every CPU has an ALU (arithmetic-logical unit) that is, in its very essence, a combinatorial network whose inputs and output are "plugged" to the registers and / or memory depending on the opcode of the instruction.
Binary operations are typically implemented as a "modifier of an accumulator register with an input taken somewhere", where "somewhere" can be:
- inside the instruction flow itself (typical for a manifest constant: ADD A 5)
- inside another register (typical for expression computation with temporaries, e.g. ADD A B)
- inside the memory, at an address given by a register (typical of data fetching, e.g. ADD A (H)); H, in this case, works like a dereferencing pointer.
With this pseudocode, x += 5 is a single instruction of the form ADD (X) 5, while x = x+5 is a move of x into the accumulator, an ADD A 5, and a move of the accumulator back to x.
That is, x+5 gives a temporary that is later assigned, while x += 5 operates directly on x.
The actual implementation depends on the real instruction set of the processor. If there is no ADD (.) c opcode, the first code becomes the second: there is no other way. If there is such an opcode, and optimization is enabled, the second expression, after eliminating the reverse moves and adjusting the register opcodes, becomes the first.
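The "temporary that is later assigned" versus "operates directly on x" distinction has a visible counterpart in Python for mutable objects. A small sketch, assuming standard list behaviour (the variable names are illustrative only):

```python
x = [1, 2]
alias = x
x += [3]                 # list.__iadd__ extends the existing list in place
same_after_augmented = x is alias

y = [1, 2]
alias_y = y
y = y + [3]              # list.__add__ builds a new list, then y is rebound
same_after_plain = y is alias_y

print(same_after_augmented, alias)    # True [1, 2, 3]
print(same_after_plain, alias_y)      # False [1, 2]
```

For immutable values such as int, both forms rebind the name to a new object, so the difference cannot be observed this way.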