Electronic – How does the CPU handle XCHG internally


While designing my own 16-bit CPU, I am wondering how a register-register XCHG instruction is executed internally. From computer science I know the DLX, which doesn't provide XCHG and therefore only has to write the destination register in write-back, never two registers at once (would that even be possible?).
I guess this isn't done in one cycle, right?

Thanks in advance

Best Answer

You have a couple of options.

You can add extra hardware to the data path so that the swap completes in one cycle. This is awkward in a pipelined architecture, because a dual-port register file is often used so that different stages can read and write simultaneously, and XCHG adds the need for a second write port on top of that. Without it, there is really no way to prevent a bubble in the pipeline from eventually occurring.
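To make the port-count argument concrete, here is a rough behavioral sketch in Python, not a hardware description. The `RegisterFile` class, its `write_ports` parameter, and `xchg_single_cycle` are names made up for illustration; the model only assumes that each write in a cycle consumes one write port.

```python
class RegisterFile:
    """Toy register file: two read ports, configurable number of write ports."""

    def __init__(self, num_regs=8, write_ports=1):
        self.regs = [0] * num_regs
        self.write_ports = write_ports  # hypothetical parameter for this sketch

    def read(self, a, b):
        # Both read ports sample the old values in the same cycle.
        return self.regs[a], self.regs[b]

    def write(self, *writes):
        # Each (index, value) pair occupies one write port in this cycle.
        if len(writes) > self.write_ports:
            raise RuntimeError("not enough write ports for one cycle")
        for index, value in writes:
            self.regs[index] = value & 0xFFFF  # 16-bit registers


def xchg_single_cycle(rf, a, b):
    """Swap two registers in a single write-back; needs two write ports."""
    va, vb = rf.read(a, b)
    rf.write((a, vb), (b, va))
```

With `write_ports=2` the swap goes through in one write-back; with the usual single write port the same call raises, which is the point where the pipeline would have to stall instead.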

A generally better option is to simply make it a multicycle instruction. The important thing here is to make sure you prevent anything else, such as interrupts or other bus masters (in the case of a memory swap), from making the operation appear non-atomic.
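The multicycle approach can be sketched the same way. This is only a behavioral model under assumptions of my own: a hidden `temp` latch holds one operand, the single write port is used once per cycle, and interrupts are only sampled at instruction boundaries. `MulticycleCpu` and all of its fields are hypothetical names, not taken from any real core.

```python
class MulticycleCpu:
    """Toy core with one write port; XCHG takes two write-back cycles."""

    def __init__(self, num_regs=8):
        self.regs = [0] * num_regs
        self.temp = 0            # hidden latch, not visible to software
        self.pending = None      # remaining write of an XCHG in flight
        self.irq_pending = False
        self.irq_taken_at = []   # cycle numbers at which an interrupt was accepted
        self.cycle = 0

    def start_xchg(self, a, b):
        # First cycle: latch the old value of a, then write b's value into a.
        self.temp = self.regs[a]
        self.regs[a] = self.regs[b]
        self.pending = (b, self.temp)
        self.cycle += 1

    def step(self):
        if self.pending is not None:
            # Second cycle: finish the in-flight XCHG before anything else.
            index, value = self.pending
            self.regs[index] = value
            self.pending = None
        elif self.irq_pending:
            # Interrupts are only accepted once no instruction is half done.
            self.irq_taken_at.append(self.cycle)
            self.irq_pending = False
        self.cycle += 1
```

An interrupt raised between the two cycles stays pending until the second write has happened, so software never observes the half-swapped state.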

The multicycle option is what is generally done. For example, the ARM SWP and the x86 XCHG instructions are multicycle.