Operation execution in terms of clock cycles

assembly, computer-architecture, concepts, cpu, machine-code

Typically, a single instruction needs 6 machine cycles:

  1. FETCH instruction
  2. DECODE instruction
  3. EVALUATE ADDRESS
  4. fetch OPERANDS
  5. EXECUTE operation
  6. STORE result

My concern is with the fifth step, executing the operation. This is done in the ALU, which is essentially a group of digital circuits that perform ADD, MUL, XADD, etc.

My question: is the time taken (in clock cycles) to execute an ADD, for example, equal to the time taken to execute an XADD? That is, are the digital circuits for each individual operation designed to consume the same number of clock cycles?

In other words, is the machine cycle time fixed?

Best Answer

In most cases, yes: the cycle time for each stage is fixed, with some exceptions depending on the processor. But the description you give is vastly oversimplified.

Modern processors are organised as pipelines, so that one stage of one instruction can execute at the same time as other stages of other instructions. While some processors use a 6-stage pipeline like the one you describe, they are a small minority. Most modern processors split the work into many more stages, each of which takes one cycle. For example, Intel Core processors of the current generation (at the time of writing) have 19 stages, each of which takes a single cycle. In some circumstances an instruction may skip one of them.

Usually, multiple instructions are in different stages simultaneously, but some events prevent other instructions from progressing (e.g. a branch misprediction, or no instruction being ready because it is waiting for data that has not been produced yet).

Also, a processor core may have multiple pipelines, so that multiple instructions run completely in parallel, and in some architectures not every pipeline can execute every instruction type. Instruction fetch and decode are shared between all pipelines, and in many cases can handle several instructions per cycle.

In modern processors based on CISC instruction sets such as Intel x86, instructions are translated into RISC-like micro-instructions before execution, so one program instruction may translate into multiple instructions in the pipeline (or vice versa). As a result, determining the actual performance in real-world situations is extremely difficult.
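To make the effect of pipelining concrete, here is a minimal sketch in Python of an idealised cycle count. It assumes a single in-order pipeline in which every stage takes exactly one cycle, and it lumps all stalls into one number. The function names and the `stall_cycles` parameter are made up for illustration, and the stall count in the last line is an arbitrary example rather than a measured penalty for any real processor.

```python
def unpipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles needed if each instruction finishes every stage
    before the next instruction is allowed to start."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int,
                     stall_cycles: int = 0) -> int:
    """Cycles needed in an idealised single pipeline.

    The first instruction needs n_stages cycles to pass through the
    pipeline; after that one instruction completes per cycle.
    stall_cycles (a hypothetical parameter) is the total number of
    cycles lost to bubbles, e.g. from mispredicted branches.
    """
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


print(unpipelined_cycles(100, 6))                 # 600
print(pipelined_cycles(100, 6))                   # 105
print(pipelined_cycles(100, 19))                  # 118
print(pipelined_cycles(100, 19, stall_cycles=54)) # 172
```

Even in this toy model, a deeper pipeline barely changes the total for a long instruction stream, while stalls add directly to it. Real cores with multiple pipelines and micro-instruction translation do not follow such a simple formula.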
