Electronic – Why long pipelines are preferred over high CPU clock

clock-speed cpu speed

Why do modern CPU vendors prefer increasing the maximum length of the pipeline over increasing the clock frequency?

According to my understanding, increasing either one increases the number of gate switches and so increases the energy dissipation.

So if both have the same effect, how is one better than the other?

Lengthening the pipeline even has a disadvantage: it may be hard to use the full pipeline length in all components of a CPU, so part of it is wasted.

Best Answer

That's because there is a limit to the clock frequency but not to the pipeline length.

When you design a digital circuit, the maximum clock frequency is determined by the so-called "critical path": you have a clocked register, some combinational logic and another clocked register. If your clock period is shorter than the time the combinational logic needs to output a valid result, then you have a problem. This requirement must be met for every possible input, and the logic usually contains one path that takes longer than all the others: the critical path. Have a look here:

[Schematic created using CircuitLab]

Note that I omitted the clock signal for clarity (and laziness).

Assuming that each NAND and each NOR has the same propagation delay \$t_D\$, and that AND1 is built from two NANDs and so has a propagation delay of \$2t_D\$, you can see that the longest path runs from REG1 (or REG2) to REG4 through AND1 and NOR2, for a total delay of \$T_{D_{max}}=3t_D\$. Your clock period can't be shorter than this, so $$f_{CK}\leq\frac{1}{T_{D_{max}}}$$
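If you want to play with the numbers, here is a minimal Python sketch of the same calculation. The netlist is an assumption: it only reproduces the path named above (REG1/REG2 through AND1 and NOR2 into REG4), with a hypothetical REG3 as the second NOR2 input, and is not the full schematic. Register setup and clock-to-Q times are ignored, as in the formula above.

```python
from functools import lru_cache

T_D = 1.0  # propagation delay of one NAND/NOR gate, arbitrary time units

# Hypothetical netlist: node -> (own delay, list of driving nodes).
# Registers are treated as zero-delay sources (setup/clock-to-Q ignored).
NETLIST = {
    "REG1": (0.0, []),
    "REG2": (0.0, []),
    "REG3": (0.0, []),                     # assumed second input of NOR2
    "AND1": (2 * T_D, ["REG1", "REG2"]),   # AND built from two NANDs
    "NOR2": (T_D, ["AND1", "REG3"]),       # output is captured by REG4
}


@lru_cache(maxsize=None)
def arrival_time(node):
    """Worst-case time at which the output of `node` becomes valid."""
    delay, inputs = NETLIST[node]
    # The slowest input decides when this gate can start to settle.
    return delay + max((arrival_time(i) for i in inputs), default=0.0)


t_d_max = arrival_time("NOR2")   # delay of the path ending at REG4
f_ck_max = 1.0 / t_d_max         # upper bound on the clock frequency

print(f"T_Dmax = {t_d_max} t_D, f_CK <= {f_ck_max:.3f} / t_D")
```

Adding gates to the dictionary and asking for the arrival time at each register input gives the critical path of a larger block in the same way.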

Now imagine a 64-bit floating point multiplier. Its critical path can be way, way longer than this. What can we do about that?

  1. Keep a low clock frequency, saving power and relaxing the constraints on some other, non-critical transistors
  2. Break the critical path with another D flip-flop

The second option would make the longest path shorter, thus allowing higher clock frequencies.
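As a rough illustration of that trade-off, the sketch below splits a combinational block into equal pipeline stages and reports the resulting clock limit. Everything in it is an assumption made for the example: the function name `pipeline_timing`, the perfectly even split, and the optional `reg_overhead` term standing in for the delay each extra flip-flop adds.

```python
def pipeline_timing(total_delay, stages, reg_overhead=0.0):
    """Clock limit of a block split into `stages` equal pipeline stages.

    total_delay  -- delay of the unpipelined critical path
    stages       -- number of pipeline stages (1 means no extra registers)
    reg_overhead -- assumed extra delay per stage from the added register
    """
    if stages < 1:
        raise ValueError("stages must be at least 1")
    period = total_delay / stages + reg_overhead  # shortest usable clock period
    return {
        "period": period,
        "f_max": 1.0 / period,        # one result can leave per clock period
        "latency": stages * period,   # time for one input to cross all stages
    }


for n in (1, 2, 4, 8):
    t = pipeline_timing(total_delay=12.0, stages=n, reg_overhead=0.5)
    print(n, t["period"], round(t["f_max"], 3), t["latency"])
```

With a non-zero `reg_overhead` the maximum clock frequency still rises with every added stage, but by less each time, while the latency of a single operation gets worse.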

You might say: well then, why don't they just keep shortening the critical paths and raising the clock frequency? That's because transistors only work well up to a certain frequency that cannot be exceeded. Past that point you can't increase the clock frequency anymore, but you can still add pipeline stages so that more computations are in progress at the same time. Moreover, distributing a faster clock is way more difficult than distributing a somewhat slower one, and compilers nowadays can make very, very smart use of the pipeline so that no clock cycle is wasted.

I'd add that a modern superscalar processor, such as the one heating the environment in your (and my) PC, is way, way more complicated than a "pipeline vs frequency" battle. I suggest this site: there are some white papers and quite a number of talks about a new architecture, so the guy makes a lot of comparisons with a modern CPU. And a plus: the speaker is Gandalf.