According to David Harris's presentation for the eve224a course (slides 6-11 and 47):
Delay d = f+p = g*h+p
where d is the process-independent delay, f is the effort delay (stage effort), p is the parasitic delay, g is the logical effort, and h is the electrical effort (fanout; h = C_out/C_in).
In the Wikipedia article "Logical Effort" there are some examples too:
Delay in an inverter. By definition, the logical effort g of an inverter is 1
Delay in NAND and NOR gates. The logical effort of a two-input NAND gate is calculated to be g = 4/3
For a NOT gate with FO1 (driving one identical NOT gate):
g=1; h=1; p=1; so d = 1*1 + 1 = 2
For a NOT gate with FO4 (the FO4 metric itself):
g=1; h=4 (C_out is 4 times C_in); p=1; so d = 1*4 + 1 = 5 (the same result is on page 20 of the book "Logical Effort: Designing Fast CMOS Circuits", draft from 1998)
So 1 FO4 delay is equal to 5 process-independent units (defined by Harris, slide 6)
For a two-input NAND gate (p=2) driving one identical gate:
g=4/3; h=1; p=2; d = 4/3 * 1 + 2 = 10/3 ≈ 3.33 (about 1.67 times slower than NOT with FO1, but faster than NOT with FO4)
For the NAND gate I asked about (2 inputs, driving 3 identical NANDs):
g=4/3; h=3; p=2; d= (some magic inside) 4/3 * 3 + 2 = 6
So
The delay of one FO4 inverter is equal to 5/6 of the delay of this NAND (2 inputs, fanout of 3).
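As a quick check of the arithmetic above, here is a minimal Python sketch of d = g*h + p. The helper name `stage_delay` is my own and not from the slides; exact fractions are used so that 4/3 does not get rounded.

```python
from fractions import Fraction

def stage_delay(g, h, p):
    """Delay of one stage in process-independent units: d = g*h + p."""
    return g * h + p

# Values of g, h and p are the ones used in the worked cases above.
inv_fo1 = stage_delay(Fraction(1), 1, 1)        # NOT driving one NOT
inv_fo4 = stage_delay(Fraction(1), 4, 1)        # NOT with fanout of 4
nand2_fo1 = stage_delay(Fraction(4, 3), 1, 2)   # NAND2 driving one NAND2
nand2_fo3 = stage_delay(Fraction(4, 3), 3, 2)   # NAND2 driving three NAND2s

print(inv_fo1, inv_fo4, nand2_fo1, nand2_fo3)   # 2 5 10/3 6
print(inv_fo4 / nand2_fo3)                      # 5/6
```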
The last problem is to convert the delay of a chain of 18 NANDs into FO4 delays (slide 41 of Harris).
Hmm.. it seems I only need to multiply the 18 NAND delays by 6/5: 18 * 6/5 = 21.6 FO4.
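The same conversion as a small Python sketch. It assumes every one of the stages is a 2-input NAND that sees a fanout of 3 identical NANDs, and it ignores wire capacitance; the function name `chain_delay_in_fo4` is hypothetical.

```python
from fractions import Fraction

G_NAND2 = Fraction(4, 3)   # logical effort of a 2-input NAND
P_NAND2 = 2                # parasitic delay of a 2-input NAND
FO4_UNITS = 1 * 4 + 1      # inverter with g=1, h=4, p=1 gives 5 units

def chain_delay_in_fo4(n_stages, fanout):
    """Delay of n identical NAND2 stages, expressed in FO4 inverter delays.

    Assumption: each stage drives `fanout` identical NAND2 inputs.
    """
    per_stage = G_NAND2 * fanout + P_NAND2   # d = g*h + p
    return n_stages * per_stage / FO4_UNITS

print(float(chain_delay_in_fo4(18, 3)))      # 21.6
```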
Thanks!
It really depends on how the gate is constructed. For a completely accurate simulation, you have to do a transistor-level analog simulation. However, it is possible to extract timing parameters from a transistor-level simulation and abstract them out a bit. The output rise and fall times and propagation delays will depend on the input rise and fall times, output load capacitance, power supply voltage, temperature, and the state of the inputs. Yes, it is possible for the same input transition to have a propagation delay that depends on the state of the other inputs. These techniques are used in the timing models for ASIC and FPGA design, in both static timing analysis and timing-driven place and route.
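To illustrate what "abstracting out" timing parameters can look like, here is a rough Python sketch of a delay lookup indexed by input slew and output load, with bilinear interpolation between characterized points. The table values are invented purely for illustration and do not come from any real cell library; the names `SLEWS_NS`, `LOADS_PF`, `DELAY_NS` and `lookup_delay` are hypothetical.

```python
# Hypothetical characterization data (made-up numbers, not a real library).
SLEWS_NS = [0.05, 0.20]      # input transition times, ns
LOADS_PF = [0.01, 0.05]      # output load capacitances, pF
DELAY_NS = [[0.10, 0.30],    # rows follow SLEWS_NS, columns follow LOADS_PF
            [0.15, 0.38]]

def lookup_delay(slew, load):
    """Bilinear interpolation of propagation delay inside the 2x2 table."""
    ts = (slew - SLEWS_NS[0]) / (SLEWS_NS[1] - SLEWS_NS[0])
    tl = (load - LOADS_PF[0]) / (LOADS_PF[1] - LOADS_PF[0])
    at_low_slew = DELAY_NS[0][0] * (1 - tl) + DELAY_NS[0][1] * tl
    at_high_slew = DELAY_NS[1][0] * (1 - tl) + DELAY_NS[1][1] * tl
    return at_low_slew * (1 - ts) + at_high_slew * ts

# Delay for a slew and load halfway between the characterized points.
print(lookup_delay(0.125, 0.03))
```

A real model would carry separate tables for rise and fall, for each input pin, and for each state of the other inputs, which is how the state dependence mentioned above gets captured.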
Fundamentally, the propagation delay is determined by how long it takes for the output to transition in response to a change at the input. This depends on exactly how the gate is built at the transistor level. For a single two-transistor CMOS inverter, the propagation delay is determined by the analog electrical characteristics of the transistors and their parasitic capacitance. The input will slew at some rate, then once the threshold is reached the output will start slewing. If the input changes before the output finishes slewing, then the output will start to slew back the other way and you will end up with a highly distorted output. So for a single inverter, the output for a change faster than the propagation delay would be an invalid logic level (i.e. x).
However, "gates" can be far more complicated than a single inverter. For example, if you have a "gate" that is built from a string of 100,000 inverters, then the propagation delay of the whole unit will be 100,000 times the propagation delay of a single inverter, and it is certainly possible to have multiple transitions 'in flight' at the same time, so long as these transitions are not faster than each individual inverter can handle.
Best Answer
I can only answer in the context of standard CMOS logic gates.
For NOR and NAND gates, as the number of inputs increases you also increase the number of transistors that are connected in series. NAND gates of \$N\$ inputs have \$N\$ NMOS transistors in series while NOR gates have \$N\$ PMOS transistors in series. The series transistors have essentially the same effect as series resistors...they increase the time required to change the voltage on the load capacitance. Now you could increase the width of the series transistors to compensate for this, but that would increase the input capacitance of the gate and just move the problem to the previous logic stage.
Increasing the number of inputs for a NAND or NOR gate also increases the number of transistors that are connected in parallel, with all of their drains connected to the logic gate output. This increases the internal parasitic capacitance of the gate, further exacerbating the slower transition time. More wiring is needed to connect all of these capacitors, so even more parasitic internal capacitance.
So, if the propagation delay is proportional to \$R\times C\$, and increasing the number of inputs increases both \$R\$ and \$C\$, then you could argue that the propagation delay is proportional to \$N^2\$. I don't think the relationship is quite that simple (not precisely \$N^2\$) but it is certainly worse than linear.
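A rough first-order sketch of that argument in Python, under assumptions of my own: each of the \$N\$ series transistors contributes the same resistance (no upsizing), the output node carries a fixed external load plus one drain capacitance per parallel transistor, and wiring capacitance is ignored. The function name and the default values are arbitrary units chosen only for illustration.

```python
def rc_delay(n, r_unit=1.0, c_load=1.0, c_drain=0.5):
    """First-order R*C delay estimate for an n-input NAND/NOR (arbitrary units)."""
    r = n * r_unit               # n series transistors, treated as series resistors
    c = c_load + n * c_drain     # fixed load plus n drains tied to the output
    return r * c

for n in (1, 2, 4, 8):
    # Ratio to the 1-input case: grows faster than n, slower than n**2.
    print(n, rc_delay(n), rc_delay(n) / rc_delay(1))
```

With a nonzero external load the product expands to a linear term plus an \$N^2\$ term, which is consistent with the delay growing worse than linearly without being exactly quadratic.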