It's not the fact that it's tri-state that makes the difference, it's the fact that it's a buffer. Buffers are generally designed to supply more current than a normal logic stage.
They still have a fanout limit; it's just a higher limit. What that limit is, you will have to learn from the datasheets for your logic family and your buffer.
A 74LS00 input can consume 20 uA pulled high, or 0.4 mA pulled low.
The 74LS00 gate can source 400 uA and sink 8 mA, 20x as much as the input current, so its fanout is 20.
But the 74LS240 driver can source 15 mA and sink 24 mA, so it can pull 60 TTL inputs low, and its fanout is therefore 60.
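The arithmetic above can be written out as a small sketch. The function name fanout and the choice of integer microamps are mine, for illustration only; the currents are the ones quoted above, so check them against the datasheet for your own parts.

```python
# DC input currents of one 74LS00 input, in microamps, as quoted above.
LS_INPUT_HIGH_UA = 20    # drawn by the input when it is pulled high
LS_INPUT_LOW_UA = 400    # supplied by the input when it is pulled low

def fanout(source_ua, sink_ua, in_high_ua=LS_INPUT_HIGH_UA, in_low_ua=LS_INPUT_LOW_UA):
    """Return (high-state fanout, low-state fanout, usable DC fanout)."""
    high = source_ua // in_high_ua   # inputs the output can hold high
    low = sink_ua // in_low_ua       # inputs the output can pull low
    return high, low, min(high, low)

gate_74ls00 = fanout(400, 8000)        # source 400 uA, sink 8 mA
buffer_74ls240 = fanout(15000, 24000)  # source 15 mA, sink 24 mA

print(gate_74ls00)
print(buffer_74ls240)
```

The usable figure is the smaller of the two states. For the buffer that is the low state, which is why the answer counts inputs pulled low.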
If one buffer isn't enough you can connect 2 or more in parallel (up to the normal fanout for your logic stage) each driving a different set of gates.
You might well ask why it can source 15 mA instead of just 60 * 20 uA = 1.2 mA. That's because buffers also have to drive relatively long wires with high capacitance, and doing so at any speed takes current in both directions (pulling both high and low). If you need 15 mA to drive the cable fast enough, you would also have to reduce the fanout (number of inputs it can pull low) or live with slightly lower speed.
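To put rough numbers on "takes current", here is a sketch using I = C * dV/dt. The 100 pF load, 3 V swing and edge times are hypothetical values I picked for illustration, not datasheet figures, and the helper name drive_current_ma is made up.

```python
def drive_current_ma(c_load_pf, v_swing, t_edge_ns):
    """Average current in mA needed to move c_load_pf through v_swing in t_edge_ns."""
    # I = C * dV/dt. With C in pF and t in ns the result comes out in mA,
    # because 1e-12 / 1e-9 = 1e-3.
    return c_load_pf * v_swing / t_edge_ns

# Hypothetical wiring: 100 pF of capacitance and a 3 V logic swing.
fast_edge_ma = drive_current_ma(100, 3.0, 20)    # 20 ns edge
slow_edge_ma = drive_current_ma(100, 3.0, 250)   # 250 ns edge

print(fast_edge_ma, slow_edge_ma)
```

Under these assumed values the 20 ns edge needs about 15 mA, while 1.2 mA only manages an edge of about 250 ns on the same wire.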
You might also ask why I'm using a logic family about as old as ABBA: that's because modern CMOS logic requires so little input current that any fanout number is likely to be in the thousands; other issues like the wire speed problem matter much more.
Fanout is still there in the background and shouldn't be totally forgotten, but any logic course that gives it much attention probably needs updating!
ECL outputs are referenced to the most positive supply rail. This means that any noise appearing on the most positive supply rail will be directly coupled onto the output signal. For example, if the power supply is 5V and GND, then all outputs would be referenced to 5V, and any noise on the 5V supply would also be seen at the ECL outputs. Therefore, the older literature calls for ECL chips to be powered by a negative power supply, such as -5V and GND. By using GND as the most positive supply rail, it is easier to maintain cleaner signals at the ECL outputs, because GND is generally found to be less noisy than a non-GND potential. This does not mean it is impossible to use a positive power supply. It does mean that precautions must be taken to ensure very little noise is coupled onto the most positive supply rail in order to maintain clean outputs.
What we need
As a rule, the authors of circuit textbooks willingly show us how circuits are made... and how to calculate them. But they frequently forget to tell us why they are made this way; so usually we have to find the explanation ourselves.
The challenge
It is a big challenge to reveal the "philosophy" behind these legendary complementary stages... to find the answer to the question, "Why are they made exactly this way?"... and to the more specific question here, "Why not a source follower?" I will try to do it in a human-friendly manner, without any special terms and definitions that impede the intuitive understanding at this initial stage.
The OP's idea
It really makes sense. He simply asks, "Why do we need to make a follower from two cascaded CMOS inverters (4 transistors in total) when a simpler circuit (a CMOS follower of only 2 transistors) exists?"
Really, it exists... and it is widely used in analog amplifiers... but here we are talking about digital circuits (logic gates). Let's consider what the difference between them is.
Follower vs inverter
Following vs amplification. The output voltage of the CMOS follower is a copy of the input voltage. So, if the input signal has poor (sloping) transitions, the output signal will also be poor.
In contrast, the CMOS inverter has significant gain during the switching because each of the transistors acts as a "dynamic load" to the other. As a result, the input signal is amplified and its transitions become steep. So the CMOS inverter improves the input signal.
The fact that logic gates are amplifiers makes it possible to build latches by introducing positive feedback (simply by connecting the output of the cascaded inverters to the input). It is impossible to do this with the source follower because its gain is less than one.
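The role of gain in the latch can be seen in a toy numerical sketch. Everything in it is an assumption made for illustration: the inverter is modeled as a smooth S-shaped curve with a made-up steepness, the "follower" is just a stage with gain 0.9 around mid-supply, and going around the loop is approximated by applying the stage function repeatedly. It is not a circuit simulation.

```python
import math

VDD = 5.0
MID = VDD / 2

def inverter(v_in, steepness=4.0):
    """Toy inverter: smooth S-curve with gain magnitude above 1 near mid-supply."""
    return VDD / (1.0 + math.exp(steepness * (v_in - MID)))

def two_inverters(v):
    """Two cascaded inverters: non-inverting, loop gain well above 1 near MID."""
    return inverter(inverter(v))

def weak_follower(v):
    """Hypothetical non-inverting stage with gain 0.9 around MID."""
    return MID + 0.9 * (v - MID)

def settle(stage, v_start, rounds=200):
    """Feed the output back to the input repeatedly and return where it ends up."""
    v = v_start
    for _ in range(rounds):
        v = stage(v)
    return v

print(settle(two_inverters, 2.6), settle(two_inverters, 2.4))
print(settle(weak_follower, 2.6), settle(weak_follower, 2.4))
```

In this model the inverter pair pushes a small offset from mid-supply all the way to the nearest rail and holds it there, which is the memory effect. With gain below one the offset just dies away, so nothing is stored.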
Output voltage drops. Analog circuits work in the middle range of the power supply (in active mode); their output voltage does not reach supply rails (ground and VDD). So, the voltage drops across the drain-source parts of both transistors can be significant... and they can be connected in a CMOS source follower configuration.
In contrast, digital circuits work close to supply rails; their output voltage is either 0 V (ground) or VDD (+5 V). This means that the voltage drops across the drain-source parts of both transistors should be almost zero... and they should be connected in a CMOS inverter configuration. So, the CMOS inverter provides voltage levels almost equal to the supply rails.
Input voltage thresholds. The voltage follower needs small voltage thresholds of the transistors since they determine the difference between the input and output voltage (i.e., here the voltage threshold is something undesired). That is why BJTs are more suitable for this configuration since their base-emitter voltage VBE (0.7 V) is a relatively small threshold.
In contrast, the complementary inverter needs significant voltage thresholds (but still < VDD/2) since both transistors switch close to the middle of the supply range (i.e., here the voltage threshold is desired). So, this topology cannot be implemented with BJTs because of their small thresholds. MOSFETs are more suitable for the inverting configuration because of their high gate-source threshold voltage Vth.
Biasing. Another problem of the follower is the absence of biasing. As a result, in a region of 2Vth, both transistors are cut off and the output is "floating". There is no such problem in the inverter, where at least one transistor is on.
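The 2Vth floating region can be sketched with an idealized model. The assumptions are mine: both transistors have the same threshold magnitude (1 V here, a made-up value), the body effect is ignored, and the output node has no load, so it keeps its voltage whenever both transistors are off.

```python
def follower_step(v_out, v_in, v_th=1.0):
    """Idealized CMOS follower: new output voltage for a given input.

    The NMOS pulls the output up only while v_in - v_out > v_th.
    The PMOS pulls the output down only while v_out - v_in > v_th.
    In between, both are off and the output keeps its old value.
    """
    lower = v_in - v_th   # the NMOS stops pulling up here
    upper = v_in + v_th   # the PMOS stops pulling down here
    return min(max(v_out, lower), upper)

# Drive the input between the rails of a 5 V supply, starting with the output at 0 V.
v_out = 0.0
trace = []
for v_in in [0.0, 5.0, 4.5, 0.0, 0.5]:
    v_out = follower_step(v_out, v_in)
    trace.append((v_in, v_out))

print(trace)
```

In this model a full-swing input gives an output that only reaches 4 V and 1 V, and input changes smaller than the threshold (5.0 to 4.5, or 0.0 to 0.5) leave the output where it was. That shows both the floating region and the degraded logic levels.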
Versatility. Cascaded inverters have another (great) advantage over the follower: they offer a second node that can serve as an additional input (or output). It is used in RS latches and RAM cells.