The worst-case scenario for a Ripple-Carry Adder (RCA) is when the LSB generates a carry out and that carry ripples through the entire adder from bit 0 to bit (N - 1). An example pattern would be 00000001 + 11111111. In adder terminology, bits 7-1 are "Propagators" and bit 0 is a "Generator". The critical path runs from the carry-out of the LSB to the carry-out of the MSB, and every full adder is on it.
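To make the generate/propagate terminology concrete, here is a minimal Python sketch that labels each bit position and measures the longest carry chain. The function names `classify_bits` and `longest_ripple` are made up for this illustration, and the chain length counts full-adder stages rather than real gate delays.

```python
def classify_bits(a: int, b: int, width: int) -> list[str]:
    """Label each bit position (index 0 = LSB) as Generate, Propagate, or Kill."""
    kinds = []
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        if x & y:
            kinds.append("G")   # 1 + 1: carry out regardless of carry in
        elif x ^ y:
            kinds.append("P")   # 0 + 1 or 1 + 0: carry out equals carry in
        else:
            kinds.append("K")   # 0 + 0: never a carry out
    return kinds


def longest_ripple(kinds: list[str]) -> int:
    """Longest chain of stages a single carry travels: one G followed by consecutive P's."""
    best = run = 0
    for k in kinds:
        if k == "G":
            run = 1             # a new carry starts here
        elif k == "P" and run:
            run += 1            # the live carry moves one stage further
        else:
            run = 0             # killed, or nothing to propagate
        best = max(best, run)
    return best


kinds = classify_bits(0b00000001, 0b11111111, 8)
print(kinds)                  # ['G', 'P', 'P', 'P', 'P', 'P', 'P', 'P']
print(longest_ripple(kinds))  # 8
```

For the example pattern the carry passes through all 8 stages, which is the worst case for an 8-bit RCA.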
The idea behind a Carry-Skip Adder (CSA) is to reduce the length of this critical path by giving the carry path a shortcut if all bits in a block would propagate a carry. A block-wide propagate signal is fairly easy to compute, and each block can calculate its own propagate signal simultaneously. So the worst case is still the same scenario, but what happens looks a bit different.
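As a rough illustration of the block-wide propagate signal, the sketch below ANDs together the per-bit propagate terms (a XOR b) inside each block. The name `block_propagates` and the 16-bit width with 4-bit blocks are assumptions chosen for the example, and the width is assumed to be a multiple of the block size.

```python
def block_propagates(a: int, b: int, width: int, block: int) -> list[bool]:
    """Per-block propagate flags, block 0 first (it holds the LSB).

    A block propagates only if every bit in it propagates, i.e. a_i XOR b_i == 1
    for all bits of the block. Assumes width is a multiple of block.
    """
    p = a ^ b                      # per-bit propagate terms
    mask = (1 << block) - 1        # selects one block's worth of bits
    return [((p >> lo) & mask) == mask for lo in range(0, width, block)]


# LSB generates, MSB kills, everything in between propagates.
print(block_propagates(0x0001, 0x7FFF, width=16, block=4))
# [False, True, True, False]
```

Every flag depends only on the operand bits, never on a carry, which is why all blocks can compute their flag at the same time.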
Let's say we still have the same problem of 0000......001 + 0111.....111, and the adder is split into 4-bit blocks. The first block will generate a carry in bit 0 and will propagate the carry through bits 1, 2, and 3. At this point, the first block's carry-out signal is valid. The block propagate (select) signals are already valid, since they take 2-3 gate delays to compute while the carry signal takes 4 gate delays. The carry-in multiplexer for bits 8-11 gets the carry signal from the carry-out of bit 3, since bits 4-7 would propagate a carry. Note that this takes 1 gate delay, while a normal RCA would take 4 gate delays. Each skipped block adds only 1 gate delay to the carry signal.
If the MSB killed carry propagation, the last CSA block could not be skipped and would have to ripple the incoming carry, which would take another 4 gate delays. This setup of an LSB generate and an MSB kill is the new worst case. The critical path starts at the same place in both the RCA and the CSA, but the route it takes is different.
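The delay bookkeeping above can be sketched as a very coarse model. It assumes one gate delay per bit when a carry ripples, one gate delay per skipped block, and equal-sized blocks; it ignores the sum logic and the first block's own multiplexer. The function names are made up for this illustration.

```python
def ripple_delay(n_bits: int) -> int:
    """Worst-case carry delay of a plain RCA: the carry visits every bit."""
    return n_bits


def carry_skip_delay(n_bits: int, block: int) -> int:
    """Worst-case carry delay under the coarse model: LSB generate, MSB kill.

    The carry ripples through the first block, skips every middle block at
    one gate delay each, then ripples through the last block.
    """
    n_blocks = n_bits // block
    if n_blocks < 2:
        return n_bits              # a single block is just an RCA
    return block + (n_blocks - 2) + block


for n in (16, 32):
    print(n, ripple_delay(n), carry_skip_delay(n, block=4))
# 16 16 10
# 32 32 14
```

Under these assumptions, doubling the width from 16 to 32 bits adds 16 gate delays to the RCA but only 4 to the carry-skip adder.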
If an arbitrary block generates a carry by itself, that carry will always be passed to the next block. However, if the second block generates a carry itself, or kills the carry, then that is the end of the critical path. If the second block propagates the carry, then we see the advantage of the CSA architecture.
Also, when the term "critical path" is used, it generally implies that you are considering the set of inputs that causes the worst-case delay. The scenarios you describe are "ugly" cases that may have a large delay, but not the largest delay.
Best Answer
Ripple carry and carry lookahead adders are combinatorial circuits - they do not hold state and they are not clocked. So counting clock cycles is meaningless, unless perhaps you are talking about some sort of pipelined implementation. Generally the whole point of using something like a carry lookahead adder is that the addition operation will be completed within a single clock cycle. With registers feeding the inputs and capturing the outputs, the question becomes what the fastest clock is at which the adder will work. To calculate this, all you have to do is find the worst-case propagation delay along the critical path and add the setup time and clock-to-output delay of the registers.
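As a small worked example of that calculation, the sketch below adds the three terms to get the minimum clock period and converts it to a maximum frequency. The function names and the delay values are made-up placeholders, not figures for any real device, and clock skew and jitter are ignored.

```python
def min_period_ns(clk_to_q_ns: float, logic_ns: float, setup_ns: float) -> float:
    """Minimum clock period: launch register delay + adder delay + capture setup time."""
    return clk_to_q_ns + logic_ns + setup_ns


def max_clock_mhz(clk_to_q_ns: float, logic_ns: float, setup_ns: float) -> float:
    """Fastest clock, in MHz, at which the registered adder still meets setup."""
    return 1000.0 / min_period_ns(clk_to_q_ns, logic_ns, setup_ns)


# Hypothetical numbers: 0.5 ns clock-to-output, 3.0 ns worst-case adder delay, 0.5 ns setup.
print(min_period_ns(0.5, 3.0, 0.5))   # 4.0
print(max_clock_mhz(0.5, 3.0, 0.5))   # 250.0
```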
Note that it is also possible to treat an adder (or any other combinatorial logic, for that matter) as a multicycle path. This means that the input will be held constant for N clock cycles, and the output will be captured on the last clock cycle. This allows the clock to run faster than the logic, at the expense of adding wait states. This is inferior to pipelining: in a pipelined setup you can get a new result every clock cycle, with a latency of one clock cycle per pipeline stage. However, it can be difficult to pipeline certain logic functions, and pipelining does require inserting extra registers, which will consume more area, power, etc.
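To put rough numbers on the throughput difference, here is a simplified count of results completed after a given number of clock cycles. It assumes the multicycle path captures one result every N cycles, and that an N-stage pipeline accepts a new input every cycle and delivers its first result after N cycles. The function names are invented for this sketch.

```python
def multicycle_results(cycles: int, n: int) -> int:
    """Results captured when inputs are held for n cycles per operation."""
    return cycles // n


def pipelined_results(cycles: int, stages: int) -> int:
    """Results from a full pipeline: first one after `stages` cycles, then one per cycle."""
    return max(0, cycles - stages + 1)


print(multicycle_results(100, 4))  # 25
print(pipelined_results(100, 4))   # 97
```

Both variants have the same latency of 4 cycles here; the pipeline wins only on throughput, and it pays for that with the extra registers.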