Doesn't get more complex than the question above. I've been looking through textbooks and searching Google, and I feel like the explanations are either insufficient or unclear. Can anyone explain this to me well?
Electronic – Can anyone explain to me how a carry select adder works
adder
Related Solutions
The worst case scenario for a Ripple-Carry Adder (RCA) is when the LSB generates a carry out, and the carry ripples through the entire adder from bit 0 to bit (N - 1). An example pattern would be 00000001 + 11111111. In adder terminology, bits 7-1 are "Propagators", and bit 0 is a "Generator". The critical path is from the carry-out of the LSB to the carry-out of the MSB, and every adder is in the critical path.
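The rippling described above can be modeled bit by bit. This is just a behavioral sketch in Python, not the original hardware; the helper names and the 8-bit width are my own for illustration:

```python
def full_adder(a, b, cin):
    """One full adder stage: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first); carry ripples bit 0 -> MSB."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

# Worst case from the text: 00000001 + 11111111 (bit lists are LSB first).
a = [1, 0, 0, 0, 0, 0, 0, 0]   # bit 0 is the "generator"
b = [1, 1, 1, 1, 1, 1, 1, 1]   # bits 1-7 are "propagators"
total, carry_out = ripple_carry_add(a, b)
# The carry ripples through every stage: result 0, final carry out of bit 7.
```

Every loop iteration here corresponds to one full adder waiting on the previous stage's carry, which is exactly why the critical path grows linearly with the width.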
The idea behind a Carry-Skip Adder (CSA) is to reduce the length of this critical path by giving the carry path a shortcut if all bits in a block would propagate a carry. A block-wide propagate signal is fairly easy to compute, and each block can calculate its own propagate signal simultaneously. So the worst case is still the same scenario, but what happens looks a bit different.
Let's say we still have the same kind of problem: 0000......001 + 0111.....111. The first block will generate a carry in the first bit and propagate it through bits 1, 2, and 3. At this point, the first block's carry-out signal is valid. The block propagate (skip-select) signals are already valid, since they take 2-3 gate delays while the carry signal takes 4. The carry-in multiplexer for bits 8-11 takes its carry straight from the carry-out of bit 3, since bits 4-7 would propagate a carry. Note that this takes 1 gate delay, whereas a normal RCA would take 4 gate delays. Each block adds only 1 gate delay to the carry signal.
If the MSB kills carry propagation, the last CSA block has to ripple the carry through its inputs, which takes another 4 gate delays. This setup of an LSB generate and an MSB kill is the new worst case. The source of the critical path is the same for the RCA and the CSA, but the critical path itself is different.
If an arbitrary block generates a carry by itself, that carry will always propagate to the next block. However, if the second block generates a carry itself, or kills the incoming carry, then that is the end of the critical path. Only if the second block propagates the carry do we see the advantage of the CSA architecture.
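The skip behavior can be sketched at the block level. This is a minimal behavioral model, assuming 4-bit blocks and LSB-first bit lists; all names are hypothetical and the mux's delay advantage is only described in the comments, since Python can't model gate timing:

```python
def full_adder(a, b, cin):
    """One full adder stage: returns (sum_bit, carry_out)."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)

def carry_skip_add(a_bits, b_bits, block=4):
    """Add LSB-first bit lists using fixed-size blocks with carry-skip muxes."""
    out, carry = [], 0
    for i in range(0, len(a_bits), block):
        ab, bb = a_bits[i:i + block], b_bits[i:i + block]
        # Block-wide propagate: every bit position would pass a carry along.
        # In hardware this is computed in parallel with the ripple, so it is
        # ready before the block's rippled carry-out is.
        p = all(a ^ b for a, b in zip(ab, bb))
        c = carry
        for a, b in zip(ab, bb):
            s, c = full_adder(a, b, c)
            out.append(s)
        # Skip mux: if the whole block propagates, the block's carry-out is
        # just its carry-in (1 gate delay instead of a full in-block ripple).
        carry = carry if p else c
    return out, carry

# LSB generate, then propagating blocks: 00000001 + 11111111 (LSB first).
total, carry_out = carry_skip_add([1, 0, 0, 0, 0, 0, 0, 0],
                                  [1, 1, 1, 1, 1, 1, 1, 1])
```

The result is the same as a plain ripple-carry add; the point of the skip mux is purely to shorten the carry path when a block's propagate signal is true.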
Also, when the term "critical path" is used, it generally implies a set of inputs that causes the worst-case delay. The scenarios you are providing are "ugly" cases that may have a large delay, but not the largest delay.
Let's say that we want to do a good job of testing this without going through the entire 2^32 space of possible operand pairs. (It is not possible for such an adder to have a bug that affects only a single combination of operands, so an exhaustive search of the 2^32 space would be inefficient.)
If the individual adders are working correctly, and the ripple propagation between them works correctly, then it is correct.
I would give priority to test cases which focus on stressing the carry rippling, since the adders have been individually tested.
My first test case would be adding 1 to 1111..1111 which causes a carry out of every bit. The result should be zero, with a carry out of the highest bit.
(Every test case should be tried over both commutations: A + B and B + A, by the way.)
The next set of test cases would add 1 to various "lone zero" patterns like 011...111, 1011...11, 110111..111, ..., 1111110. The zero should "eat" the carry propagation at that bit position, so that all bits in the result below that position are zero, and that bit and all higher bits are 1 (and, of course, there is no final carry out of the register).
Another set of test cases would add these "lone 1" power-of-two bit patterns to various other patterns: 000...1, 0000...10, 0000...100, ..., 1000..000. For instance, if such a pattern is added to the operand 1111...1111, then all bits from that bit position to the left should clear, and all the bits below it should be unaffected.
Next, a useful test case might be to add all of the 16 powers of two (the "lone 1" vectors), as well as zero, to each of the 65536 possible values of the opposite operand (and of course, commute and repeat).
Finally, I would repeat the above two "lone 1" tests with "lone 11": all bit patterns which have 11 embedded in 0's, in all possible positions. This way we are hitting the situations that each adder is combining two 1 bits and a carry, requiring it to produce 1 and carry out 1.
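The test plan above can be sketched as a small harness that checks an adder model against Python's native addition as the reference. The `adder_under_test` stub is a placeholder (you would substitute the real hardware/RTL model), and I've scaled the width down to 8 bits so the exhaustive inner loop stays quick; the answer's adder is 16-bit:

```python
WIDTH = 8                    # scaled down from 16 for this sketch
MASK = (1 << WIDTH) - 1

def adder_under_test(a, b):
    # Stand-in for the real design being verified; replace with the model.
    t = a + b
    return t & MASK, t >> WIDTH

def check(a, b):
    # Every test case is tried over both commutations: A + B and B + A.
    for x, y in ((a, b), (b, a)):
        s, cout = adder_under_test(x, y)
        assert (s, cout) == ((x + y) & MASK, (x + y) >> WIDTH), (x, y)

# 1 + all-ones: a carry out of every bit; result 0, final carry out 1.
check(MASK, 1)

# "Lone zero" patterns plus 1: the 0 bit should absorb the rippling carry.
for i in range(WIDTH):
    check(MASK ^ (1 << i), 1)

# "Lone 1" and "lone 11" patterns, plus zero, against every value of the
# other operand (the 11 patterns force sum = 1 with carry out = 1 stages).
patterns = ([1 << i for i in range(WIDTH)]
            + [0b11 << i for i in range(WIDTH - 1)]
            + [0])
for pat in patterns:
    for other in range(1 << WIDTH):
        check(pat, other)
```

At the full 16-bit width this is 33 patterns times 65536 operand values per direction, which is still tiny compared to the 2^32 exhaustive space.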
Best Answer
Consider the gate depth (worst-case delay path) for an N-bit Ripple-Carry Adder: roughly 3*N (ref), because it's just a chain of N full adders. The claim in its Wikipedia entry is that the Carry-Select topology's gate depth is on the order of sqrt(N). The advantage comes from "blocking" the computation: each block calculates parallel results, one under each assumption about the value of its carry-in. Each block can therefore do its sums and have stable results ready to choose from without waiting for the carry from the preceding block. Since every block is calculated in parallel, you only have to wait for the block sums to settle, and then for the effective carry outputs to propagate "ripple style" through the blocks' select multiplexers. For a 16-bit adder with 4-bit blocks, you suffer the worst-case path of 3 * sqrt(16) gate delays for each block to calculate all possible outcomes, plus an additional sqrt(16) delays for the block carries to propagate and make the correct selections, so you end up with something like 3 * sqrt(16) + sqrt(16) = 4 * sqrt(16) = O(sqrt(N)) delays. I think that's about right, give or take a constant multiple here or there. Good luck with your exam :).
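To make the "compute both outcomes, then select" idea concrete, here is a behavioral sketch in Python for a 16-bit adder with 4-bit blocks, matching the example above. The function names are my own; a real implementation computes the two block results in parallel hardware, which a sequential sketch can only indicate in comments:

```python
def block_add(a, b, cin, width=4):
    """One block's sum and carry-out for a given assumed carry-in."""
    t = a + b + cin
    return t & ((1 << width) - 1), t >> width

def carry_select_add(a, b, width=16, block=4):
    """Carry-select add: each block precomputes both possible results."""
    result, carry = 0, 0
    for i in range(0, width, block):
        ab = (a >> i) & ((1 << block) - 1)
        bb = (b >> i) & ((1 << block) - 1)
        # In hardware, BOTH of these are computed in parallel, before the
        # block's real carry-in has arrived from the preceding block.
        s0, c0 = block_add(ab, bb, 0, block)   # assume carry-in = 0
        s1, c1 = block_add(ab, bb, 1, block)   # assume carry-in = 1
        # The arriving carry then just steers a mux (1 selection delay)
        # instead of rippling through the block.
        s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << i
    return result, carry
```

Only the mux selections sit on the inter-block carry path, which is what turns the 3*N ripple depth into the roughly 4 * sqrt(N) figure worked out above.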