I'm looking at implementing a distributed arithmetic architecture, but I'm running into some trouble.
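From the textbook I'm looking at, we can rewrite the unsigned convolution of two vectors c and x as

$$y = \sum_{n=0}^{N-1} c[n]\,x[n] = \sum_{b=0}^{B-1} 2^b \sum_{n=0}^{N-1} c[n]\,x_b[n] = \sum_{b=0}^{B-1} 2^b f(c[n], x_b[n]), \qquad x[n] = \sum_{b=0}^{B-1} x_b[n]\,2^b,$$
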
where the function, f, is a LUT that gives the partial sum. It is used as shown in the following example:
However, when it comes to implementation, we don't want to use a barrel shifter to shift b times every iteration, so the textbook suggests doing the following:
And I believe that this is implemented as shown in the following circuit
But I fail to understand how this circuit and the above example would yield the same result. Let us suppose the same scenario as in the example, i.e.:
c[0] = 2
c[1] = 3
c[2] = 1
and
x[0] = 1
x[1] = 3
x[2] = 7
Then, following the circuit, computing the convolution of x and c gives the following calculation (where t is the iteration of the shift-adder):
t = 0 : y = (0/2) + 6 = 6
t = 1 : y = (6/2) + 4 = 7
t = 2 : y = (7/2) + 1 = 4
and obviously this answer is wrong, since x * c = (2)(1) + (3)(3) + (1)(7) = 18.
So where have I misinterpreted the circuit? What is the problem? Thanks for any help.
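
To make the discrepancy concrete, here is a minimal Python sketch of both calculations for the values above. It is only a model of the arithmetic, not of the textbook's circuit: the function names (`build_lut`, `bit_plane_address`, `da_direct`, `da_truncating`) are hypothetical, and it assumes the LUT is addressed with bit b of every x[n], least significant bit plane first, with the shifted-out bit discarded exactly as in the hand calculation.

```python
def build_lut(c):
    """Precompute f for every address: entry a is the sum of c[n] for each set bit n of a."""
    n = len(c)
    return [sum(c[k] for k in range(n) if (a >> k) & 1) for a in range(1 << n)]


def bit_plane_address(x, b):
    """Pack bit b of every x[n] into one LUT address (x[n] drives address bit n)."""
    return sum(((xn >> b) & 1) << k for k, xn in enumerate(x))


def da_direct(c, x, bits):
    """Weighted form: sum over b of 2^b * f, i.e. the version that needs a barrel shifter."""
    lut = build_lut(c)
    return sum(lut[bit_plane_address(x, b)] << b for b in range(bits))


def da_truncating(c, x, bits):
    """Shift-adder as calculated by hand above: acc = acc // 2 + f, dropping the shifted-out bit."""
    lut = build_lut(c)
    acc = 0
    trace = []
    for b in range(bits):
        acc = (acc >> 1) + lut[bit_plane_address(x, b)]
        trace.append(acc)
    return acc, trace


c = [2, 3, 1]
x = [1, 3, 7]
lut = build_lut(c)
partial_sums = [lut[bit_plane_address(x, b)] for b in range(3)]

print(partial_sums)            # [6, 4, 1]
print(da_direct(c, x, 3))      # 18
print(da_truncating(c, x, 3))  # (4, [6, 7, 4])
```

The LUT outputs 6, 4 and 1 are the same in both versions; only the way they are accumulated differs, which is where the 18 versus 4 mismatch comes from.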
Best Answer
(Via Reddit)
bunky_bunk:
You do not treat bit 0 of the accumulator as the least significant bit. Instead, you keep a certain number of bits that would otherwise be shifted out and discarded. Here the 4 bits before the underscore correspond to your equation, but all 6 bits give the result 18 in the end.
source: https://old.reddit.com/r/FPGA/comments/jz5v2h/implementation_of_distributed_arithmetic/gda0had/
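
The accumulator described in the answer can be sketched in Python as follows. This is an illustration under stated assumptions rather than the textbook's exact circuit: the register is assumed to be widened by B - 1 = 2 guard bits below the original bit 0, giving the 4 + 2 = 6 bits the answer mentions, and the names `da_shift_add` and `show` are hypothetical helpers.

```python
def da_shift_add(f_values, bits):
    """Shift-adder that keeps the bits a plain right shift would discard.

    f_values[b] is the LUT output for bit plane b, least significant plane first.
    The register is extended by bits - 1 guard bits (an assumption of this sketch).
    """
    guard = bits - 1
    acc = 0
    states = []
    for f in f_values:
        # Halve the running sum, then add the new partial sum above the guard bits.
        acc = (acc >> 1) + (f << guard)
        states.append(acc)
    return acc, states


def show(acc, int_bits, guard):
    """Format the register as integer bits, an underscore, then the guard bits."""
    s = format(acc, f"0{int_bits + guard}b")
    return s[:int_bits] + "_" + s[int_bits:]


acc, states = da_shift_add([6, 4, 1], bits=3)
rendered = [show(s, int_bits=4, guard=2) for s in states]

for t, r in enumerate(rendered):
    print(f"t = {t} : {r}")
# t = 0 : 0110_00
# t = 1 : 0111_00
# t = 2 : 0100_10
print(acc)  # 18
```

The 4 bits before the underscore read 6, 7, 4, matching the hand calculation in the question, while the full 6-bit register ends at 010010 = 18. Because each partial sum enters two places above the new least significant bit, nothing nonzero is shifted out over the three iterations.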