Electronic – Confusion over binary radix usage and formatting through FIR filter (and circuits in general)

binary, filter, fir, vhdl

I'm having a hard time getting my head around binary radixes, specifically when it comes to using them in a circuit. On their own I can understand them fine: for example, 2's complement, fixed point, BCD, etc.

This is where I'm getting confused.

I've been building a FIR filter in VHDL and have come to the point where I have to implement the coefficients.
Each coefficient is below 1 and is 9 bits wide. The numbers are signed fixed-point numbers: the lower 8 bits are the fractional part, and the 9th bit is the sign/integer bit.

Now my problem is: now that I have chosen a format (say, 8 bits for the fractional part of the number), does that mean every other number I input into the system has to follow the same radix, i.e. fixed point with 8 fractional bits?

What I'm being told is that when you input an impulse to the filter, the output should be each coefficient in order. When I use "0000000001" as the input then yes, I do get each coefficient on the output. But I don't understand how. I understand that a '1' is getting clocked through each stage and being multiplied with each coefficient on each clock, but it doesn't represent a "1" in the same format or radix as my coefficients. A true 1 would be "0100000000", as the lower 8 bits are fractional.

I'm having a hard time getting my head around the number side of the system, the structure, and how it's supposed to work.

Is there something wrong with my understanding?

Best Answer

Let's suppose you have a coefficient and a signal input value. If the coefficient has \$F_C\$ fraction bits and the input has \$F_I\$ fraction bits, then their product will have \$F_C + F_I\$ fraction bits. When you used 0000000001 to represent the integer 1, you implicitly set \$F_I = 0\$, so the products had the same format as the coefficients. Had you used 0100000000 (that is, 1.0 with \$F_I = 8\$) instead, each product would have been the same coefficient value, but expressed with \$8 + 8 = 16\$ fraction bits. So the input does not have to use the same format as the coefficients; you only have to keep track of where the binary point ends up. If you use fixed-point values that are \$\ge 1.0\$ then you will need bits to the left of the binary point to represent the integer part of the value. As with the fraction bits, the number of integer bits in the product will equal the sum of the numbers of integer bits in the multiplier and multiplicand.
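To make the bookkeeping concrete, here is a minimal sketch in Python that models the fixed-point words as plain integers. It is not VHDL and not the asker's design: the coefficient values and the helper names (`to_fixed`, `to_real`, `impulse_products`) are assumptions made up for illustration. Only the 8 fraction bits per coefficient come from the question.

```python
F_C = 8  # fraction bits in each coefficient (from the question)

def to_fixed(x, frac_bits):
    """Quantize a real number to an integer holding `frac_bits` fraction bits."""
    return round(x * (1 << frac_bits))

def to_real(n, frac_bits):
    """Interpret an integer as a fixed-point value with `frac_bits` fraction bits."""
    return n / (1 << frac_bits)

# Hypothetical coefficients, chosen only for illustration.
coeffs = [to_fixed(c, F_C) for c in (0.25, -0.5, 0.125)]

def impulse_products(impulse, f_i):
    """Multiply every coefficient by the impulse word.

    Returns the raw integer products and the number of fraction bits
    they carry, which is F_C + F_I.
    """
    return [c * impulse for c in coeffs], F_C + f_i

# Impulse written as the integer 1 (F_I = 0): products keep 8 fraction bits,
# so the raw output words are bit-for-bit the coefficients.
raw_int, frac_int = impulse_products(1, 0)

# Impulse written as 1.0 with 8 fraction bits (F_I = 8): products carry
# 16 fraction bits, i.e. each coefficient appears shifted left by 8.
raw_fix, frac_fix = impulse_products(to_fixed(1.0, 8), 8)

print(raw_int, frac_int)
print(raw_fix, frac_fix)
print([to_real(r, frac_fix) for r in raw_fix])
```

Both runs represent the same real values; only the position of the binary point in the output word differs. That is why the least-significant-bit impulse appears to "just work".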

When you add fixed-point values they must have the same number of fraction bits (i.e. their binary points must be aligned), and the sum will have the same number of fraction bits as the addends. If you don't have information about the actual range of values for the sum, then you need to assume that a carry can occur, so you need an additional bit to the left of the binary point to represent the integer part of the number. That is, you need one more integer bit in the sum than the maximum number of integer bits in either of the addends.
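The alignment and bit-growth rules can be sketched the same way, again in Python with plain integers. The operand values and the helpers `align` and `fits_signed` are hypothetical, and the word widths assume one sign/integer bit in front of the fraction bits, as in the question's coefficient format.

```python
def align(n, frac_bits, target_frac_bits):
    """Left-shift n so it carries `target_frac_bits` fraction bits.

    The represented value is unchanged; only the binary point moves.
    """
    assert target_frac_bits >= frac_bits
    return n << (target_frac_bits - frac_bits)

def fits_signed(n, width):
    """True if n is representable as a `width`-bit two's complement word."""
    return -(1 << (width - 1)) <= n < (1 << (width - 1))

# Two hypothetical addends holding the same value in different formats.
a, a_frac = 192, 8       # 0.75 with 8 fraction bits
b, b_frac = 49152, 16    # 0.75 with 16 fraction bits

# Align the binary points before adding.
a_aligned = align(a, a_frac, b_frac)   # now 16 fraction bits
total = a_aligned + b                  # 1.5 with 16 fraction bits

# Each addend fits in 17 bits (1 sign/integer bit + 16 fraction bits),
# but the sum needs one more integer bit.
print(fits_signed(b, 17), fits_signed(total, 17), fits_signed(total, 18))
```

In a FIR filter this growth repeats at every adder, so the accumulator is normally sized for the worst case over all taps rather than for a single addition.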