I'm going to list a bunch of "filters that don't overshoot".
I hope you'll find this partial answer better than no answer at all.
Hopefully people looking for "a filter that doesn't overshoot" will find this list of such filters helpful.
Perhaps one of these filters will work adequately in your application, even if we haven't found the mathematically optimum filter yet.
first and second order LTI causal filters
The step response of a first order filter ("RC filter") never overshoots.
The step response of a second order filter ("biquad") can be designed such that it never overshoots.
There are several equivalent ways of describing this class of second-order filter that doesn't overshoot on a step input:
- it is critically damped or it is overdamped.
- it is not underdamped.
- the damping ratio (zeta) is 1 or more
- the quality factor (Q) is 1/2 or less
- the decay rate parameter (alpha) is at least the undamped natural angular frequency (omega_0)
In particular, a unity gain Sallen–Key filter topology with equal capacitors and equal resistors is critically damped: Q = 1/2 , and therefore does not overshoot on a step input.
A second-order Bessel filter is slightly underdamped: Q = 1/sqrt(3) , so it has a little overshoot.
A second-order Butterworth filter is more underdamped: Q = 1/sqrt(2) , so it has more overshoot.
Out of all possible first-order and second-order LTI filters that are causal and do not overshoot, the ones with the "best" (steepest) frequency response are the "critically damped" second-order filters.
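To put numbers on the three cases above, here is a minimal sketch using the textbook overshoot formula for the standard second-order low-pass H(s) = w0^2 / (s^2 + (w0/Q) s + w0^2). The function name `step_overshoot` is mine, not from any library.

```python
import math

def step_overshoot(q):
    """Fractional overshoot of the unit-step response of the second-order
    low-pass H(s) = w0^2 / (s^2 + (w0/q) s + w0^2). Independent of w0."""
    zeta = 1.0 / (2.0 * q)          # damping ratio
    if zeta >= 1.0:                 # critically damped or overdamped
        return 0.0
    # Height of the first peak above the final value, underdamped case.
    return math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta * zeta))

for name, q in [("critically damped", 0.5),
                ("Bessel", 1.0 / math.sqrt(3.0)),
                ("Butterworth", 1.0 / math.sqrt(2.0))]:
    print(f"{name:18s} Q = {q:.4f}  overshoot = {100.0 * step_overshoot(q):.2f} %")
```

The Bessel case comes out at roughly 0.4 % and the Butterworth case at roughly 4.3 %, while anything with Q of 1/2 or less gives exactly zero.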
higher-order LTI causal filters
The most commonly-used higher-order causal filter that has an impulse response that is never negative (and therefore never overshoots on a step input) is the "running average filter", also called the "boxcar filter" or the "moving average filter".
Some people like to run data through one boxcar filter, and the output from that filter into another boxcar filter.
After a few such filters, the result is a good approximation of the Gaussian filter.
(The more filters you cascade, the closer the final output approximates a Gaussian, no matter what filter you start with -- boxcar, triangle, first-order RC, or any other -- because of the central limit theorem).
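A small sketch of the cascaded boxcar idea, in plain Python so it runs anywhere. The helper names and the choice of width 5 and 4 stages are arbitrary illustrations. Since every tap stays non-negative, the step response (the running sum of the taps) can only rise towards 1 and never passes it.

```python
def convolve(a, b):
    """Direct (full) convolution of two tap lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def cascaded_boxcar(width, stages):
    """Impulse response of `stages` identical unity-gain boxcar filters in series."""
    box = [1.0 / width] * width
    kernel = box
    for _ in range(stages - 1):
        kernel = convolve(kernel, box)
    return kernel

def step_response(kernel):
    """Running sum of the taps = response of the FIR filter to a unit step."""
    total, out = 0.0, []
    for h in kernel:
        total += h
        out.append(total)
    return out

kernel = cascaded_boxcar(width=5, stages=4)   # bell-shaped, roughly Gaussian
resp = step_response(kernel)
print(["%.3f" % h for h in kernel])
print("largest step-response value:", max(resp))
```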
Practically all window functions have an impulse response that is never negative, and so in principle can be used as FIR filters that never overshoot on a step input.
In particular, I hear good things about the Lanczos window, which is the central (positive) lobe of the sinc() function (and zero outside that lobe).
A few pulse shaping filters have an impulse response that is never negative, and so can be used as filters that never overshoot on a step input.
I don't know which of these filters is the best for your application, and I suspect the mathematically optimum filter may be slightly better than any of them.
non-linear causal filters
The median filter is a popular non-linear filter that never overshoots on a step-function input.
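A minimal sketch of a causal sliding median, to show why it cannot overshoot: every output sample is a median of input samples, so it always lies between the smallest and largest input in the window. The function name and the window length of 5 are my own choices for illustration.

```python
from statistics import median

def median_filter(samples, window=5):
    """Causal sliding median; the first few outputs use a shorter window."""
    out = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        out.append(median(samples[start:i + 1]))
    return out

step = [0.0] * 10 + [1.0] * 10
filtered = median_filter(step, window=5)
print(filtered)   # the edge is delayed, but the output never leaves [0, 1]
```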
EDIT: LTI noncausal filters
The function sech(t) = 2/( e^(-t) + e^t ) is its own Fourier transform (up to a rescaling of the axes: the transform of sech(t) is again a sech), and I suppose could be used as a kind of non-causal low-pass LTI filter that never overshoots on a step input.
The non-causal LTI filter that has the (sinc(t/k))^2 impulse response has a "abs(k)*triangle(k*w)" frequency response.
When given a step input, it has a lot of time-domain ripple, but it never overshoots the final settling point.
Above the high-frequency corner of that triangle, it gives perfect stop-band rejection (infinite attenuation).
So in the stop band region, it has better frequency response than a Gaussian filter.
Therefore I doubt the Gaussian filter gives the "optimal frequency response".
In the set of all possible "filters that don't overshoot", I suspect there is no one single "optimal frequency response" -- some have better stop-band rejection, while others have narrower transition bands, etc.
You can notionally build as many stages as you want with a single amplifier, and AFAIR I have seen a 5 stage design implemented just to make the point BUT it becomes increasingly hard to "realise" (= construct) as you add stages around a single amplifier. To obtain the correct ratios of components requires increasingly precise component values and increasingly stable components. Capacitors are hard to get with extremely high precision and resistors are only slightly better. For a two stage or 3 stage design you can in most cases manage with 1% parts. Beyond that, the fun begins.
Note: "Pole" used generally here rather than saying "pole or zero as is applicable ..." in each case.
While you will notionally get the same result from a bandpass filter by cascading stages in any order, you will find that in limiting cases aspects such as stage Q and signal magnitude will have some effect. The same applies to stage order in a multiple stage low or high pass.
Your circuits are unusual in separately providing gain for the amplifier. This is acceptable, but the norm is to use a unity gain buffer in this application - amplifier Vout connected to amplifier inverting input. The addition of gain will also affect filter Q and you will end up not realising a classic filter polynomial if you alter the gain - assuming the designer implemented a 'proper' filter in the first place. In the case of the multipole design, varying the gain arbitrarily as shown will influence the "shape" of the resultant response rather than just its amplitude.
For one and two pole designs that need a unity gain buffer, you can use a 1 transistor emitter follower with usually acceptable results. As shown below, the results with a transistor with relatively low gain are inferior to results usually available from an opamp, but can still be very useful.
The above diagram is from this extremely good page -
Elliott Sound products: Active filters - Characteristics, Topologies, Examples
Lots more on the above, and related, here - Gargoyle search.
Best Answer
\$\color{red}{\text{I made a mistake and I need to redo parts of the answer!}}\$ My apologies, however, the mistake is not that bad, since not all the answer is wrong. It bugged me why the group delay is not flat for a 13th order, or why the impulse response doesn't have a minor dip (since Bessel is not a Gaussian). I realized the mistake was already hinted at when I said I used the unsorted poles for the Bessel, which caused the transfer function to go wrong (though not by much). This means that I need to redo the part of the answer that deals with the comparison between the Bessel and the pole placement method; everything else is fine. Again, I am sorry for the mistake. Feel free to downvote, if you will.
Similar, though you'll not get away with just that: there is no closed form solution, since they're found only based on the Bessel polynomials (i.e. root-finding). The poles are placed on an ellipse, as Andy mentions, but with an offset in the right-hand side. Here's for N=13 for example (upper half):
Still, since the generating polynomial is fixed, i.e. only frequency scaling is needed, then the poles are also fixed and can be generated a priori, for a table (as an easier solution).
For clarification, here's the generating polynomial:
$$s^{13}+91s^{12}+4095s^{11}+120120s^{10}+2552550s^9+41351310s^8+523783260s^7+5237832600s^6+41247931725s^5+252070693875s^4+1159525191825s^3+3794809718700s^2+7905853580625s+7905853580625$$
and here are the poles (unsorted):
for comparison, the poles of a Chebyshev with 0.01dB ripple:
Also, for comparison, the Bessel poles with a circle and the Chebyshev poles, both scaled for better comparison:
Note that the ellipse in the case of Chebyshev is aligned with the greater axis along the Y-axis, while the Bessel poles align themselves with the smaller axis on the Y-axis, while also having an offset.
I remember one book claiming that the poles lie on a circle, displaced to the right, and that they share the same angles as the Butterworth, but projected onto this circle. I used now an N=35 (odd for the extra, single real pole), with a circle, with proportional X and Y axis, but still scaled for better comparison:
The circle is scaled (both X and Y) by 37, and displaced to the right by 37-max(realpart(sBessel)). As you can see, the curves differ. One time I tried what you're asking now, by trying to approximate with a 90° rotated cosh() curve -- close, but no cigar, as they say. Here's a comparison:
I simply resigned and, years after, this question was asked on dsp.se (warning: lengthy post). I'm afraid that, sometimes, there just is no holy grail. In this case, you're stuck with the generating formula for the polynomial:
$$a_k=\frac{(2N-k)!}{2^{N-k}k!(N-k)!}$$
which can get "fluffy" for \$k\rightarrow 0\$, so the recursive one can get you a bit further, but with minor rounding issues:
$$\frac{a_{k+1}}{a_k}=\frac{2(N-k)}{(2N-k)(k+1)}$$
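As a quick sanity check, both forms can be sketched in a few lines of Python. With Python's arbitrary-precision integers neither form rounds; the "fluffy" factorials and the rounding issues only bite in floating point or fixed-width integers. The function names are mine.

```python
from math import factorial

def bessel_coeffs_closed(n):
    """a_k = (2N-k)! / (2^(N-k) k! (N-k)!), k = 0..N, lowest power of s first."""
    return [factorial(2 * n - k) // (2 ** (n - k) * factorial(k) * factorial(n - k))
            for k in range(n + 1)]

def bessel_coeffs_recursive(n):
    """Same coefficients from a_{k+1} = a_k * 2(N-k) / ((2N-k)(k+1))."""
    a = [factorial(2 * n) // (2 ** n * factorial(n))]   # a_0
    for k in range(n):
        # The division is exact because a_{k+1} is an integer.
        a.append(a[-1] * 2 * (n - k) // ((2 * n - k) * (k + 1)))
    return a

coeffs = bessel_coeffs_closed(13)
print(coeffs[::-1])   # highest power first, to compare with the polynomial above
```

Here only a_0 still needs the big factorial in the recursive version; everything after that is a multiply and a divide per coefficient.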
From then on it's the root-finding algorithm of your choice. Or, as I said, you can make tables, for example here's the complete roots of up to N=20, in double precision. Note: these are unscaled, i.e. calculated for delay, not frequency(!):
I doubt you'll need more, but, if you do, I can copy-paste. The tables are very useful when you have memory to spare, as opposed to cycles. All you need from now on is frequency scaling, as these will get you nice 2nd order stages.
Update: Andy's post reminded me that I once concocted a frequency scaling formula (from the then zunzun.com, now defunct, sadly), but it works decently. For example, for -3dB point, a sweep from N=2 to 32 gives the difference between the first and the last trace of ~0.31dB, and ~0.0125dB between adjacent traces. It's not perfect, but it works:$$\omega_{scale}(A_{sc})=8091309.68544832\exp\left[-0.5\left(0.09397449321551755(\ln{N}-8.03901973218457)^2+0.009140987415805315\left(\ln{A_{sc}}-54.61336204495193\right)^2\right)\right]+0.02602784079436049$$
where Asc is the attenuation, in dB, at fc and N the order. As a small example, for the same 13th order and 3dB, the scaling would be |H(j4.13082549938354)|, while the formula says |H(j4.125564879197584)|, which gives -3.0025dB (0.7077408150981647). It's not limited to only 3dB: if you want 1.57dB, then \$\omega_{scale}\$ should be 2.99434327282329, while the formula says \$\omega_{scale}\$=3.001850652953856, which results in -1.577946667040319dB (0.8338782890589183). I say it's not bad.
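For comparison with the fitted formula, the "exact" scaling frequency can be found numerically straight from the polynomial. This is only a sketch under two assumptions of mine: the unscaled (delay-normalized) transfer function is H(s) = a_0 / P(s), and its magnitude falls monotonically with frequency, so plain bisection is enough. All names here are hypothetical helpers.

```python
import math
from math import factorial

def bessel_coeffs(n):
    """Coefficients a_0..a_N of the generating polynomial, lowest power first."""
    return [factorial(2 * n - k) // (2 ** (n - k) * factorial(k) * factorial(n - k))
            for k in range(n + 1)]

def magnitude(n, w):
    """|H(jw)| for the unscaled low-pass H(s) = a_0 / P(s)."""
    a = bessel_coeffs(n)
    s = complex(0.0, w)
    p = complex(0.0, 0.0)
    for c in reversed(a):           # Horner evaluation of P(jw)
        p = p * s + c
    return a[0] / abs(p)

def omega_scale(n, atten_db, lo=0.0, hi=50.0, iters=100):
    """Angular frequency where the attenuation reaches atten_db (bisection).
    Assumes the attenuation is reached somewhere below `hi`."""
    target = 10.0 ** (-atten_db / 20.0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if magnitude(n, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(omega_scale(13, 3.0))    # compare with the 3 dB value quoted above
print(omega_scale(13, 1.57))   # compare with the 1.57 dB value quoted above
```

Whether "3 dB" is taken as exactly 3 dB or as 1/sqrt(2) changes the result slightly, so expect agreement with the quoted values only to a few digits.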
Update: I just tried what you propose, that is, to compare the Bessel with the pole placement on the circle, as the document from analog.com, and wherever elsewhere I read, say. First, since I already have N=13 above, I made the example for N=13. Second, I scaled the Bessel poles to match the X-axis.
Since the imaginary parts of the poles are separated by 2/n and the poles are placed on a (non-displaced) circle, all you have to do is generate a list based on that, and the real part is simply \$\Re=\sqrt{1-\Im^2}\$:
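A minimal sketch of that pole placement. Two things here are my assumptions rather than something stated above: the imaginary parts are centred on zero, running from -(n-1)/n to +(n-1)/n, and the real parts are taken negative so the poles sit in the left half-plane (mirror them if you plot against the right half of the circle). The name `pp_poles` is hypothetical.

```python
import math

def pp_poles(n):
    """Approximate 'Bessel-like' poles: imaginary parts 2/n apart, all on the unit circle."""
    poles = []
    for k in range(n):
        im = (2 * k - (n - 1)) / n          # -(n-1)/n ... +(n-1)/n in steps of 2/n
        re = -math.sqrt(1.0 - im * im)      # unit circle, left half-plane (assumption)
        poles.append(complex(re, im))
    return poles

for p in pp_poles(13):
    print(f"{p.real:+.6f} {p.imag:+.6f}j")
```

No root-finding is involved, which is the whole appeal: one square root per pole.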
And this is how both poles look like compared to the unit circle. From here on, Bessel is blue.
Next, make the transfer functions and compare them. I applied frequency scaling to both so they have -3dB@1Hz: Bessel=3.277105084487313, p.p.=0.3193551457708009. Interesting enough, the reciprocal of each is close to the other (less so on lower orders). Notice that Bessel has a steeper rolloff. In addition, the magnitude according to the long polynomial above is also plotted as the dashed green line; since it overlaps with the blue one, the blue one is kept as reference from now on. (wrong: https://i.stack.imgur.com/7PtRa.png)
and the difference between them (it shows zero towards the end because of numerical inaccuracies, given the huge numbers in the original Bessel polynomial -- no longer the case, the transfer function is made up of 2nd order sections, made up of sorted out poles.):
(wrong: https://i.stack.imgur.com/VEbgY.png)
Then, the phases. Not surprisingly, differences due to the different rolloff:
(wrong: https://i.stack.imgur.com/EH5z1.png)
and the difference:
(wrong: https://i.stack.imgur.com/VQIA8.png)
And the most important, the group delay (Bessel is truncated for the same reasons as above). Note that the p.p. method has lower delay, due to the slower rolloff, but also that it is not as flat as Bessel: (wrong: https://i.stack.imgur.com/yWv7g.png)
and the difference (both normalized to 1, for easier comparison):
(wrong: https://i.stack.imgur.com/lVjQI.png)
Update: The flatness of the group delay can be verified with the derivative:
Conclusion: the pole placement is not a Bessel response, but it comes reasonably close, so if you don't mind the differences, this is a very convenient and, perhaps most importantly, cheap way to generate the poles by avoiding the expensive root-finding algorithms. Note, however, that I only used N=13 for this, so, for an attempt at completion's sake, here's what the differences for N=5 look like, in order: magnitude, phase, group delay and, as an update, flatness of group delay: (wrong: https://i.stack.imgur.com/YVA1j.png, https://i.stack.imgur.com/gNKCc.png, https://i.stack.imgur.com/gqegm.png)
As a minor addendum, here are the impulse responses of the two 5th order, with the same Bessel=blue (using the -3dB frequency scaled versions): (wrong: https://i.stack.imgur.com/BG4MF.png)
[I added this part at the end]
Well, you've opened up an old wound, congratulations. I thought about modifying the Bessel poles (blue) by projecting them onto the unit circle along the X-axis, so they will lose whatever curve they normally have and convert them, forcefully (red). In addition, by comparison, the black squares are the p.p. method.
and the magnitudes of the Bessel (blue) near the converted Bessel (dashed magenta) and p.p. (red) -- for some reason, the rolloff is slower for p.p., I must have some typo somewhere, I won't find it today: (wrong: https://i.stack.imgur.com/vOmao.png)
All for N=13, and the results are consistent for 5, 9, 25, and so on, it seems. The conclusion remains: not Bessel, but close. Pick your choice.
This should be the last edit (before I go over the rabbit hole's event horizon) to address the issue of the displacement of the underlying circle. Doubts crept up so I wanted to clear this out. From previous pictures, it's clear it's not a circle, it's not some cosh(), but something else, but numbers are clearer than pictures, so I started a reductio ad absurdum: what if it is a circle? Then it should be scaled and displaced. Here's the basic idea:
The circle of radius OM (grey, dashed) is the unit circle, and the blue one of radius MQ would be the underlying circle. At point C is a pole whose coordinates are known. OM is also known, so \$AM=1-\Re(C)\$ and \$AC=\Im(C)\$ => the red angle (ignore the reading), \$\alpha=\arctan\frac{AM}{AC}\$, while the green angle, \$\beta=\frac{\pi}{2}-\alpha=\arctan\frac{AQ}{AC}\$ => \$MQ=AM+AC\tan\left(\frac{\pi}{2}-\arctan\frac{AM}{AC}\right)\$, with the displacement being a simple subtraction.
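The construction above can be sketched as follows. I am assuming, without the figure to check against, that the underlying circle passes through M = (1, 0) and has its centre on the real axis. Under that assumption the expression for MQ simplifies to AM + AC^2/AM, which is the chord of that circle along the real axis, so the sketch also reports MQ/2. The function names are mine, and the pole is taken in the same orientation as the drawing, with C off both axes.

```python
import math

def chord_mq(re, im):
    """MQ = AM + AC*tan(pi/2 - arctan(AM/AC)) for a pole C = (re, im), with M = (1, 0)."""
    am = 1.0 - re
    ac = abs(im)                      # must be non-zero: poles on the real axis are excluded
    return am + ac * math.tan(math.pi / 2.0 - math.atan(am / ac))

def circle_radius(re, im):
    """Radius of the circle through M and C centred on the real axis (assumption: MQ is its diameter)."""
    return chord_mq(re, im) / 2.0

# Sanity check with a point that really is on the unit circle:
print(chord_mq(0.6, 0.8), circle_radius(0.6, 0.8))
```

If the poles really sat on one circle, every pole would return the same value; feeding in the scaled Bessel poles is what exposes the spread.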
But, for N=13, calculating the radius of the underlying circle should be the same for all the poles, yet it isn't. Here's the values for all the radii for the positive imaginary parts of the poles that are not on the X-axis or Y-axis (5 poles, see 1st picture):
[2.882849152139202,2.896382080602158,2.920436970785266,2.957894588441385,3.014834976457822]
[1.448191040223894,1.460218485342188,1.478947294206138,1.507417488214167,1.553654588674449]
And here is a graphical representation of what the underlying circles would be for each of these poles. Note that, not only they don't overlap, but only one of each pole lies on a circle at a time (of course, the one after the circle has been calculated):
This should be the proof that, however flat the group delay is from the pole placement approximation, it's just an approximation, not a Bessel in the true sense. IMHO, this should have been specified by both analog.com and whatever other sources mention it -- that it's an approximation, a good one, but not Bessel(-Thomson).
And, since the wound goes deep, here's some more. It strikes me that, the larger the order, the more the p.p. method converges towards a Gaussian filter, and, sure enough, here's a plot of a reference Gaussian function, \$\exp\left(-\frac{\ln2}{2}x^2\right)\$ (black), an approximated transfer function "a la Bessel" (blue, frequency scaling needed), and a free, non-frequency scaled version of the p.p. method (red):
If you can't see any difference it's because of the small things in life:
Does this mean that the p.p. method converges towards a Gaussian filter? No. Here are the poles of the Gaussian (blue) and the Bessel (red), scaled for comparison with the unit circle:
They're even more spread out. But are they, at least, placed on a circle, displaced or not? Here are the results of the radii of the circles, in a method similar to one for Bessel, above:
[2.216482009751927,2.067108140129843,1.988005879596608,1.939764970045811,1.910091446017072]
and the graphical representation, after another forced attempt at projecting them onto the unit circle (again, same way as Bessel, above):
For completeness, here's the impulse response of the Gaussian (blue), compared to the Bessel (red), and the p.p. method (green):
Given this final proof, the p.p. method is even better in terms of time-domain compared to Gaussian, but that also means that, frequency-wise, it's a mess. However, I am thinking of another approach, but that will be for another day.