The difference between the filters you name is not that each newly invented one is a closer approximation to the ideal filter, but that each one optimizes a different characteristic. Because these characteristics trade off against each other, each design makes that trade-off in a different way.
As Andy said, the Butterworth filter has maximal flatness in the passband. The Chebychev (type I) filter has the fastest roll-off between the passband and stop-band of any filter of the same order with a monotonic stop-band, at the cost of ripple in the passband.
The Elliptic filter (Cauer filter) parameterizes the balance between pass-band and stop-band ripple, with the fastest possible roll-off given the chosen ripple characteristics.
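To make the trade-off concrete, here is a minimal sketch that evaluates the textbook magnitude formulas for a normalized 5th-order Butterworth and a 5th-order Chebychev type I low-pass. The 1 dB ripple value and the frequency points are my own assumptions for illustration, and the two designs are normalized differently (Butterworth is -3 dB at w = 1, Chebychev leaves its ripple band at w = 1), so treat the numbers as indicative only.

```python
import math

def butterworth_gain(w, order):
    """Magnitude of a normalized Butterworth low-pass (-3 dB at w = 1)."""
    return 1.0 / math.sqrt(1.0 + w ** (2 * order))

def chebyshev_poly(n, x):
    """Chebyshev polynomial T_n(x) for x >= 0."""
    if x <= 1.0:
        return math.cos(n * math.acos(x))
    return math.cosh(n * math.acosh(x))

def chebyshev1_gain(w, order, ripple_db):
    """Magnitude of a normalized Chebychev type I low-pass (ripple band ends at w = 1)."""
    eps2 = 10.0 ** (ripple_db / 10.0) - 1.0
    return 1.0 / math.sqrt(1.0 + eps2 * chebyshev_poly(order, w) ** 2)

def to_db(gain):
    """Convert a voltage ratio to decibels."""
    return 20.0 * math.log10(gain)

ORDER = 5
RIPPLE_DB = 1.0  # assumed passband ripple, chosen only for illustration

for w in (0.5, 0.9, 1.0, 2.0, 4.0):
    butter_db = to_db(butterworth_gain(w, ORDER))
    cheby_db = to_db(chebyshev1_gain(w, ORDER, RIPPLE_DB))
    print(f"w = {w}: Butterworth {butter_db:.2f} dB, Chebychev {cheby_db:.2f} dB")
```

In the stop-band the Chebychev response falls faster, while inside the passband it wanders between 0 dB and -1 dB instead of staying flat.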
Now if I was to take my 5th order structure and was able to simulate for every possible inductor value and capacitor value would I find a combination that would give me the best possible / closest model to ideal, that beats all previously known filter types?
It depends what you mean by "best possible" or "closest model". If you mean the one with the flattest response in the pass-band, you'd end up with the Butterworth filter. If you mean the best possible roll-off given a fixed ripple in the pass-band, you'd end up with the Chebychev design, etc.
If you chose some other criterion to optimize (like mean-square error between the filter characteristic and the boxcar ideal, for example), you could end up with a different design.
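As a rough sketch of what such a criterion looks like in practice, the snippet below measures the mean-square error between a normalized Butterworth magnitude response and a boxcar ideal. The frequency range (0 to 3 times the cutoff), the midpoint sampling, and the function names are all my own assumptions; a real optimizer would search over component values instead of just scoring a known design.

```python
import math

def boxcar(w, cutoff=1.0):
    """Ideal brick-wall magnitude: 1 in the passband, 0 in the stop-band."""
    return 1.0 if w <= cutoff else 0.0

def butterworth_gain(w, order):
    """Magnitude of a normalized Butterworth low-pass (-3 dB at w = 1)."""
    return 1.0 / math.sqrt(1.0 + w ** (2 * order))

def mean_square_error(gain_fn, w_max=3.0, steps=3000):
    """Midpoint-rule average of (gain - ideal)^2 over 0..w_max."""
    dw = w_max / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * dw
        total += (gain_fn(w) - boxcar(w)) ** 2
    return total / steps

# Score Butterworth designs of order 1 to 5 against the boxcar ideal.
errors = {n: mean_square_error(lambda w, n=n: butterworth_gain(w, n)) for n in range(1, 6)}
for n, err in errors.items():
    print(f"order {n}: mean-square error {err:.4f}")
```

The score depends heavily on the chosen frequency range and weighting, which is part of why this criterion is awkward to use as a design target.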
Do mathematicians / engineers know of a "best" filter response that is physically possible for a given order but so far do not know how to create it.
The filters you named (Butterworth, Chebychev, Cauer) are the best, for the different definitions of "best" that define those filters.
If you had some other definition of "best" in mind, you could certainly design a filter to optimize that, with existing technology. Andy's answer names a couple of other criteria and the filters that optimize them, for example.
Let me add one other question you might ask as a follow-up:
Why don't we in practice design filters to optimize the mean-square error between the filter characteristic and the boxcar ideal?
Probably because the mean-square error doesn't capture well the design impact of "errors" in the pass-band and stop-band response. Because the ideal response has 0 magnitude in the stop-band, it's hard to define a "relative response" measurement that has equal weight in both regions.
For example, in some designs a stop-band response of -40 dB (0.01 V/V, where the ideal is 0 V/V) would be much worse than an error of 0.01 V/V in the passband.
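A quick sketch of why the same 0.01 V/V error reads so differently in the two regions, using the ordinary 20*log10 voltage-ratio conversion (the variable names are just for illustration):

```python
import math

def to_db(ratio):
    """Convert a voltage ratio to decibels."""
    return 20.0 * math.log10(ratio)

stopband_leak = 0.01        # V/V where the ideal response is 0 V/V
passband_gain = 1.0 - 0.01  # V/V where the ideal response is 1 V/V

print(f"stop-band: {to_db(stopband_leak):.2f} dB of attenuation")
print(f"pass-band: {to_db(passband_gain):.3f} dB of deviation")
```

In the passband the error is a deviation of less than a tenth of a dB, while in the stop-band the same absolute error means the attenuation is only 40 dB. A plain mean-square error treats the two as equal.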
If you need a bandwidth of B = 440 - 220 = 220 Hz, the center frequency (the geometric mean of the two band edges) will be approximately Fo = 311 Hz. As a consequence, the required quality factor of your bandpass will be Q = 311/220 = 1.4.
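The arithmetic can be checked in a few lines. This assumes the center frequency is taken as the geometric mean of the two band edges, which is the usual convention for a bandpass:

```python
import math

f_low, f_high = 220.0, 440.0           # band edges in Hz
bandwidth = f_high - f_low             # B = 220 Hz
f_center = math.sqrt(f_low * f_high)   # geometric mean of the edges
q_factor = f_center / bandwidth        # Q = Fo / B

print(f"Fo = {f_center:.1f} Hz, Q = {q_factor:.2f}")
```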
Please note that it is NOT possible to realize a bandpass with such a selectivity based on the mentioned approach (lowpass-highpass series, or vice versa).
Therefore, you need either an RLC bandpass configuration or an active RC bandpass topology.
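Here is a minimal sketch of why the cascade falls short. I am assuming the "lowpass-highpass series" means one first-order RC high-pass and one first-order RC low-pass with an ideal buffer between them; the time-constant names are mine. The combined denominator is tau_hp*tau_lp*s^2 + (tau_hp + tau_lp)*s + 1, so Q = sqrt(tau_hp*tau_lp) / (tau_hp + tau_lp).

```python
import math

def cascade_q(tau_hp, tau_lp):
    """Q of a buffered first-order high-pass followed by a first-order low-pass."""
    return math.sqrt(tau_hp * tau_lp) / (tau_hp + tau_lp)

# Sweep the ratio of the two time constants over six decades.
ratios = [10 ** (k / 10) for k in range(-30, 31)]
best_q = max(cascade_q(1.0, r) for r in ratios)

print(f"best Q over the sweep: {best_q:.3f}")
print("required Q: about 1.4")
```

Under these assumptions the Q never exceeds 0.5 (reached when both time constants are equal), and an unbuffered passive cascade does worse still because the stages load each other.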
Best Answer
I'm going to list a bunch of "filters that don't overshoot". I hope you'll find this partial answer better than no answer at all. Hopefully people looking for "a filter that doesn't overshoot" will find this list helpful. Perhaps one of these filters will work adequately in your application, even if we haven't found the mathematically optimum filter yet.
first and second order LTI causal filters
The step response of a first order filter ("RC filter") never overshoots.
The step response of a second-order filter ("biquad") can be designed so that it never overshoots. For a second-order low-pass, there are several equivalent ways of describing this class: the filter is critically damped or overdamped, its damping ratio is at least 1, its Q is at most 1/2, or both of its poles are real.
In particular, a unity gain Sallen–Key filter topology with equal capacitors and equal resistors is critically damped: Q = 1/2, and therefore does not overshoot on a step input.
A second-order Bessel filter is slightly underdamped: Q = 1/sqrt(3), so it has a little overshoot.
A second-order Butterworth filter is more underdamped: Q = 1/sqrt(2), so it has more overshoot.
Out of all possible first-order and second-order LTI filters that are causal and do not overshoot, the ones with the "best" (steepest) frequency response are the "critically damped" second-order filters.
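To put numbers on the Q values quoted above, here is a small sketch using the standard overshoot formula for a second-order low-pass with no zeros: the damping ratio is zeta = 1/(2Q), and for zeta < 1 the fractional overshoot is exp(-pi*zeta/sqrt(1 - zeta^2)). The function name is mine.

```python
import math

def step_overshoot(q):
    """Fractional step overshoot of an all-pole second-order low-pass with quality factor q."""
    zeta = 1.0 / (2.0 * q)  # damping ratio
    if zeta >= 1.0:         # critically damped or overdamped: monotonic step response
        return 0.0
    return math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta * zeta))

designs = {
    "critically damped (Q = 1/2)": 0.5,
    "Bessel (Q = 1/sqrt(3))": 1.0 / math.sqrt(3.0),
    "Butterworth (Q = 1/sqrt(2))": 1.0 / math.sqrt(2.0),
}
for name, q in designs.items():
    print(f"{name}: {100.0 * step_overshoot(q):.2f} % overshoot")
```

The Bessel overshoot comes out well under one percent and the Butterworth overshoot at a few percent, which matches the "a little" versus "more" ordering in the list.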
higher-order LTI causal filters
The most commonly-used higher-order causal filter that has an impulse response that is never negative (and therefore never overshoots on a step input) is the "running average filter", also called the "boxcar filter" or the "moving average filter".
Some people like to run data through one boxcar filter, and the output from that filter into another boxcar filter. After a few such filters, the result is a good approximation of the Gaussian filter. (The more filters you cascade, the closer the final output approximates a Gaussian, no matter what filter you start with -- boxcar, triangle, first-order RC, or any other -- because of the central limit theorem).
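A minimal sketch of the cascading idea, in plain Python. The window width (5 samples) and the number of stages (4) are arbitrary choices of mine. Convolving the boxcar kernel with itself gives the equivalent single kernel of the cascade, and its running sum is the step response.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def cascaded_boxcar(width, stages):
    """Equivalent kernel of `stages` identical moving-average filters in series."""
    box = [1.0 / width] * width
    kernel = box
    for _ in range(stages - 1):
        kernel = convolve(kernel, box)
    return kernel

def step_response(kernel):
    """Response to a unit step: the running sum of the kernel taps."""
    total, out = 0.0, []
    for tap in kernel:
        total += tap
        out.append(total)
    return out

kernel = cascaded_boxcar(width=5, stages=4)
response = step_response(kernel)
print([round(tap, 4) for tap in kernel])     # bell-shaped, all taps non-negative
print([round(val, 4) for val in response])   # rises monotonically toward 1
```

Because every tap is non-negative, the running sum can only rise toward its final value, so the cascade inherits the no-overshoot property of the single boxcar.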
Practically all window functions are never negative, and so in principle can be used as FIR filter kernels that never overshoot on a step input. In particular, I hear good things about the Lanczos window, which is the central (positive) lobe of the sinc() function (and zero outside that lobe). A few pulse shaping filters also have an impulse response that is never negative, and so can likewise be used as filters that never overshoot on a step input.
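As an illustration, here is a sketch that builds a Lanczos-window kernel and runs a step through it. The tap count (21), the normalization to unity DC gain, and the helper names are my own choices, not a recommendation for any particular application.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def lanczos_kernel(taps):
    """Central lobe of sinc sampled at `taps` points, scaled to unity DC gain."""
    raw = [sinc(2.0 * n / (taps - 1) - 1.0) for n in range(taps)]
    total = sum(raw)
    return [value / total for value in raw]

def fir_filter(signal, kernel):
    """Causal FIR filter; samples before the start of the signal are taken as zero."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, tap in enumerate(kernel):
            if i - j >= 0:
                acc += tap * signal[i - j]
        out.append(acc)
    return out

step = [0.0] * 10 + [1.0] * 40
kernel = lanczos_kernel(21)
output = fir_filter(step, kernel)
print(round(max(output), 6))  # peak of the step response
```

The output climbs smoothly from 0 to 1 and stays there, since no tap is negative.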
I don't know which of these filters is the best for your application, and I suspect the mathematically optimum filter may be slightly better than any of them.
non-linear causal filters
The median filter is a popular non-linear filter that never overshoots on a step-function input.
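A minimal sliding-median sketch, assuming an odd window of 5 samples and edge padding by repeating the end samples (both are my choices). Each output sample is one of the input samples in its window, so the output can never go beyond the range of the input.

```python
from statistics import median

def median_filter(signal, window=5):
    """Sliding median with an odd window; edges are padded by repeating the end samples."""
    half = window // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(signal))]

step = [0.0] * 8 + [1.0] * 8
spiky = list(step)
spiky[3] = 5.0  # a single outlier before the step

print(median_filter(step))   # the clean step passes through unchanged
print(median_filter(spiky))  # the outlier is removed, and the step edge is kept
```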
EDIT: LTI noncausal filters
The function sech(t) = 2/( e^(-t) + e^t ) is its own Fourier transform (up to a scaling of its argument that depends on the transform convention), and I suppose could be used as a kind of non-causal low-pass LTI filter that never overshoots on a step input.
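Here is a rough numerical check of the self-transform property. I am assuming the unitary angular-frequency convention, F(w) = (1/sqrt(2*pi)) * integral of f(t)*exp(-i*w*t) dt, under which the self-dual form is sech(a*t) with a = sqrt(pi/2); other conventions need a different scaling. The integration limits and step size are arbitrary choices of mine.

```python
import math

A = math.sqrt(math.pi / 2.0)  # argument scaling that makes sech self-dual in this convention

def sech(x):
    """Hyperbolic secant."""
    return 1.0 / math.cosh(x)

def fourier_transform(func, omega, t_max=40.0, dt=0.01):
    """Unitary transform of an even real function, by the trapezoid rule."""
    steps = int(round(2 * t_max / dt))
    total = 0.0
    for i in range(steps + 1):
        t = -t_max + i * dt
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(t) * math.cos(omega * t)  # the sine part cancels for even func
    return total * dt / math.sqrt(2.0 * math.pi)

for omega in (0.0, 0.5, 1.0, 2.0):
    transformed = fourier_transform(lambda t: sech(A * t), omega)
    print(f"w = {omega}: transform {transformed:.6f}, sech(A*w) {sech(A * omega):.6f}")
```

Since sech is positive everywhere, using it as an impulse response gives a monotonic step response, which is the no-overshoot property in question.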
The non-causal LTI filter that has the (sinc(t/k))^2 impulse response has an "abs(k)*triangle(k*w)" frequency response. When given a step input, it has a lot of time-domain ripple, but it never overshoots the final settling point. Above the high-frequency corner of that triangle, it gives perfect stop-band rejection (infinite attenuation). So in the stop-band region, it has a better frequency response than a Gaussian filter.
Therefore I doubt the Gaussian filter gives the "optimal frequency response".
In the set of all possible "filters that don't overshoot", I suspect there is no one single "optimal frequency response" -- some have better stop-band rejection, while others have narrower transition bands, etc.