Electronic – Avoiding echo/feedback on speaker-phones, how

audio, bidirectional, dsp

Even a $20 mobile phone with a speakerphone has no feedback problems. I understand that companies like Mediatek rely on enormous volumes to bring mobile chipset prices so low, but from reading some articles I get the impression that the circuitry needed to suppress feedback in a speakerphone, where the speaker and mic sit very close together, is fairly complex (it involves a powerful DSP) and expensive. Am I missing something very basic here? Are there environmental constraints in the mobile-phone case that simplify the design of such circuitry and thereby keep costs low?

I am approaching this from the study of an el-cheapo baby monitor with a 2-way "talk" feature, in which the feedback problem is terrible. I've tried several things, such as replacing the electret microphone and changing the speaker type, to no avail. The audio codec used in this device is apparently an ALC part, though much of the chip's surface is etched away. I do know that the processor is a Winbond ARM7: it had a shiny sticker on top, which I managed to scratch off to reveal the part number.

Best Answer

The audio processing algorithm you are interested in is called "Acoustic Echo Cancellation", or AEC. It is most commonly used in speakerphones to remove the output of the speaker from the mic signal. Most of the benefit goes to the person on the other end of the phone call, since they won't be hearing echoes of themselves.

Some cheap and not-so-cheap speakerphones don't use AEC. I have a Polycom speakerphone which is "half duplex", meaning that when one side is talking, the other side is muted. Because of this, there is no chance for echoes or feedback to happen. Unfortunately, this also allows for a "filibuster": if one side never shuts up, the other side can never interrupt.
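As a rough illustration of the half-duplex idea, here is a minimal voice-switching sketch. It is not how the Polycom unit works internally; the class name `HalfDuplexSwitch`, the level threshold, and the hangover count are all assumptions made up for the example. Levels are assumed to be per-frame loudness estimates (e.g. RMS) for each direction.

```python
class HalfDuplexSwitch:
    """Decide which single direction may pass audio (hypothetical sketch)."""

    def __init__(self, threshold=0.05, hangover_frames=3):
        self.threshold = threshold              # level that counts as "talking"
        self.hangover_frames = hangover_frames  # quiet frames before release
        self.talker = None                      # "local", "remote", or None
        self.quiet = 0                          # consecutive quiet frames

    def update(self, local_level, remote_level):
        levels = {"local": local_level, "remote": remote_level}
        if self.talker is not None:
            if levels[self.talker] >= self.threshold:
                self.quiet = 0                  # current talker keeps the floor
            else:
                self.quiet += 1
                if self.quiet >= self.hangover_frames:
                    self.talker = None          # floor released after a pause
        if self.talker is None:
            loudest = max(levels, key=levels.get)
            if levels[loudest] >= self.threshold:
                self.talker = loudest           # loudest active side takes over
                self.quiet = 0
        return self.talker                      # every other direction is muted
```

Only the direction returned by `update` is passed through; the other is muted. The "filibuster" falls straight out of the logic: as long as the current talker stays above the threshold, the other side never gets the floor, however loud it is.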

There are many types of AEC algorithms, and almost every type is patented. Most of them involve some form of modeling, where a model of the "speaker to mic acoustic signal path" is created. Once created, we can predict how the speaker output will be picked up by the mic, and thus remove that signal from the mic, leaving only the intended sounds in the mic signal.

This model would thus figure out how the sounds reflect off of the walls and other things in the room, etc. The patents for AEC usually center around exactly how this model is initially created and later updated as things change in the room (mic position, position of people and furniture, etc).
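To make the "room model" idea concrete, here is a minimal sketch of one common textbook approach, a normalized LMS (NLMS) adaptive filter. It is an illustration under assumptions, not what any particular product uses: the function name `nlms_echo_cancel`, the tap count, the step size `mu`, and the toy five-tap "room" in `demo` are all invented for the example, and it ignores double-talk (both sides speaking at once), which a real implementation has to handle.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=32, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of far_end from mic (NLMS sketch)."""
    w = [0.0] * taps   # current estimate of the speaker-to-mic impulse response
    x = [0.0] * taps   # most recent far-end (speaker) samples, newest first
    out = []
    for s, d in zip(far_end, mic):
        x = [s] + x[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, x))  # predicted echo at mic
        e = d - echo_est                                 # mic minus predicted echo
        norm = sum(xi * xi for xi in x) + eps            # input power (normalizer)
        g = mu * e / norm
        w = [wi + g * xi for wi, xi in zip(w, x)]        # nudge model toward room
        out.append(e)
    return out, w

def demo(n=4000, seed=0):
    """Toy test: white-noise speaker signal through a made-up 5-tap 'room'."""
    rng = random.Random(seed)
    far = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    room = [0.0, 0.6, 0.0, -0.3, 0.1]   # hypothetical speaker-to-mic response
    mic = [sum(room[k] * far[i - k] for k in range(len(room)) if i - k >= 0)
           for i in range(n)]
    out, _ = nlms_echo_cancel(far, mic, taps=8)
    power = lambda sig: sum(v * v for v in sig) / len(sig)
    return power(mic[-500:]), power(out[-500:])   # echo power before, after
```

Calling `demo()` returns the echo power at the mic before and after cancellation over the last 500 samples. In this idealized setup (no near-end talker, no noise, a fixed room) the filter learns the room response and the residual becomes tiny; real rooms have responses hundreds or thousands of taps long and change over time, which is where the updating problem described above comes in.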

In addition to the "room model", there are other noise-reduction algorithms used. While these algorithms are not technically part of AEC, there are no useful implementations of AEC that don't use these. Normally there is some sort of simple noise-gate (or a multi-band noise gate). Other algorithms are also typically used, but are either patented or treated as a "trade secret"-- which is why I can't tell you about them! :(
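For reference, a simple single-band noise gate can be sketched as below. This is only a generic illustration of the gate idea, not any vendor's algorithm; the function name `noise_gate`, the frame length, the threshold, and the floor gain are assumed values chosen for the example.

```python
import math

def noise_gate(samples, frame_len=160, threshold=0.02, floor_gain=0.1):
    """Attenuate frames whose RMS level is below threshold (hypothetical sketch)."""
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(v * v for v in frame) / len(frame))
        # Loud frames pass unchanged; quiet frames (e.g. leftover echo) are
        # turned down rather than cut to zero, which sounds less abrupt.
        gain = 1.0 if rms >= threshold else floor_gain
        out.extend(v * gain for v in frame)
    return out
```

Applied after the echo canceller, a gate like this pushes down whatever small residual the model failed to remove. A multi-band version would do the same thing separately per frequency band.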

Most AEC algorithms operate on a limited frequency range, 300 Hz to 3 kHz, which is the same frequency range as most telephones. Increasingly, wide-band AEC is becoming popular with the advent of higher-bandwidth teleconferencing/telepresence systems.

AEC algorithms are very computationally expensive, and wide-band AEC requires several times more horsepower than the narrow-band versions. It is not uncommon for a single "run-of-the-mill" DSP to be able to handle only 1 or 2 channels of AEC. For a high-quality wide-band AEC, a single high-powered DSP might be required for a single channel.
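A back-of-the-envelope estimate shows why wide-band costs several times more. The sketch below assumes a plain time-domain NLMS filter (roughly two multiply-accumulates per tap per sample, one for filtering and one for the update) and an echo tail of 128 ms; both are assumptions for illustration, and the helper name `nlms_macs_per_second` is made up. Real products often use frequency-domain or sub-band methods that cost less.

```python
def nlms_macs_per_second(sample_rate_hz, tail_ms):
    """Rough multiply-accumulate rate for a time-domain NLMS echo canceller."""
    taps = sample_rate_hz * tail_ms // 1000   # filter must span the echo tail
    return 2 * taps * sample_rate_hz          # filter + update, every sample

narrow = nlms_macs_per_second(8000, 128)      # telephone-band sampling rate
wide = nlms_macs_per_second(16000, 128)       # wide-band sampling rate
print(narrow, wide, wide / narrow)            # 16384000 65536000 4.0
```

Doubling the sample rate doubles both the number of taps needed to cover the same echo tail and the number of samples processed per second, so under these assumptions the cost grows by a factor of four.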

AEC algorithms are also very difficult to implement. In the entire USA, there are perhaps only 10 or 20 people who have the ability to write a good one. One very smart person that I know just wrote a wide-band AEC algorithm and it took him over a year!

For a 2-way baby monitor, I highly recommend using a half-duplex approach!