Electronic – Why an 8kHz notch on the headset mic input of a Qualcomm WCD9330

adc audio filter microphone smartphones

I have here a smartphone which uses a Qualcomm WCD9330 audio codec IC. (Itʼs a Galaxy Note 4.) Iʼve discovered through testing that the headset microphone input exhibits a roughly 10dB notch in its frequency response at about 8kHz, and, as well as being concerned, Iʼm also curious as to what are possible engineering explanations for this, intentional or otherwise. (In part, Iʼm also trying to figure out whether this is a characteristic of this model, or if the phone has been damaged.)

Rewinding a bit, once I knew the IC involved, I of course grabbed its device specification (this actually turned out to be for a different but very similar device — canʼt find the spec for the WCD9330, but even the WCD9335 is similar). At this point, all I knew was, recorded sound was muffled, like itʼd gone through a low‑pass filter. The ADCs have sample rates of 8, 16, 32, 48, 96, and 192 kHz, so it occurred to me, particularly when viewing the 16kHz frequency response chart on page 30, that maybe the phoneʼs firmware has the hardware sample rate stuck at 16kHz (even though the recording application asked for and was being fed 44.1kHz). I eventually decided to test out this theory physically.

I have access to a sweep generator, oscilloscope, etc., but not having them handy at the apartment, I went with a more ghetto approach. I created a WAV file on my PC with 28 pure sine tones, 0dBFS, 500ms each, on 1kHz intervals from 1kHz to 20kHz and 500Hz intervals in the area of interest where I figured the roll‑off was. I just wired the PC sound interfaceʼs output straight to the microphone leads on a 3.5mm TRRS plug — I know, terrible! It did basically work, though. I used a third-party recording app on the phone to capture raw 44.1kHz PCM into a WAV file so that there shouldnʼt be any interference by a lossy data compressor.
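For anyone wanting to repeat the test, a stepped-tone file like the one described can be generated with a few lines of Python. This is only a sketch: the 44.1kHz playback rate, the 16-bit mono format, the file name, and the exact placement of the eight half-kHz points (6.5kHz to 13.5kHz here) are assumptions, chosen so that the total comes to 28 tones.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100   # assumed playback rate of the PC sound interface
TONE_SECONDS = 0.5    # 500 ms per tone, as in the test described above


def tone_frequencies(fine_start=6000, fine_stop=14000):
    """1 kHz steps from 1 to 20 kHz, plus half-kHz points in a region of interest.

    The bounds of the fine region are assumptions; they are picked so the
    list holds 28 tones in total.
    """
    coarse = set(range(1000, 20001, 1000))
    fine = set(range(fine_start + 500, fine_stop, 1000))
    return sorted(coarse | fine)


def tone_samples(freq_hz, amplitude=1.0):
    """One tone as 16-bit signed integers; amplitude 1.0 corresponds to 0 dBFS."""
    n = int(SAMPLE_RATE * TONE_SECONDS)
    peak = 32767 * amplitude
    return [int(round(peak * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)))
            for i in range(n)]


def write_test_file(path, freqs):
    """Write the tones back to back into a mono 16-bit WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        for f in freqs:
            samples = tone_samples(f)
            wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


if __name__ == "__main__":
    write_test_file("tone_steps.wav", tone_frequencies())
```

Passing a smaller `amplitude` to `tone_samples` would be a gentler way of approaching a mic-level input than relying on the PC volume control alone.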

Initially I turned the PCʼs software volume control all the way down, not wanting to swamp a mic‑level input, but ultimately I found I had to set it fairly high, around 75%, to get a signal that got anywhere near 0dBFS in the recording, and there was still no clipping. This may also be part of the problem, and is consistent with the observation that headset recordings on this phone come out with a rather low signal level.

I pulled the resulting file into an audio program on the PC that can do RMS power and FFT analysis on selected regions. The result is not quite what I was expecting.
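The per-tone measurement can also be done without a dedicated audio program. The sketch below is not what that program does internally (I don't know that); it is a plain single-bin DFT plus an RMS calculation, assuming the recorded segment has already been loaded as floats in the range -1.0 to 1.0. The function names are made up for illustration.

```python
import math


def tone_level_dbfs(samples, freq_hz, sample_rate):
    """Level of one frequency component in dBFS (full-scale sine = 0 dBFS).

    The segment is correlated against a complex exponential at freq_hz,
    which is the same thing as evaluating a single DFT bin.
    """
    n = len(samples)
    re = im = 0.0
    for i, x in enumerate(samples):
        phase = 2 * math.pi * freq_hz * i / sample_rate
        re += x * math.cos(phase)
        im -= x * math.sin(phase)
    amplitude = 2 * math.hypot(re, im) / n
    return 20 * math.log10(max(amplitude, 1e-12))


def rms_dbfs(samples):
    """Total RMS level relative to a full-scale sine (whose RMS is 1/sqrt(2))."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20 * math.log10(max(rms * math.sqrt(2), 1e-12))
```

With harmonic distortion present, `rms_dbfs` of a segment should come out at or above `tone_level_dbfs` for the principal, since it also counts the energy in the harmonics.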

Well, first of all, it was full of harmonic distortion. I blame that on a combination of fairly cheap input and output circuits and the hideous way I had them connected. I should probably have used a load resistor to match impedances correctly, and maybe also a DC blocking capacitor. (Mind you, on the latter point, I didnʼt see any significant DC bias in the result.) Not to mention, I didnʼt consider whether the phone powers the microphone through those same wires. But I was anxious and lazy. And the main tones were still more or less clearly discernible on FFTs in spite of also having strong harmonics, so I figured the results would still mean something, and it looks like they did.

When I saw a dip towards 8kHz I figured, hey, I was right, itʼs sampling at 16kHz. But then the response curve mystified me by coming back up. From 14kHz through 20kHz, the response is back at its maximum! All the way up to 20kHz, the principal frequency is the strongest peak, by a clear margin; so per Nyquist, the phone must be using 48, 96, or 192kHz sampling after all!
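The Nyquist argument can be spelled out numerically. This sketch assumes ideal sampling with no anti-alias filter at all (with a proper filter, tones above half the sample rate would simply vanish rather than fold back); either way, they could not show up at their own frequency. The function names are illustrative only.

```python
# ADC sample rates listed in the device specification, in Hz.
ADC_RATES_HZ = (8000, 16000, 32000, 48000, 96000, 192000)


def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after ideal, unfiltered sampling."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def rates_that_preserve(tone_hz):
    """Sample rates whose Nyquist frequency lies strictly above the tone."""
    return [r for r in ADC_RATES_HZ if tone_hz < r / 2]


print(apparent_frequency(14000, 16000))  # 2000: a 14 kHz tone would fold down to 2 kHz
print(rates_that_preserve(20000))        # [48000, 96000, 192000]
```

Since the 20kHz tone was recorded as a clean 20kHz peak, only the last three rates remain as candidates.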

Thatʼs where I really start scratching my head. Itʼs like itʼd been put through an 8kHz notch filter… but why?? Hereʼs the results:

Frequency Response Plot

(The blue plot is rounded to the nearest dB, thus the bumps. Technically, the total RMS should probably have been higher than the principal, but these were measured using different tools.)

Iʼm pretty sure this is not an artefact of my sloppy measuring setup, because this very much mirrors what I was seeing and hearing in recordings made from the Samsung headsets: muted sibilants, and power spectra that took a dip around 8kHz. It also confirmed my observation that the exact same headset in another model Samsung phone (Note II) didnʼt produce this muffled sound; itʼs now doubly clear that itʼs something in the phone itself.

I wish I had a second identical unit to test this on to rule out something actually broken in the phone, but I donʼt. Iʼm not even sure how something would break in such a targeted way!

Iʼm struggling to understand what reasonable explanations, either intentional or otherwise, might exist for this very specific notch.

Maybe itʼs a design flaw in the WCD9330, but if you ask me, a two‑octave 10dB notch at 8kHz would be such a remarkably embarrassing flaw that I canʼt possibly see a company half as reputable as Qualcomm releasing such a thing. Itʼs so huge I was able to characterize it with the engineering equivalent of a sundial!

Now, Iʼm aware that ADCs can produce strong aliasing artefacts where the input has any components above the Nyquist frequency, and therefore have to have an analog low‑pass filter ahead of them. Itʼs occurred to me that maybe some firmware dev forgot to set the AAF cutoff frequency appropriately for the sample rate being used, but this explanation seems, well, a little strange, for two reasons. (i) Arbitrarily high frequencies can produce aliasing, so an AAF should be a low‑pass, and I suspect theyʼre simpler than notches, too; and (ii) I canʼt think of any use case for setting the AAF independently of the sample rate, so Iʼd kinda-sorta assume the chipset would lock the two together at the hardware level, although I havenʼt confirmed this!

Another dimension to this problem is that the signal level from the headset mic is really low, and I suspect this is probably due to a setting by the firmware (thus an Android SE question). I donʼt know if a really low gain, or attenuation, on the input amp could affect the frequency response of the input in this manner. Is that possible or likely? (If so, why?)

The WCD93xx also have a pair of five-stage digital IIR filters, which could of course be configured to produce the notch, but Iʼm under the impression phones normally use the IIR for sidetone, not for muffling the microphone!  XD
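To show that a single second-order IIR stage is enough to produce a dip of this shape, here is a sketch using the peaking-EQ biquad from the widely used Audio EQ Cookbook formulas with a negative gain. Nothing here reflects the WCD93xx register format or its actual configuration; the 48kHz rate and the Q of 0.7 (roughly a two-octave bandwidth) are assumptions.

```python
import cmath
import math


def peaking_biquad(center_hz, gain_db, q, sample_rate_hz):
    """Coefficients (b, a) of a peaking-EQ biquad; negative gain_db gives a dip."""
    a_lin = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin)
    a = (1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin)
    return b, a


def response_db(b, a, freq_hz, sample_rate_hz):
    """Magnitude response of the biquad at freq_hz, in dB."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq_hz / sample_rate_hz)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))


if __name__ == "__main__":
    b, a = peaking_biquad(8000, -10.0, 0.7, 48000)  # assumed parameters
    for f in (1000, 4000, 8000, 14000, 20000):
        print(f, round(response_db(b, a, f, 48000), 2))
```

The response is -10dB at the centre and recovers to within a fraction of a dB at 1kHz and 20kHz, which is at least qualitatively like the measured curve.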

I donʼt know for sure if the phone has any circuitry ahead of the WCD9330, but from what I can glean, it probably doesnʼt have much. I was able to dig up the service manual for a closely related model that is basically the same except for which UMTS and LTE bands are supported. Itʼs for the SM‑N910F rather than the SM‑N910W8 which I have here. This is what it shows for the headset circuit:

Samsung SM-N910F Headset Audio Circuit

The big IC is of course the WCD9330, as best as I can tell (the manual is missing the part manifest!). The manual doesnʼt show where EAROUT_L, EAROUT_R, HPH_REF, EAR_MIC_P, and EAR_MIC_N go, but Iʼm more or less assuming these go directly (via a separate PCB) to the 3.5mm jack. This seems to be also suggested by the relevant part of the “typical application” diagram of the very similar WCD9311:

Qualcomm WCD9311 Typical Application — Headset Audio

One thing that confuses me, however, is that HPH_REF and EAR_MIC_N are shown as separate lines and even have separate resistors, but in fact the phoneʼs headset uses a TRRS (four conductor) plug, on which the ground ring is shared between the headphones and headset mic.  Hmmmmm. So whatʼs going on, there?

Anyhow, my main question is, what are possible/likely explanations for this 8kHz notch?

EDIT:  I havenʼt objectively tested, but by ear, this loss of fidelity does not seem to impact recordings made using the internal mics.  It seems to only be on the headset port.  Given the explanation in Brian Drummondʼs answer, this further makes it look like a firmware oversight.

Best Answer

The key word here is "phone" ... when telephone systems (wired) were first digitized, the sample rate was standardised at 8 kHz, and on reconstruction, an analog filter cut off high frequencies above 3.4 kHz, slightly below the Nyquist frequency (fs/2 or 4 kHz).

An unfortunate consequence is that the spectrum of older digitised phone systems (i.e. most landlines), absent that 3.4 kHz filter, contains a lot of unwanted content around 8 kHz (aliasing, as Andy says), and making a call to such a system from a modern phone would be quite an unpleasant experience.
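As a rough illustration of where that unwanted content sits, standard sampling theory says that an 8 kHz sampled signal reconstructed without its low-pass filter repeats each baseband component at k·fs ± f. The sketch below just lists those image frequencies; it assumes idealised reconstruction and says nothing about how any particular phone network behaves.

```python
def image_frequencies(baseband_hz, sample_rate_hz=8000, max_hz=20000):
    """Spectral images of one baseband component, up to max_hz, in ascending order."""
    images = []
    k = 1
    while k * sample_rate_hz - baseband_hz <= max_hz:
        images.append(k * sample_rate_hz - baseband_hz)  # lower image around k*fs
        upper = k * sample_rate_hz + baseband_hz          # upper image around k*fs
        if upper <= max_hz:
            images.append(upper)
        k += 1
    return images


print(image_frequencies(1000))  # [7000, 9000, 15000, 17000]
```

For the 300 Hz to 3.4 kHz voice band, the first pair of images covers 4.6 to 7.7 kHz and 8.3 to 11.4 kHz, i.e. the region centred on 8 kHz.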

A fairly broad 10dB notch at that frequency looks like an unhappy compromise between fidelity on modern systems and excruciatingly nasty sound on older ones.

It would be possible to turn off that notch when playing music, or maybe on WhatsApp etc. where the source of the data is confidently known to have a higher sample rate; but that might well not be possible for a random phone call to/from an unknown phone system.