Electrical – How to detect the fundamental frequency of a human singing voice signal

audio dsp fft

I want to make a digital hardware chip that can input a human voice, (singing simple wordless long notes) and output a square wave that is the same frequency as the fundamental frequency of the voice. I want it to work in real time.

I would like to put the chip on my own PCB eventually. But I could start out by testing the concept on some kind of development environment that has the same chip that I would be using.

Can anyone tell me the best way to approach this problem? For example what type of algorithm would be best, and what chip would be good to use.

I read that autocorrelation and FFT are both used for this kind of thing, but I don't know which would be best.

I was thinking of using the STM32F427, because it seems to be very popular and is available in all sorts of hobby related development environments.

I don't expect anyone to tell me exactly how to do it in every detail. I just need pointing in the right direction.

Best Answer

This might fit far better on signals.stackexchange.com if you rephrased it as a signal processing question.

Anyway, don't start with "I want to make a dedicated chip". Start with "I want to understand how something like that can be done", and only then pick the tools and the implementation.

However, the question "how to best detect the pitch of human singing" is a very complicated one and far from easy to answer. Even from a purely music-theoretical point of view, a voice doesn't have one fundamental frequency, unless it is sung for the effect of producing the perfect tone.

You can, of course, try to detect the dominant tone in a song, and that's a pretty common question on signals.stackexchange.com, so I can only encourage you to search there. It remains an open question, though, whether what your algorithm detects as dominant is what a human would perceive as the sung tone: humans are far from uniform, and that doesn't stop at the perception of music.

A small Cortex-M4F like the STM microcontroller you mention might be suitable for many of the algorithms that you will have to take into consideration, but many others won't work on it.

So, one of the important rules of engineering applies: First understand your problem, then pick the tools. That applies to things like the FFT just as much as to the compute platform you'll be running this on.

Any reasonable approach for this will consist of first designing the DSP on a PC-style computer, trying it against recorded digital signals (that is, audio files), refining it, and then porting it to whatever platform you choose to put on your PCB.
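As a minimal sketch of what such a PC prototype could look like (not a recommendation of one particular algorithm), the following Python snippet estimates pitch by picking the strongest autocorrelation lag. The synthetic test signal, the 8 kHz sample rate, the 80 to 1000 Hz search range and the names `synth_voice` and `estimate_f0` are all assumptions made for illustration; a real prototype would read recorded audio files instead.

```python
import math

def synth_voice(f0, fs, n):
    """Hypothetical test tone: three harmonics, the second one strongest."""
    amps = [0.5, 1.0, 0.3]  # assumed harmonic amplitudes, not measured data
    return [
        sum(a * math.sin(2 * math.pi * f0 * (k + 1) * i / fs)
            for k, a in enumerate(amps))
        for i in range(n)
    ]

def estimate_f0(x, fs, fmin=80.0, fmax=1000.0):
    """Return fs / lag for the lag with the largest autocorrelation sum."""
    n = len(x)
    lag_min = int(fs / fmax)   # shortest period considered
    lag_max = int(fs / fmin)   # longest period considered
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Plain (unnormalized) autocorrelation at this lag
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return fs / best_lag

signal = synth_voice(200.0, 8000, 2048)
print(estimate_f0(signal, 8000))  # prints 200.0
```

In this test signal the 400 Hz harmonic is stronger than the 200 Hz fundamental, so simply picking the largest FFT peak would report 400 Hz, whereas the autocorrelation peak sits at the 40-sample period. Real voices are far messier than this, which is exactly why you want to try candidate algorithms against recordings first.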

That's one of the strengths of DSP: it's really just math. You can do it on a PC just as well as you can do it on a microcontroller, as long as the math you do is the same.
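To illustrate that point, here is a small sketch, under assumed parameters, that runs the same autocorrelation peak search once on floating-point samples and once on the same samples quantized to signed 16-bit integers, roughly as they might arrive from an ADC. The constants `FS`, `F0`, `N` and the helper `best_lag` are hypothetical names chosen for this example.

```python
import math

FS = 8000    # assumed sample rate in Hz
F0 = 200.0   # assumed test pitch in Hz
N = 1024     # assumed analysis window length in samples

def best_lag(x, lag_min, lag_max):
    """Return the lag in [lag_min, lag_max] with the largest autocorrelation sum."""
    n = len(x)
    return max(
        range(lag_min, lag_max + 1),
        key=lambda lag: sum(x[i] * x[i + lag] for i in range(n - lag)),
    )

# "PC" version: double-precision floating-point samples
x_float = [math.sin(2 * math.pi * F0 * i / FS) for i in range(N)]

# "Microcontroller" version: the same samples as signed 16-bit integers;
# every product and sum below is then pure integer arithmetic.
x_int = [round(32767 * s) for s in x_float]

lag_float = best_lag(x_float, 8, 100)
lag_int = best_lag(x_int, 8, 100)
print(lag_float, lag_int, FS / lag_int)
```

Both runs should land on the same lag (40 samples, i.e. 200 Hz at this sample rate), which is the sense in which the algorithm can be worked out on a PC and ported afterwards. What changes on the target is word length, memory and speed, not the math.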