C++ – audio file processing in C++

binary, c, libraries

I thought of the following project that I want to pursue:

I want to create a C++ program that can play audio files and visualize the amplitudes of individual frequency bands in real time (e.g. with bars).

I searched on Google and found that SDL provides libraries that might do the trick. First I read about the format in which an audio file is saved (more specifically uncompressed .wav files, as they seem to be the easiest to read). However, even though I now know how to get the data out of the file, I don't know what it tells me: how many bytes do I have to read to get one sample, and how is a sample stored (I think it's not a direct representation of the frequency space)? So I really have no idea how to work with it and, as a direct consequence, I don't know how to play the files either.
Then I found SDL and figured it should be able to extract the information that I need. Looking through the SDL wiki, though, I was not able to find out how to proceed.

I have done some basic C++ programming in the past, and I think I'm at a high enough level to do this on the programming side. I have, however, not worked with binary files so far. I also have a strong maths background, so the maths involved is no problem at all.

The ultimate goal would probably be to do the frequency analysis of all audio output or input, but I read that this depends heavily on the operating system used. Also, applying filters in real time would be a nice thing to do, but this should be rather advanced. Hence just playing an audio file and visualizing would be the first goal.

So to sum up, I think my question is the following:

What does the data in an audio file tell me? How can I get it to a state such that I can do a frequency analysis? How can this be done in real time, while playing the audio file? Is SDL a good choice for this, or is there a better way without taking away all the work (I wouldn't like to just use a program that does everything as I need it, I want to do something of my own!)?

Any ideas and inputs are much appreciated!

PS: I'm not sure I'm on the right stackexchange here, but I really couldn't find out where else this would fit.

Best Answer

What does the data in an audio file tell me?

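The header tells you the sampling rate, the data format (compressed or raw, integer or float, how many bits per sample) and the number of channels (mono, stereo or more). The actual data describes the amplitude of the signal as a function of time, not its frequency content. For uncompressed PCM, one sample of one channel occupies bits-per-sample / 8 bytes, and the channels are interleaved (left, right, left, right, ... for stereo).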
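To make the header description concrete, here is a minimal sketch of reading those fields from a .wav file that has already been loaded into memory. It assumes an uncompressed RIFF/WAVE file; the names `WavInfo`, `parseWav` and `decodePcm16` are made up for this illustration and do not come from SDL or any other library.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Fields of the "fmt " chunk plus the location of the sample data.
// (Hypothetical struct for this sketch.)
struct WavInfo {
    std::uint16_t audioFormat = 0;   // 1 = integer PCM, 3 = IEEE float
    std::uint16_t channels = 0;
    std::uint32_t sampleRate = 0;
    std::uint16_t bitsPerSample = 0;
    std::size_t dataOffset = 0;      // byte offset of the first sample
    std::size_t dataSize = 0;        // number of bytes of sample data
};

// WAV stores multi-byte integers in little-endian byte order.
inline std::uint16_t readU16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}
inline std::uint32_t readU32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walks the RIFF chunks until it has seen "fmt " and then "data".
std::optional<WavInfo> parseWav(const std::vector<std::uint8_t>& bytes) {
    if (bytes.size() < 12 ||
        std::memcmp(bytes.data(), "RIFF", 4) != 0 ||
        std::memcmp(bytes.data() + 8, "WAVE", 4) != 0) {
        return std::nullopt;
    }
    WavInfo info;
    bool haveFmt = false;
    std::size_t pos = 12;  // first chunk starts after "RIFF" <size> "WAVE"
    while (pos + 8 <= bytes.size()) {
        const std::uint8_t* chunk = bytes.data() + pos;
        const std::uint32_t size = readU32(chunk + 4);
        const std::size_t body = pos + 8;
        if (std::memcmp(chunk, "fmt ", 4) == 0 && size >= 16 &&
            body + 16 <= bytes.size()) {
            info.audioFormat = readU16(chunk + 8);
            info.channels = readU16(chunk + 10);
            info.sampleRate = readU32(chunk + 12);
            // Bytes 16..21 hold the byte rate and block align; skipped here.
            info.bitsPerSample = readU16(chunk + 22);
            haveFmt = true;
        } else if (std::memcmp(chunk, "data", 4) == 0) {
            if (!haveFmt) return std::nullopt;
            info.dataOffset = body;
            info.dataSize = std::min<std::size_t>(size, bytes.size() - body);
            return info;
        }
        pos = body + size + (size & 1u);  // chunks are padded to even length
    }
    return std::nullopt;
}

// Converts 16-bit integer PCM to floats in [-1, 1). Channels stay interleaved.
std::vector<float> decodePcm16(const std::vector<std::uint8_t>& bytes,
                               const WavInfo& info) {
    std::vector<float> out;
    out.reserve(info.dataSize / 2);
    for (std::size_t i = 0; i + 1 < info.dataSize; i += 2) {
        const auto raw = static_cast<std::int16_t>(
            readU16(bytes.data() + info.dataOffset + i));
        out.push_back(raw / 32768.0f);
    }
    return out;
}
```

With the floats in hand, sample number n of channel c sits at index n * channels + c, and its time stamp is n / sampleRate seconds. Other bit depths (8-bit unsigned, 24-bit, 32-bit float) would need their own decode function.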

How can I get it to a state such that I can do a frequency analysis? How can this be done in real time, while playing the audio file?

Usually you launch an audio I/O background thread which repeatedly calls back your processing function whenever a new buffer of input audio is available (typically a few milliseconds' worth). In that function you process the audio and may call another function of the audio library to add a buffer to the audio output queue.
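The callback pattern can be sketched without committing to a particular library. In the snippet below, `AudioCallback`, `runFakeDevice` and `PeakMeter` are all hypothetical names invented for illustration: `runFakeDevice` is only a stand-in for the library's I/O thread and calls the callback synchronously, block by block, so that the control flow is easy to follow.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical callback signature: the "device" hands over a block of input
// samples and an equally sized output block that the callback must fill.
using AudioCallback =
    std::function<void(const float* in, float* out, std::size_t frames)>;

// Stand-in for the audio library's I/O thread. It slices `source` into
// blocks of at most `blockSize` samples and invokes the callback once per
// block. A real library would do this on its own thread, paced by hardware.
std::vector<float> runFakeDevice(const std::vector<float>& source,
                                 std::size_t blockSize,
                                 const AudioCallback& callback) {
    std::vector<float> played(source.size(), 0.0f);
    if (blockSize == 0) return played;  // nothing sensible to do
    for (std::size_t pos = 0; pos < source.size(); pos += blockSize) {
        const std::size_t frames = std::min(blockSize, source.size() - pos);
        callback(source.data() + pos, played.data() + pos, frames);
    }
    return played;
}

// Example processor: passes the audio through unchanged and remembers the
// peak level of the most recent block, e.g. for a simple level meter.
struct PeakMeter {
    float lastPeak = 0.0f;
    std::size_t blocksSeen = 0;

    void process(const float* in, float* out, std::size_t frames) {
        float peak = 0.0f;
        for (std::size_t i = 0; i < frames; ++i) {
            out[i] = in[i];                        // play back unmodified
            peak = std::max(peak, std::fabs(in[i]));
        }
        lastPeak = peak;
        ++blocksSeen;
    }
};
```

A call such as `runFakeDevice(samples, 512, [&](const float* in, float* out, std::size_t n) { meter.process(in, out, n); })` drives the meter over a whole file. With a real audio thread, anything the UI reads (here `lastPeak`) would need to be shared safely, for example through `std::atomic` or a lock-free queue, and the callback should avoid allocating or blocking.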

You can also update your own data structures there, such as the FFT of the last x milliseconds of audio. You can then mark the FFT graphics window 'dirty' to trigger a redraw on the UI thread.
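As a rough illustration of the FFT step, the following sketch turns one block of samples into a magnitude spectrum. It is a plain recursive radix-2 FFT written for readability rather than speed, it assumes the block length is a power of two, and the Hann window is my own choice here rather than something the answer prescribes. The function names are invented for this example; a real project would more likely use an existing FFT library.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Recursive radix-2 Cooley-Tukey FFT. The size must be a power of two.
void fft(std::vector<Complex>& a) {
    const std::size_t n = a.size();
    if (n < 2) return;
    std::vector<Complex> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = a[2 * i];
        odd[i] = a[2 * i + 1];
    }
    fft(even);
    fft(odd);
    const double pi = std::acos(-1.0);
    for (std::size_t k = 0; k < n / 2; ++k) {
        // Twiddle factor exp(-2*pi*i*k/n) applied to the odd half.
        const Complex t =
            std::polar(1.0, -2.0 * pi * static_cast<double>(k) / n) * odd[k];
        a[k] = even[k] + t;
        a[k + n / 2] = even[k] - t;
    }
}

// Magnitudes of bins 0 .. N/2-1 for one Hann-windowed block of N samples.
// Bin k corresponds to the frequency k * sampleRate / N.
std::vector<double> magnitudeSpectrum(const std::vector<float>& block) {
    const std::size_t n = block.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> buf(n);
    for (std::size_t i = 0; i < n; ++i) {
        // The Hann window tapers the block edges to reduce spectral leakage.
        const double w =
            0.5 - 0.5 * std::cos(2.0 * pi * static_cast<double>(i) / n);
        buf[i] = Complex(block[i] * w, 0.0);
    }
    fft(buf);
    std::vector<double> mags(n / 2);
    for (std::size_t k = 0; k < n / 2; ++k) {
        mags[k] = std::abs(buf[k]);
    }
    return mags;
}
```

For a bar display, neighbouring bins can be summed into a handful of bands (often spaced logarithmically) and the band levels converted to decibels before drawing. For stereo input, the interleaved channels would first be separated or mixed down to mono.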

Is SDL a good choice for this, or is there a better way without taking away all the work

I would recommend JUCE, since it contains everything you need, from platform-independent audio input to a UI with graphics. There's a demo application which demonstrates all the features and also serves as a collection of code samples.