You can do your entire project on a desktop PC. In fact, if I had to do it, I would start on the desktop:
- a .wav file is already sampled at a high frequency, often 44100 or 48000 Hz.
- determining the highest frequency can be done with an FFT. For prototyping, I would link FFTW.
- downsampling to an arbitrary frequency is a bit hard, because downsampling involves low-pass filtering, and you need to set up a separate filter for each target frequency. Look at libsamplerate and see how it sets up a sinc function to convolve against.
- converting back to the original sample rate will involve another low-pass filter. Again, see libsamplerate.
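As a rough illustration of the FFT bullet above, the sketch below reports the highest frequency bin whose magnitude lies within a chosen threshold of the strongest bin. It uses a naive O(N^2) DFT from the Python standard library instead of FFTW, so it is only suitable for short test signals. The function name `highest_frequency` and the -40 dB default threshold are assumptions made for this example.

```python
import cmath
import math


def highest_frequency(samples, sample_rate, threshold_db=-40.0):
    """Return the highest frequency (Hz) whose DFT magnitude is within
    threshold_db of the strongest bin. Naive O(N^2) DFT, illustration only."""
    n = len(samples)
    half = n // 2
    mags = []
    for k in range(half + 1):
        # One DFT bin: correlate the signal with a complex exponential.
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    peak = max(mags)
    if peak == 0.0:
        return 0.0
    floor = peak * 10.0 ** (threshold_db / 20.0)
    # Scan from the top bin downward and stop at the first significant one.
    for k in range(half, -1, -1):
        if mags[k] >= floor:
            return k * sample_rate / n
    return 0.0


# Test signal: 500 Hz plus a weaker 1500 Hz tone, sampled at 8 kHz.
rate = 8000
signal = [math.sin(2 * math.pi * 500 * t / rate)
          + 0.5 * math.sin(2 * math.pi * 1500 * t / rate)
          for t in range(256)]
print(highest_frequency(signal, rate))  # 1500.0
```

Both tones fall exactly on DFT bins here, so there is no spectral leakage. Real audio would need a window function and a threshold chosen to suit the noise floor.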
I believe I would implement this in several passes, for ease of debugging:
1. Get everything working in Matlab or Octave first. Octave has libraries to do all the filtering and Fourier analysis.
2. Get everything working in C on the PC, linking FFTW and libsamplerate for the downsampling / upsampling.
3. Rewrite the C code with explicit-width variable types (e.g., int16_t instead of "short") and replace FFTW and libsamplerate with your own code so that it compiles standalone.
4. In C for the MSP430 or whatever DSP you've got, write interrupt routines to sample data on the ADC and output it on the DAC. Test that this works, just going from input to output.
5. Take the working code from step 3 and compile it for the MSP430 or whatever. Then wedge it into the working code from step 4 to operate on the sampled data between ADC and DAC.
This may seem like a lot of steps, but it is much more likely to make a working result than heroically coding everything up in one huge MSP430 application, then trying to debug it on the dev board.
Let's do some math:
16 tracks * 44 ksample/s * 2 channels * 16 bit = 22.528 Mbit/s
This is the minimum speed you need for the SPI interface if you want to transmit all the data through a single serial port. It can be done with an adequate clock, but you need a fast SD card (see here for the speed).
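The arithmetic above can be checked with a few lines of Python. This is only a sketch of the raw PCM payload rate; it ignores SPI command overhead and file-system access, and the function name `required_bitrate` is made up for this example.

```python
def required_bitrate(tracks, sample_rate_hz, channels, bits_per_sample):
    """Raw PCM bit rate in bits per second, ignoring any protocol overhead."""
    return tracks * sample_rate_hz * channels * bits_per_sample


bits_per_second = required_bitrate(16, 44_000, 2, 16)
print(bits_per_second)                      # 22528000
print(bits_per_second // 8)                 # 2816000 bytes per second
print(required_bitrate(16, 44_100, 2, 16))  # 22579200
```

The last line uses the exact CD rate of 44.1 kHz instead of the rounded 44 kHz, which raises the requirement slightly to about 22.58 Mbit/s.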
Then there is the microcontroller: it has to sum 16 tracks and output the result through a DAC. With 44*2 ksamples per second for each track, that is
$$ 44 \cdot 10^{3} \cdot 2 \cdot 16 = 1408 \cdot 10^{3} $$
16-bit additions per second (probably with some scaling to avoid overflow), or about 1.4 M operations/s. A good 32-bit microcontroller can handle that. A Cortex-M3, or better an M4, would probably work for you (though the M3 is probably better documented).
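To make the "scaling to avoid overflow" remark concrete, here is a minimal sketch of mixing one output sample from several 16-bit tracks. It assumes a wide accumulator (32 bits would be enough on an MCU), an optional right shift for scaling, and a final clamp to the 16-bit range. The name `mix_frame` and the shift-based scaling are choices made for this example, not a prescribed method.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix_frame(track_samples, shift=0):
    """Mix one sample from each track into a single 16-bit output sample."""
    acc = sum(track_samples)   # wide accumulator, cannot overflow here
    acc >>= shift              # optional scaling: shift=4 divides by 16
    # Saturate instead of wrapping around if the sum is still too large.
    return max(INT16_MIN, min(INT16_MAX, acc))


print(mix_frame([1000] * 16, shift=4))  # 1000
print(mix_frame([32767] * 16))          # 32767 (clamped)
print(44_000 * 2 * 16)                  # 1408000 additions per second
```

Dividing by the number of tracks guarantees that the sum never clips, but it also lowers the level when only a few tracks are playing. Saturating without scaling keeps the level and risks audible clipping on loud passages.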
I've just seen this, which can be clocked up to 204 MHz, has 4 SPI interfaces, up to 40 MB/s, and also has a floating-point unit that can help in the accumulation process (but may be too slow). You could also use its dual-core structure to handle the processing and the output separately.
But for the DAC I think that you should go for an external converter, specifically designed for audio (which means 16 bit probably).
Update
It's not so clear how you are going to manage the 16 different tracks on the SD card:
- what about pre-loading tracks on the internal memory of the MCU?
Check the I2S interface, which is a 4-wire serial protocol especially designed for audio applications.
Important question:
You said that you also want to record tracks and save them to the SD card: do you want to do that at the same time? You need the controller to encode the audio as WAV and store it, and the write bandwidth of the SD card is lower.
The looping feature WILL need some buffering memory (may also use the internal memory) because looping requires real time operation, and the SD card will introduce too much latency. You may need an external RAM, and you may also think about storing some data there to reduce delays.
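A quick way to size that buffer is to multiply the playback data rate by the longest stall you expect from the SD card. The sketch below does this arithmetic. The 10 ms and 100 ms latency figures are purely hypothetical placeholders, so measure your own card before relying on them. The function name `buffer_bytes` is also made up for this example.

```python
def buffer_bytes(tracks, sample_rate_hz, channels, bytes_per_sample, latency_ms):
    """Bytes of RAM needed to keep playing for latency_ms without new data."""
    bytes_per_second = tracks * sample_rate_hz * channels * bytes_per_sample
    # Round up so the buffer is never one byte short.
    return (bytes_per_second * latency_ms + 999) // 1000


print(buffer_bytes(16, 44_000, 2, 2, 10))   # 28160
print(buffer_bytes(16, 44_000, 2, 2, 100))  # 281600
```

Under these assumptions, a 10 ms stall could be covered by internal RAM on many 32-bit microcontrollers, while a 100 ms stall would probably call for external RAM.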
Best Answer
Nyquist showed you have to sample at a rate at least twice the highest frequency you care about. This captures the information in your signal, but also causes artifacts from the frequencies above half the sample rate to show up in your sampled signal. These are called aliases. You therefore need to first eliminate the frequencies that will cause aliases, then sample.
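To see where an alias lands, the following sketch folds an input frequency back into the range from 0 to half the sample rate. It is a plain statement of the folding rule, and the function name `alias_frequency` is made up for this example.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency of a tone at freq_hz after sampling at sample_rate_hz."""
    # Subtract the nearest integer multiple of the sample rate, then fold.
    nearest_multiple = round(freq_hz / sample_rate_hz) * sample_rate_hz
    return abs(freq_hz - nearest_multiple)


print(alias_frequency(7_000, 10_000))   # 3000
print(alias_frequency(3_000, 10_000))   # 3000
print(alias_frequency(12_000, 10_000))  # 2000
```

After sampling at 10 kHz, a 7 kHz tone is indistinguishable from a 3 kHz tone, which is why the offending frequencies have to be removed before the sampler rather than after it.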
Since no filter has an infinitely sharp cutoff, there will be a transition band between the highest frequency you care about and the frequency at which the anti-aliasing filter attenuates enough to give you the signal-to-noise ratio you need.
Analog filters are usually fairly gentle in their falloff. One approach is to apply a slow-falloff analog filter, sample at a high rate, then digitally filter that with a sharp filter to allow re-sampling at a lower rate. That last step is often called decimation.
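The digital filter-then-resample step can be sketched as a windowed-sinc FIR filter that is evaluated only at every Nth input sample. The example below is a minimal pure-Python version. The Hamming window, the 101 taps, the 10 kHz cutoff, and the 100 kHz to 25 kHz conversion are all assumptions picked for illustration, not values prescribed above.

```python
import math


def windowed_sinc(num_taps, cutoff_hz, sample_rate_hz):
    """Low-pass FIR taps: ideal sinc response shaped by a Hamming window."""
    fc = cutoff_hz / sample_rate_hz          # cutoff in cycles per sample
    mid = (num_taps - 1) / 2.0
    taps = []
    for i in range(num_taps):
        x = i - mid
        ideal = 2.0 * fc if x == 0 else math.sin(2.0 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)                         # normalize to unity gain at DC
    return [t / gain for t in taps]


def decimate(samples, taps, factor):
    """Filter and keep every factor-th output, skipping the start-up region."""
    out = []
    for n in range(len(taps) - 1, len(samples), factor):
        out.append(sum(taps[k] * samples[n - k] for k in range(len(taps))))
    return out


taps = windowed_sinc(101, 10_000, 100_000)
# A 40 kHz tone sampled at 100 kHz would alias if it survived the filter.
tone = [math.sin(2 * math.pi * 40_000 * n / 100_000) for n in range(1000)]
residue = max(abs(y) for y in decimate(tone, taps, 4))
print(residue < 0.01)  # True
```

Computing only the retained outputs is what makes decimation cheaper than filtering the whole stream and then discarding samples.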
For example, let's say you are after good quality voice and your highest frequency of interest is 8 kHz. You might put a two-pole R-C filter on the signal with each pole at 12 kHz. You might sample the result at 100 kHz, which means anything past 50 kHz had better be attenuated below your noise floor. The analog filter will reduce 50 kHz by 25 dB, which you decide is good enough in this case since you know there will be very little content above 50 kHz to start with.
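The 25 dB figure can be reproduced from the standard first-order low-pass magnitude response. The sketch below assumes two identical, ideal, non-interacting (buffered) R-C stages, which is a simplification of a real passive cascade. The function name `rc_cascade_attenuation_db` is made up for this example.

```python
import math


def rc_cascade_attenuation_db(freq_hz, pole_hz, poles=2):
    """Attenuation in dB of identical, ideal first-order low-pass stages."""
    # Each pole contributes 10*log10(1 + (f/fc)^2) dB of attenuation.
    per_pole = 10.0 * math.log10(1.0 + (freq_hz / pole_hz) ** 2)
    return poles * per_pole


print(round(rc_cascade_attenuation_db(50_000, 12_000), 1))  # 25.3
print(round(rc_cascade_attenuation_db(8_000, 12_000), 1))   # 3.2
```

The second line shows the price of the gentle analog filter: under these assumptions it already costs about 3 dB at the 8 kHz edge of the band of interest.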
Theoretically you can take this 100 kHz sample stream and decimate it to 16 kHz, since that's twice the highest frequency you care about. In practice, even a sharp filter, like convolving with a 1000-point sinc, needs some room to work with. Let's say 1/2 octave (that's really sharp), so the absolute minimum sample frequency after decimation would be about 23 kHz (8 kHz plus 1/2 octave is 11.3 kHz, times 2 is 22.6 kHz).
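The half-octave arithmetic generalizes to any band edge and transition width. The sketch below computes the minimum rate after decimation. It also shows the largest integer decimation factor that respects it, on the added assumption that you want to decimate by a whole number. Both function names are made up for this example.

```python
def min_decimated_rate(f_max_hz, transition_octaves):
    """Lowest sample rate that leaves the given transition band below Nyquist."""
    stopband_edge = f_max_hz * 2.0 ** transition_octaves
    return 2.0 * stopband_edge


def largest_integer_factor(input_rate_hz, min_rate_hz):
    """Largest whole-number decimation factor keeping the rate >= min_rate_hz."""
    return int(input_rate_hz // min_rate_hz)


min_rate = min_decimated_rate(8_000, 0.5)
print(round(min_rate))                             # 22627
print(largest_integer_factor(100_000, min_rate))   # 4
```

With a 100 kHz input, a factor of 4 gives 25 kHz, which sits just above the 22.6 kHz minimum.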
You gave no spec on what kind of sound you want to sample, so you'll have to extrapolate to your requirements on your own.