Let's do some math:
16 tracks * 44 ksample/s * 2 channels * 16 bit = 22.528 Mbit/s
This is the minimum data rate the SPI interface must sustain (about 2.8 MB/s) if you want to transmit all the data through a single serial port. It can be done with a fast enough SPI clock, but you need a fast SD card (see here for the speed).
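As a quick check of the arithmetic above, here is a minimal sketch that recomputes the raw data rate. The function name `required_bitrate` is just for illustration, and the result ignores any SPI or file-system overhead, so treat it as a lower bound.

```python
# Figures taken from the calculation above.
TRACKS = 16
SAMPLE_RATE_HZ = 44_000   # the text rounds the sample rate to 44 ksample/s
CHANNELS = 2
BITS_PER_SAMPLE = 16

def required_bitrate(tracks, sample_rate_hz, channels, bits_per_sample):
    """Raw payload rate in bits per second, with no protocol overhead."""
    return tracks * sample_rate_hz * channels * bits_per_sample

bits_per_second = required_bitrate(TRACKS, SAMPLE_RATE_HZ, CHANNELS, BITS_PER_SAMPLE)
bytes_per_second = bits_per_second // 8

print(f"{bits_per_second / 1e6:.3f} Mbit/s")  # 22.528 Mbit/s
print(f"{bytes_per_second / 1e6:.3f} MB/s")   # 2.816 MB/s
```

With the exact CD rate of 44.1 ksample/s the figure rises slightly, to about 22.58 Mbit/s.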
Then there is the microcontroller: you have to sum 16 tracks and output the result through a DAC. Each track delivers 44*2 ksamples per second, so you need
$$ 44 \cdot 10^{3} \cdot 2 \cdot 16 = 1408 \cdot 10^{3} $$
16-bit additions per second (probably with some scaling to avoid overflow). That is about 1.4 M operations/s, which a good 32-bit microcontroller can handle. A Cortex-M3, or better an M4, can probably work for you (although the M3 is probably better documented).
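To make the "sum with some scaling" step concrete, here is a minimal sketch of mixing one sample from each track. It is written in Python for readability rather than as MCU code, and the divide-by-16 scaling plus the saturation are my assumptions about how you might avoid overflow, not the only option.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def mix_frame(samples, shift=4):
    """Mix one sample from each track into a single 16-bit output sample.

    samples: iterable of signed 16-bit integers, one per track.
    shift:   right shift applied to the sum; 4 divides by 16, so 16
             full-scale tracks cannot overflow (an assumed scaling choice).
    """
    acc = sum(samples)   # on an MCU this would be a 32-bit accumulator
    acc >>= shift        # arithmetic shift, rounds toward minus infinity
    # Saturate in case a smaller shift is chosen to keep the mix louder.
    return max(INT16_MIN, min(INT16_MAX, acc))

# 16 full-scale tracks still fit in 16 bits after scaling.
print(mix_frame([32767] * 16))           # 32767
# Without scaling the sum would overflow, so it is clipped instead.
print(mix_frame([32767] * 16, shift=0))  # 32767
```

Dividing by the track count is the safe choice but makes a single active track quiet; a smaller shift with saturation trades headroom for volume.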
I've just seen this, which can be clocked up to 204 MHz, has 4 SPI interfaces, up to 40 MB/s, and also has a floating-point unit that can help in the accumulation process (but may be too slow). You could also use its dual-core structure to handle the processing and the output separately.
For the DAC, though, I think you should go for an external converter designed specifically for audio (which probably means 16 bit).
Update
It's not clear how you are going to manage the 16 different tracks on the SD card:
- what about pre-loading tracks on the internal memory of the MCU?
Check the I2S interface, which is a serial bus designed especially for audio applications. It uses three wires (bit clock, word select and data), and many converters add a master clock as a fourth.
Important question:
You said that you also want to record tracks and save them to the SD card: do you want to do that at the same time? The controller would then have to encode the audio as WAV and store it, and the write bandwidth of an SD card is lower than its read bandwidth.
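For uncompressed PCM, the "encoding" is mostly a matter of writing a 44-byte header in front of the raw samples. Here is a minimal sketch of that header. The function name `wav_header` and the default 44.1 kHz, stereo, 16-bit parameters are assumptions for illustration.

```python
import struct

def wav_header(data_bytes, sample_rate=44_100, channels=2, bits_per_sample=16):
    """Build the canonical 44-byte RIFF/WAVE header for uncompressed PCM.

    data_bytes is the size of the sample data that follows the header.
    """
    block_align = channels * bits_per_sample // 8   # bytes per frame
    byte_rate = sample_rate * block_align           # bytes per second
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",        # little-endian, no padding
        b"RIFF", 36 + data_bytes, b"WAVE",
        b"fmt ", 16,                 # size of the PCM format chunk
        1,                           # audio format 1 = PCM
        channels, sample_rate, byte_rate, block_align, bits_per_sample,
        b"data", data_bytes,
    )

header = wav_header(data_bytes=0)
print(len(header))  # 44
```

When recording, the final length is not known in advance, so one common approach is to write a placeholder header first and rewrite the two size fields once recording stops.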
The looping feature WILL need some buffering memory (the internal memory may be enough) because looping requires real-time operation and the SD card will introduce too much latency. You may need an external RAM, and you could also store some data there to reduce delays.
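To get a feel for how much buffering memory this means, here is a rough sizing sketch. The 100 ms latency allowance and the 4-second loop length are made-up example values, and `buffer_bytes` is a hypothetical helper; plug in figures measured on your own card.

```python
def buffer_bytes(milliseconds, sample_rate_hz=44_000, channels=2,
                 bits_per_sample=16, tracks=1):
    """RAM needed to hold `milliseconds` of audio for `tracks` tracks."""
    frames = milliseconds * sample_rate_hz // 1000
    bytes_per_frame = channels * bits_per_sample // 8
    return frames * bytes_per_frame * tracks

# Covering an assumed 100 ms SD card stall for all 16 tracks:
print(buffer_bytes(100, tracks=16))  # 281600
# Holding one assumed 4-second stereo loop entirely in RAM:
print(buffer_bytes(4000))            # 704000
```

Both figures are well beyond the few tens of kilobytes of RAM found on small microcontrollers, which is why an external RAM may be needed.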
Although the currently available versions don't have a true external address bus (it's coming), you might consider the Microchip PIC32. Its architecture is based on MIPS, dating back to 1988, which is one of the two major RISC instruction sets (the other being ARM). So in that regard it can be considered retro. (A little trivia: the Sony PlayStation used a MIPS processor.)
One of the nice features of the PIC32 (and unusual for a 32-bit microcontroller) is that you can get several varieties in a DIP package. However, the maximum memory available is limited compared to the surface-mount versions. One of the PICs with the largest memory in a 28-pin DIP package is the PIC32MX250F128, with 128 KB of Flash (program) memory and 32 KB of RAM. It is available from Digi-Key in the US and Farnell in the UK.
Although the RAM may seem limited, note that PICs are Harvard architecture, meaning the program and data address spaces are separate and programs are executed out of Flash, so you don't need a lot of RAM. (For the purists, the PIC32s are actually modified-Harvard architecture, because it is possible to run programs out of RAM.) The alternative is Von Neumann architecture (used, for example, in PCs), where there is one address space for everything and programs usually run out of RAM. The exception is that such systems typically need at least some Flash or ROM (called the BIOS in a PC) in the processor's address space to execute a boot routine that loads the OS from a mass storage device or network into RAM. The Z80 (and most microprocessors of its time) also used a Von Neumann architecture, so one had to fit both program and data into 64 KB. Some micros with a Von Neumann architecture also mapped their peripherals into the same 64 KB address space; others used separate port addressing.
Re the external bus: current PIC32s (but only in surface-mount packages, due to the number of pins) have an 8- or 16-bit wide "Parallel Master Port" (PMP) which, coupled with DMA, can transfer data back and forth automatically between the PIC's RAM and external RAM or a peripheral. However, this doesn't allow one to access the external memory directly (in the address space of the processor) or run code there. The very newest PIC32MZ family, listed but not yet in stock at Digi-Key, will have a true external address bus, up to 2 MB of Flash and 512 KB of RAM, and will run at 200 MHz.
The PIC32MX250F128 runs at 50 MHz; there are others that run at 80 MHz. It has two UART serial ports; you will need a level converter to translate their signals to RS-232 levels.
Because it is packaged as a DIP and can run without an external oscillator, all you need to get started is a 3.3 V power supply, some 0.1 µF decoupling caps and a breadboard. You can get a free C compiler and IDE from Microchip.
Once you get the processor up and running, you can add peripherals like an LCD display, buttons (even a keyboard), etc.
You can get other PIC32MXs with up to 512 KB of Flash and 128 KB of RAM, but only in surface-mount packages like TQFP and VQFN that would require you to lay out a PCB (you would have the same problem with any ARM processor).
Best Answer
An interesting option for a retro-computer is to use an FPGA; this 6809 implementation runs on a $99 Digilent Spartan-3 board. I tried it a few years ago - it worked very well with a VGA monitor and a PS/2 keyboard plugged into the FPGA board. Several similar systems have been designed, including this Apple II, which uses an Altera FPGA board.