This sounds like a job for embedded Linux.
- Persistent file system. Forget IDE (save yourself 40 pins) and
go for a board that uses a flash card.
- More RAM and Flash. Typical embedded Linux boards have RAM in
the megabytes.
As for peripherals, driving 40 servos could be a question here on its own. How are you doing this now? For the rest of your peripheral requirements, here's a board that seems to fit and has a good community as well:
http://beagleboard.org/static/beaglebone/latest/Docs/Hardware/BONE_SRM.pdf
The tool chain has a C++ compiler, and the board has SPI, UARTs, and even PWM. That is what the PDF claims, at least; you'll have to make sure that drivers for all those peripherals are available to you at the application level for whichever Linux distro you install. Hopefully the one they provide has everything you need.
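To give an idea of what "available at the application level" looks like for a UART, here is a minimal sketch using the standard POSIX termios calls. It assumes the distro's driver exposes the port as a tty device node; the function names `uart_configure` and `uart_open` are made up for illustration, and the device path is something you would have to look up for your board and distro.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Put an already-open serial descriptor into raw 8N1 mode at the given speed.
   Returns 0 on success, -1 on failure (for example, fd is not a tty). */
int uart_configure(int fd, speed_t baud)
{
    struct termios tio;

    if (tcgetattr(fd, &tio) != 0)
        return -1;

    /* Disable input translation, output processing, echo and line editing. */
    tio.c_iflag &= ~(IGNBRK | BRKINT | PARMRK | ISTRIP |
                     INLCR | IGNCR | ICRNL | IXON);
    tio.c_oflag &= ~OPOST;
    tio.c_lflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);

    /* 8 data bits, no parity, 1 stop bit; ignore modem control lines. */
    tio.c_cflag &= ~(CSIZE | PARENB | CSTOPB);
    tio.c_cflag |= CS8 | CLOCAL | CREAD;

    if (cfsetispeed(&tio, baud) != 0 || cfsetospeed(&tio, baud) != 0)
        return -1;

    return tcsetattr(fd, TCSANOW, &tio);
}

/* Open and configure a serial port. The path is board and distro specific
   (an assumption here, not something taken from the BeagleBone manual).
   Returns the descriptor, or -1 on failure. */
int uart_open(const char *path, speed_t baud)
{
    int fd = open(path, O_RDWR | O_NOCTTY);

    if (fd < 0)
        return -1;
    if (uart_configure(fd, baud) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

If `uart_open` fails for every device node you try, that is a hint the driver is missing or the pins are not muxed to the UART in your image.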
So basically, if you can port whatever you've written to a Linux PC, there's a good chance you can port it to an embedded Linux target. However, I'm willing to bet that if all you're using from C++ is <fstream> and <string>, you could probably do a C re-write and save yourself the overhead of Linux.
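As a rough illustration of what such a C re-write might look like, here is a sketch that reads a text file line by line with `fgets` and parses each line with `sscanf`, using fixed-size buffers in place of `std::string`. The "name value" line format, `struct setting`, and both function names are assumptions made up for the example; on a bare-metal target you would also need some file system library to back `fopen`.

```c
#include <stdio.h>

/* Hypothetical record, one per text line: "<name> <value>". */
struct setting {
    char name[32];
    int  value;
};

/* Parse one line of text. Returns 1 on success, 0 if the line is malformed. */
int parse_setting(const char *line, struct setting *out)
{
    /* %31s limits the copy so that name[32] always has room for the
       terminating NUL. */
    return sscanf(line, "%31s %d", out->name, &out->value) == 2;
}

/* Read up to max settings from a text file, skipping malformed lines.
   Returns the number of settings stored, or -1 if the file cannot be opened. */
int load_settings(const char *path, struct setting *out, int max)
{
    char line[80];
    int n = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;

    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        if (parse_setting(line, &out[n]))
            n++;
    }

    fclose(fp);
    return n;
}
```

Everything here lives in fixed-size buffers, so there is no heap use and no C++ runtime to drag along.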
Although the currently available versions don't have a true external address bus (it's coming), you might consider the Microchip PIC32. Its architecture is based on MIPS, dating back to 1988, and is one of the two major RISC instruction sets (the other being ARM). So in that regard it can be considered retro. (A little trivia: the Sony PlayStation used a MIPS processor.)
One of the nice features of the PIC32 (and unusual for a 32-bit microcontroller) is you can get several varieties in a DIP package, however the maximum memory available will be limited compared to the surface mount versions. One of the PICs with the largest memory in a 28-pin DIP package is the PIC32MX250F128 with 128KB of Flash (program) memory and 32KB of RAM. It is available from Digi-Key in the US, and Farnell in the UK.
Although the RAM may seem limited, note that PICs are Harvard architecture, meaning the program and data address spaces are separate, and programs are executed out of flash, so you don't need a lot of RAM. (For the purists, the PIC32s are actually modified-Harvard architecture, because it is possible to run programs out of RAM.) The other alternative is Von Neumann architecture (used, for example, in PCs), where there is one address space for everything and programs usually run out of RAM. One exception is that such systems typically need at least some Flash or ROM (called the BIOS in a PC) in the processor's address space to execute a boot routine that loads the OS from a mass storage device or network into RAM. The Z80 (and most microprocessors of its time) also used a Von Neumann architecture, so one had to fit both program and data into 64 KB. Some micros with a Von Neumann architecture also mapped their peripherals into the same 64K address space; others used separate port addressing.
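To put numbers on the difference, here is a small sketch comparing the two memory budgets. The 64 KB figure follows from the Z80's 16 address lines (2^16 = 65536 bytes), and the split figures are the 128 KB Flash and 32 KB RAM quoted above for the PIC32MX250F128. The function names and the sizes in the usage note are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define KB(n) ((size_t)(n) * 1024u)

/* Von Neumann: code and data compete for one shared address space. */
bool fits_shared(size_t code, size_t data, size_t space)
{
    return code <= space && data <= space - code;
}

/* Harvard: code is budgeted against Flash, data against RAM, independently. */
bool fits_split(size_t code, size_t data, size_t flash, size_t ram)
{
    return code <= flash && data <= ram;
}
```

For a hypothetical program with 60 KB of code and 8 KB of data, `fits_shared(KB(60), KB(8), KB(64))` is false, while `fits_split(KB(60), KB(8), KB(128), KB(32))` is true: the 32 KB of RAM only has to hold the data.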
Re the external bus, current PIC32s (but only in surface mount packages, due to the number of pins) have an 8- or 16-bit wide "Parallel Master Port" (PMP) which, coupled with DMA, can transfer data back and forth automatically between the PIC's RAM and external RAM or a peripheral. However, this doesn't allow one to access the external memory directly (in the address space of the processor) or run code there. The very newest PIC32MZ family, listed but not yet in stock at Digi-Key, will have a true external address bus, up to 2MB of Flash, 1/2 MB of RAM, and run at 200 MHz.
The PIC32MX250F128 runs at 50 MHz; there are others that run at 80 MHz. It has two serial UART ports; you will need a level converter to translate their signals to RS232 levels.
Because it is packaged as a DIP, and can run without an external oscillator, all you need to get started is a 3.3 V power supply, some 0.1 µF decoupling caps and a breadboard. You can get a free C compiler and IDE from Microchip.
Once you get the processor up and running, you can add peripherals like an LCD display, buttons (even a keyboard), etc.
You can get other PIC32MXs with up to 512KB of Flash and 128KB of RAM, but only in surface mount packages like TQFP and VQFN that would require you to lay out a PCB (you would have this same problem with any ARM processor also).
Best Answer
There are many microcontrollers out there (a microprocessor is slightly different), and an ARM is probably the first choice if you want performance. Hooking up RAM to a processor is conceptually trivial, but in practice it can be complicated if you are not an expert.
If you buy a microcontroller, the RAM is probably built in, so you can just run your software from the Flash. Hold off on operating systems for now; first play with the board a little.
If you don't have a specific requirement for a multiprocessor system (and the processors would have to be on the same chip; I don't think it is feasible with microcontrollers), then it makes no sense to go crazy with that kind of thing. And if you are still learning, I would again suggest starting with something simple: you can't expect to build a computer without having built a microcontroller board at least once.
You are probably an expert programmer, but even once you have the PCB made, programming a microcontroller requires managing the hardware, and that is also something to learn before going further.