To be honest, the line between the two is almost gone nowadays, and there are processors that can be classified as both (the Analog Devices Blackfin, for instance).
Generally speaking:
Microcontrollers are integer math processors with an interrupt subsystem. Some have hardware multiplication units, some don't, etc. The point is that they are designed for simple math, and mostly to control other devices.
DSPs are processors optimized for streaming signal processing. They often have special instructions that speed up common tasks, such as performing a multiply-accumulate in a single instruction. They also often have other vector or SIMD instructions. Historically they weren't interrupt-based systems, and they operated with non-standard memory systems optimized for their purpose, which made them more difficult to program. They were usually designed to operate in one big loop processing a data stream. DSPs can be designed as integer, fixed-point or floating-point processors.
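To make the multiply-accumulate point concrete, here is a minimal sketch in plain C of one output sample of an FIR filter in Q15 fixed point. The function name `fir_q15` and the Q15 format are my own choices for illustration, not taken from any particular DSP library. The line inside the loop is the operation a DSP typically retires in a single instruction, often alongside the two memory fetches.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one output sample of an FIR filter in Q15.
 * coeffs[] and samples[] each hold ntaps values. */
int16_t fir_q15(const int16_t *coeffs, const int16_t *samples, size_t ntaps)
{
    /* Wide accumulator, playing the role of the guard bits a DSP provides. */
    int64_t acc = 0;

    for (size_t i = 0; i < ntaps; i++) {
        /* The multiply-accumulate: Q15 * Q15 gives a Q30 product. */
        acc += (int32_t)coeffs[i] * samples[i];
    }

    /* Q30 back to Q15. Assumes an arithmetic right shift for negative
     * values, which is what common compilers do. */
    acc >>= 15;

    /* Saturate rather than wrap, as DSP hardware usually does. */
    if (acc > INT16_MAX) acc = INT16_MAX;
    if (acc < INT16_MIN) acc = INT16_MIN;
    return (int16_t)acc;
}
```

On a general purpose core without a MAC unit, each pass through that loop costs several instructions; that gap is what the special DSP instructions close.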
Historically, if you wanted to process audio streams, video streams, do fast motor control, or anything else that required processing a stream of data at high speed, you would look to a DSP.
If you wanted to control some buttons, measure a temperature, run a character LCD, control other ICs which are processing things, you'd use a microcontroller.
Today, you mostly find general purpose microcontroller-type processors with either built-in DSP-like instructions or on-chip co-processors to deal with streaming data or other DSP operations. You don't see pure DSPs used much anymore except in specific industries.
The processor market is much broader and more blurry than it used to be. For instance, I hardly consider an ARM Cortex-A8 SoC a microcontroller, but it probably fits the standard definition, especially in a PoP package.
EDIT: Figured I'd add a bit to explain when/where I've used DSPs even in the days of application processors.
A recent product I designed was doing audio processing with X channels of input and X channels of output per 'zone'. The intended use for the product meant that it would often sit there doing its thing, processing the audio channels for years without anyone touching it. The audio processing consisted of various acoustical filters and functions. The system was also "hot pluggable", with the ability to add some number of independent 'zones' all in one box. It was a total of 3 PCB designs (a mainboard, a backplane and a plug-in module) and the backplane supported 4 plug-in modules. Quite a fun project, as I was doing it solo: I got to do the system design, schematic, PCB layout and firmware.
Now, I could have done the entire thing with a single bulky ARM core; I only needed about 50 MIPS of DSP work on 24-bit fixed-point numbers per zone. But because I knew this system would operate for an extremely long time, and that it was critical that it never click or pop or anything like that, I chose to implement it with a low-power DSP per zone and a single PIC microcontroller that played the system management role. This way, even if one of the uC functions crashed, maybe from a DDoS attack on its Ethernet port, the DSP would happily just keep chugging away and it's likely no one would ever know.
So the microcontroller played the role of running the 2-line character LCD, some buttons, temperature monitoring and fan control (there were also some fairly high power audio amplifiers on each board) and even served an AJAX-style web page via Ethernet. It also managed the DSPs via a serial connection.
So that's a situation where, even in the days when I could have used a single ARM core to do everything, the design dictated a dedicated signal processing IC.
Other areas where I've run into DSPs:
*High End audio - Very high end receivers and concert quality mixing and processing gear
*Radar Processing - I've also used ARM cores for this in low end apps.
*Sonar Processing
*Real time computer vision
For the most part, the low and mid ends of the audio/video/similar space have been taken over by application processors, which combine a general purpose CPU with co-processor offload engines for various applications.
What we do for production is to first load a program into the PIC that tests out the board. A small test board independently verifies that the 3.3 V rail is within spec using a couple of comparators, and then we use the ADC on the PIC to check everything else out. We had enough pins left over to allow this (it required some extra resistors to act as voltage dividers for the voltages over 3 V).
After the tests pass, the real production code is flashed into the micro. Some additional tests are run, and the PCB is ready for assembly into a case.
This is all done via a program on the PC that only requires an operator to connect the board, click one button, and wait for the result PASS/FAIL. All test results (including ADC readings) are logged. The entire process (including the programming of the PICs via an ICD 3) is controlled via the PC program, which runs batch scripts to do the actual programming. Communication to the PIC to control the tests is done via one of the UARTs, whose pins are brought out to the test board (so in addition to the pins required for programming, we also have TX/RX as a minimum).
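As a rough sketch of the ADC check with voltage dividers mentioned above: the test firmware scales each raw reading back up through its divider ratio and compares the result against a tolerance band. Everything specific here is an assumption for illustration (the function names `rail_mv` and `rail_in_spec`, the 10-bit ADC, the 3.3 V reference, the resistor values and the 5% tolerance); it is not our actual test code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: convert a raw ADC count to the rail voltage in mV.
 * r_top and r_bottom are the divider resistors in ohms; for a rail wired
 * straight to the pin, pass r_top = 0 and any nonzero r_bottom. */
uint32_t rail_mv(uint16_t raw, uint32_t vref_mv, uint16_t full_scale,
                 uint32_t r_top, uint32_t r_bottom)
{
    /* Voltage seen at the ADC pin. */
    uint64_t pin_mv = (uint64_t)raw * vref_mv / full_scale;

    /* Undo the divider: V_rail = V_pin * (r_top + r_bottom) / r_bottom. */
    return (uint32_t)(pin_mv * ((uint64_t)r_top + r_bottom) / r_bottom);
}

/* Hypothetical helper: true if mv is within +/- tol_percent of nominal. */
bool rail_in_spec(uint32_t mv, uint32_t nominal_mv, uint32_t tol_percent)
{
    uint32_t delta = nominal_mv * tol_percent / 100;
    return mv >= nominal_mv - delta && mv <= nominal_mv + delta;
}
```

For example, with a 10-bit ADC (full scale 1023), a 3300 mV reference and a 10k/10k divider, a raw count of 775 works out to 5000 mV, which `rail_in_spec(5000, 5000, 5)` accepts. The firmware would then report each reading over the UART so the PC program can log it alongside the PASS/FAIL result.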
We set up several stations like this at our contract manufacturer.
BTW the ICD 3 is much faster than the ICD 2 (USB 2.0 vs 1.1).
PICs are as cheap as any in their class.
You can buy an entry-level 10F series PIC for a bit over 30 cents in modest volume from US distributors.
BUT I understand they have an arrangement in Asia where they sell the parts untested and the end user is responsible for testing, and the price would be MUCH less.
There are Asian manufacturers who have cloned the older style PICs and offer them at a lower price than equivalent PICs on the open market, at least. I suspect that Microchip matches them in volume prices privately. [eg I have just received a quote for LSD NiMH batteries from one of the big 3 Chinese battery makers at MOQ quantities that tend to make one's eyes water. The quote includes a written request not to tell their competitors their pricing. I doubt it's too secret in reality BUT no doubt the same thing happens in the processor area].
So, make 100,000 toys or 1,000,000 and they will probably sell you basic processors for 10 to 15 cents.
There are Asian-sourced 4-bit processors specifically aimed at the very large volume markets, but the instruction sets and architectures are very weird, they need their own tool chains, and community support like you'd get for mainstream processors is completely lacking. [For instance an old Fairchild F8 would be at home amongst them. An RCA CDP1801 would look positively mainstream].
Here is what APPEARS to be the utter bargain of the mainstream microcontroller world. I have not yet found anything that tells me how many clocks there are per instruction cycle - and some of the old ST processors such as the ST6 were serial bus based internally with maybe 10 clocks per instruction! - but even if this was the case these would still seem a bargain. At first I thought they may be end-of-line, but the ST site says they are "current" and a number of sellers have them at similar prices. They are the best features per $ microcontroller that I have ever seen.
5 x 10 bit ADC,
3 x capture compare plus
3 timers plus watchdog
IIC, UART, SPI, + ...
I've yet to find the catch. It may be speed. TBD - but low speed would be OK in many cases at the price.
$US0.91/1
$US0.64/100
$US0.39/1000
$US0.33/10,000
STM8S003K3/F3 datasheet and pricing