I've had lots of success with FreeRTOS. Combine it with an ARM Cortex dev board such as one from Olimex (available from Farnell); see This Page for a list of supported devices. A Cortex M3 will run at 75MHz and deliver over 80MIPS. ARM code is dead efficient, and some ARM Cortex devices include fixed-point math functions.
If you want even more grunt, try a BeagleBoard or Raspberry Pi.
As well as FreeRTOS, the latter will run Linux built with the CONFIG_PREEMPT_RT config option.
Aesthetically, my favorite architecture in many ways is the 14-bit series. The 16-bit PIC18Fxx architecture improves some things, but I somehow find the design less aesthetically pleasing. Which architecture you'll like better probably depends upon your design aesthetic, the extent to which you find yourself wishing things were designed differently, and the extent to which such wishing detracts from your enjoyment of working with them.
From a design perspective, there's no particular reason why code addresses and data addresses need to be the same. One thing I like about the 14-bit PICs is that adding a number to an instruction address advances by that many instructions. By contrast, on the PIC18X, each instruction takes two addresses. Consequently, computed jumps using an 8-bit selector are confined to a range of 128 instructions rather than 256. It's a small detail, but having a program counter whose lowest bit is non-functional seems unaesthetic.
Also, the PIC18xx parts add a single-cycle hardware multiply, but unfortunately since it requires one operand to be in W but puts the results in a fixed pair of other registers, it can't be used very effectively for multi-precision operations. If I had my druthers, there would be two types of multiply instructions:
- Simple multiply -- Store W into multiplier register, and store op*W into PRODH:W
- Multiply-add -- Store PRODH+op*multiplier register into PRODH:W
With such a pattern, a 16x16 operation would be rendered as:
movf OP1L,W
mul OP2L
movwf RESULT0
mula OP2H
movff OP2L,MULTR
mula OP2L
movwf RESULT1
mula OP2H
movwf RESULT2
movff PRODH,RESULT3
Further, arbitrary-length multiplies could be done with an average cost of a little over two cycles per 8x8 partial product, using the repeated pattern:
mula POSTINC0,c
addwfc POSTINC1,f,c
That pattern would multiply one multi-byte number times an 8-bit value and add the result to another multi-byte number.
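To make the arithmetic concrete, here is a small C model of that two-instruction loop. It is only a sketch: `mula` is the hypothetical instruction proposed above, not a real PIC18 opcode, and the function name `mul_acc_8xN` and its parameters are invented for illustration. Both numbers are assumed to be stored low byte first, with the accumulator one byte longer than the source.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the hypothetical "mula" + "addwfc" loop (mula is NOT a real opcode).
 * Computes acc += src * multiplier, where src is n bytes and acc is n + 1
 * bytes, both little-endian. Returns the carry out of the top byte. */
static unsigned mul_acc_8xN(uint8_t *acc, const uint8_t *src, size_t n,
                            uint8_t multiplier)
{
    uint8_t prodh = 0;   /* high byte handed from one partial product to the next */
    unsigned carry = 0;  /* models the carry flag consumed by addwfc */

    for (size_t i = 0; i < n; i++) {
        /* mula: PRODH:W = PRODH + src[i] * multiplier.
         * Worst case is 255 + 255 * 255 = 65280, so 16 bits always suffice. */
        uint16_t t = (uint16_t)(prodh + (uint16_t)src[i] * multiplier);
        uint8_t w = (uint8_t)t;
        prodh = (uint8_t)(t >> 8);

        /* addwfc: acc[i] = acc[i] + W + carry */
        unsigned sum = (unsigned)acc[i] + w + carry;
        acc[i] = (uint8_t)sum;
        carry = sum >> 8;
    }

    /* Fold the last high byte and the pending carry into the top byte. */
    unsigned top = (unsigned)acc[n] + prodh + carry;
    acc[n] = (uint8_t)top;
    return top >> 8;
}
```

The point of the model is that the intermediate value never overflows 16 bits, which is why a single PRODH register would be enough to chain partial products of any length.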
As it is, I think the best one can do for an extended multiply is to do the multiply to a destination buffer without doing a built-in add, at a cost of six cycles per 8x8 partial product, and then spend another two cycles per partial product adding that result to the previous 8xN partial result:
movf multiplier,w
mulwf POSTINC0,c
movf PRODL,w,c
addwfc POSTINC1,w
movff PRODH,INDF1
That is four times as long as what could be achieved with a slightly different instruction set. I don't know that I've seen any processor which included a function to compute PRODH+Op1*Op2, but it would be a very simple feature to include in shifter-based multipliers, and it facilitates computing arbitrary product widths with fixed hardware cost. Actually, since the PIC takes four hardware clocks per instruction, the hardware required to allow a 16xN or 32xN multiply would be pretty modest; when computing big products, a 16xN or 32xN multiply with suitable register usage would offer a 2x or 4x speedup.
Best Answer
As long as the microcontroller supports interrupts, has enough memory, and gives you full control over the state the CPU assumes when you return from an interrupt (program counter, registers, stack), full-fledged preemptive multitasking is possible. This is generally accomplished by setting up a hardware timer to generate an interrupt at regular intervals and context switching (that is, changing the currently executing task) in the interrupt service routine: back up the registers and stack of the currently running task to RAM, restore the previously backed-up state of another task from RAM, and return from the interrupt. Most microcontrollers can do this, although some 8-bit microcontrollers (like some ATtiny and PIC parts) have insufficient RAM for saving the state of multiple pending tasks.
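The save/pick/restore sequence can be modelled in plain C. This is a simulation, not real firmware: the `cpu_state_t` struct stands in for the hardware registers, the round-robin choice of next task is my assumption (the scheduling policy is up to you), and every name here is made up. On a real part the save and restore steps are usually written in assembly.

```c
#include <stdint.h>

#define NUM_TASKS 3
#define NUM_REGS  4

/* Hypothetical, simplified CPU state. A real core has more to save
 * (status flags, the task's stack contents, and so on). */
typedef struct {
    uint16_t pc;
    uint16_t sp;
    uint8_t  regs[NUM_REGS];
} cpu_state_t;

static cpu_state_t cpu;               /* stands in for the live hardware registers */
static cpu_state_t saved[NUM_TASKS];  /* one backup slot per task, kept in RAM */
static unsigned current_task = 0;

/* What the periodic timer interrupt would do on each tick. */
static void timer_tick_isr(void)
{
    saved[current_task] = cpu;                      /* back up the running task */
    current_task = (current_task + 1) % NUM_TASKS;  /* assumed policy: round robin */
    cpu = saved[current_task];                      /* restore the next task */
    /* On real hardware, "return from interrupt" now resumes at cpu.pc. */
}
```

Note that `saved[]` costs one full register set per task, which is exactly the RAM pressure that rules this out on the smallest parts.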
Whether or not this is practical is a completely different story. It introduces a lot of extra complexity, some overhead and, perhaps most importantly, unpredictable timing to the firmware.
I think your problem is better solved with cooperative multitasking (a super loop). You just need to refactor the SMS code into a non-blocking form (think state variable + switch statement), so that transmitting an SMS doesn't block the CPU from running your ADC code. Essentially, you must remove all blocking code (delays, polling loops) from both your ADC code and your SMS code.
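A minimal sketch of the "state variable + switch statement" idea follows. The three states, the flag variables standing in for the modem and ADC hardware, and all function names are assumptions for illustration; a real SMS driver would have more states and would talk to a UART.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for hardware status; real code would poll
 * the modem UART and the ADC instead of these flags. */
static bool modem_prompt_ready = false;  /* modem is ready for the message body */
static bool modem_ok_received  = false;  /* modem confirmed the send */
static unsigned adc_samples_taken = 0;

typedef enum { SMS_IDLE, SMS_WAIT_PROMPT, SMS_WAIT_OK } sms_state_t;
static sms_state_t sms_state = SMS_IDLE;
static bool sms_requested = false;

/* One non-blocking step of the SMS driver: it checks, acts, and returns.
 * It never waits in a loop or a delay. */
static void sms_step(void)
{
    switch (sms_state) {
    case SMS_IDLE:
        if (sms_requested) {
            sms_requested = false;
            /* start the send command here, then return immediately */
            sms_state = SMS_WAIT_PROMPT;
        }
        break;
    case SMS_WAIT_PROMPT:
        if (modem_prompt_ready) {
            modem_prompt_ready = false;
            /* write the message body here */
            sms_state = SMS_WAIT_OK;
        }
        break;
    case SMS_WAIT_OK:
        if (modem_ok_received) {
            modem_ok_received = false;
            sms_state = SMS_IDLE;
        }
        break;
    }
}

/* Placeholder for "collect a finished ADC conversion". */
static void adc_step(void)
{
    adc_samples_taken++;
}

/* One pass of the super loop; firmware would call this forever from main(). */
static void super_loop_once(void)
{
    adc_step();
    sms_step();
}
```

While the driver sits in `SMS_WAIT_PROMPT`, each pass through the loop still runs `adc_step()`, so the ADC keeps being serviced no matter how long the modem takes to answer.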
Even simpler would be to just service the ADC in a conversion-complete interrupt. This way you can send the SMS with a simpler, blocking driver implementation (programmed as a sequence of steps, branches and delays instead of an explicit state machine), while the ADC is still handled transparently in the "foreground" by your interrupt handler.
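The interrupt-driven variant might look roughly like this. How an ISR is declared and attached to the ADC vector depends entirely on your compiler and part, so the handler below is an ordinary function with made-up names; it takes the conversion result as a parameter so the sketch can run on a PC, whereas real code would read the ADC result register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared between the ISR and the main code, hence volatile. */
static volatile uint16_t adc_latest;
static volatile bool adc_new_sample;

/* Hypothetical conversion-complete handler. On real hardware this would be
 * registered as the ADC interrupt and would read the result register. */
static void adc_conversion_complete_isr(uint16_t result)
{
    adc_latest = result;
    adc_new_sample = true;
}

/* Called from the main (possibly blocking) code whenever convenient.
 * Returns true and copies the sample out if a new one has arrived. */
static bool adc_take_sample(uint16_t *out)
{
    if (!adc_new_sample) {
        return false;
    }
    /* On an 8-bit core a 16-bit read is not atomic; real firmware would
     * briefly disable the ADC interrupt around the next two lines. */
    *out = adc_latest;
    adc_new_sample = false;
    return true;
}
```

The blocking SMS code can then sit in its delays as long as it likes; samples arrive in the handler regardless, and the main code picks up the latest one with `adc_take_sample()` when it gets around to it.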