For professional use, your major options are IAR, Keil, or Rowley CrossWorks. Keil is owned by ARM, which may or may not give it a slight advantage. I'd say the performance of IAR and Keil is nearly identical. Rowley is the bargain of the three. Rowley also lets you use cheaper debuggers, such as the J-Link. You might be able to use the J-Link with IAR as well, but I think Keil forces you to use their ULINK products, which can be a bit more expensive. As for support, I believe Rowley's is purely through their website, while IAR and Keil offer a year or so of phone support. From what I've been told, Keil seems to offer better support in the US, while IAR is more focused on Europe. I've used Keil without any issues, and support was good. That said, any of these three will probably perform just as well.
Another answer: Stop using interrupts.
People jump to use interrupts too easily. Personally, I rarely use them because they actually waste a lot of time, as you are discovering.
It's often possible to write a main loop that polls everything so rapidly that its latency is within spec and very little time is wasted.
/* The conditions below are placeholders for real status checks. */
for (;;)
{
    if (serial_bit_ready)
    {
        // shift serial bit into a byte
    }
    if (serial_byte_ready)
    {
        // decode serial data
    }
    if (enough_serial_bytes_available)
    {
        // more decoding
    }
    if (usb_queue_not_empty)
    {
        // handle USB data
    }
}
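To make the first check in that loop concrete, here is a minimal sketch of what "shift serial bit into a byte" might look like. The `serial_rx_t` struct and `poll_serial_bit` function are hypothetical names, and the bit is passed in as an argument purely for illustration; on real hardware the "bit ready" condition and the bit value would come from a peripheral status register or pin.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receiver state (not from any vendor library). */
typedef struct {
    uint8_t shift_reg;   /* bits collected so far, MSB first */
    uint8_t bit_count;   /* number of bits currently in shift_reg */
    uint8_t last_byte;   /* most recently completed byte */
    bool    byte_ready;  /* set when last_byte holds fresh data */
} serial_rx_t;

/* One pass of the serial-bit check from the polling loop. */
void poll_serial_bit(serial_rx_t *rx, bool bit_ready, bool bit_value)
{
    if (!bit_ready) {
        return;  /* nothing to do this pass; move on to the next check */
    }

    /* Shift the new bit in at the least significant position. */
    rx->shift_reg = (uint8_t)((rx->shift_reg << 1) | (bit_value ? 1u : 0u));

    if (++rx->bit_count == 8u) {
        /* A full byte has arrived; hand it to the byte-level check. */
        rx->last_byte  = rx->shift_reg;
        rx->byte_ready = true;
        rx->shift_reg  = 0;
        rx->bit_count  = 0;
    }
}
```

The important property is that the function returns immediately when there is nothing to do, so an idle check costs only a test and a branch.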
Some things in the loop might happen far more often than others, such as the incoming bits. In that case, repeat those tests within the loop so that more of the processor's time is dedicated to that task.
/* The serial-bit check is repeated between the slower checks. */
for (;;)
{
    if (serial_bit_ready)
    {
        // shift serial bit into a byte
    }
    if (serial_byte_ready)
    {
        // decode serial data
    }
    if (serial_bit_ready)
    {
        // shift serial bit into a byte
    }
    if (enough_serial_bytes_available)
    {
        // more decoding
    }
    if (serial_bit_ready)
    {
        // shift serial bit into a byte
    }
    if (usb_queue_not_empty)
    {
        // handle USB data
    }
}
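If the list of checks grows, one way to keep the weighting visible is a small schedule table in which the frequent check simply appears more often. This is only a sketch of that idea, not something the loop above requires; the task functions and counters are hypothetical stand-ins for real work.

```c
#include <stddef.h>

typedef void (*task_fn)(void);

/* Hypothetical counters standing in for the real work of each check. */
unsigned bit_polls, byte_polls, decode_polls, usb_polls;

void poll_bit_task(void)    { bit_polls++;    }
void poll_byte_task(void)   { byte_polls++;   }
void poll_decode_task(void) { decode_polls++; }
void poll_usb_task(void)    { usb_polls++;    }

/* The time-critical check is interleaved between the slower ones. */
static const task_fn schedule[] = {
    poll_bit_task, poll_byte_task,
    poll_bit_task, poll_decode_task,
    poll_bit_task, poll_usb_task,
};

/* One full pass over the schedule; the main loop would call this forever. */
void run_schedule_once(void)
{
    for (size_t i = 0; i < sizeof schedule / sizeof schedule[0]; i++) {
        schedule[i]();
    }
}
```

The table costs an indirect call per check, so on a very tight timing budget the hand-unrolled loop may still be the better choice.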
There might be some events for which the latency of this approach is too high, for example an event that needs very accurate timing. In that case, handle that one event in an interrupt and keep everything else in the loop.
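A common shape for that hybrid is an interrupt handler that does only the time-critical part and sets a flag, with the loop picking up the rest. The sketch below assumes a hypothetical handler named `timer_isr`; the real handler name and how it gets attached to the vector table depend on your part and toolchain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared between the ISR and the main loop, hence volatile. */
volatile bool     timer_event_pending = false;
volatile uint32_t timer_ticks = 0;

uint32_t events_handled = 0;  /* touched only by the main loop */

/* Hypothetical ISR: do the accurately timed work here and keep it short. */
void timer_isr(void)
{
    timer_ticks++;
    timer_event_pending = true;
}

/* One pass of the main loop. */
void main_loop_pass(void)
{
    if (timer_event_pending) {
        /* Clear the flag first so an interrupt that arrives while the
           event is being handled is picked up on the next pass. */
        timer_event_pending = false;
        events_handled++;  /* stand-in for the non-urgent follow-up work */
    }
    /* ... the remaining polled checks go here ... */
}
```

On a host machine you can exercise this by calling `timer_isr()` directly and then `main_loop_pass()`; on the target, the hardware calls the handler.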
Best Answer
For ARM Cortex-M processors there is CMSIS (the Cortex Microcontroller Software Interface Standard). CMSIS is intended to provide a standard hardware abstraction layer that any Cortex-M vendor can use. TI has almost certainly created software that provides CMSIS support for its processors, so you should look for that as well. The CMSIS documentation and templates are available from cmsis.arm.com.