Raising and trapping exceptions has been around for quite some time.
This site says exceptions were introduced in PL/I:
http://www.math.grin.edu/~rebelsky/Courses/CS302/98S/Outlines/outline.02.html
which was in 1967, according to this page (includes an extensive but
not exhaustive chart of computer languages and features):
http://community.borland.com/article/0,1410,22741,00.html
Many languages picked up this technique -- ADA, ALGOL, FORTRAN, ML [...]
Quoted from here.
Wikipedia has more detail about exception handling in PL/1. That page also refers to PL/1 being the first. Of course, this is no scientific proof :-)
As for who in person designed PL/1, the article mentions no names, only various committees at IBM.
Allow me to expand on @Tymski's nice answer.
Let's start with a review of hardware interrupts.
These can occur at any time (assuming they are enabled) and are thus asynchronous to the current execution stream. The CPU accepts hardware interrupts by listening to external lines in parallel with instruction stream execution. When an interrupt request is detected on these external lines, the processor allows execution to continue until it reaches a good breaking point. For example, a pipelined processor will allow the instructions already started to finish, while not starting any new ones. When it is ready, the CPU starts interrupt processing. This can be a relatively complex process; it, or some part of it, is sometimes referred to as a context switch. User mode execution ceases. Privileged mode (i.e. the OS) is entered. The current context is saved, mostly the CPU registers holding user mode values, usually somewhere in the privileged context. The processor then transfers control to a low-level interrupt handler. That privileged handler then determines the next stream to execute, such as an appropriate, somewhat higher-level interrupt handler.
A software interrupt is very similar in mechanism, with the main difference being that it occurs by the execution of a software interrupt instruction, sometimes called a trap. So, these occur synchronously to the currently executing instruction stream. The same general context switch from user mode to privileged mode is performed borrowing the same hardware, which is one reason it is called an interrupt. These traps are typically used for user mode code to accomplish system calls.
A software exception can refer to the same thing, except that rather than being triggered by a software interrupt instruction, it is triggered by an abnormal condition that the CPU detects while executing the current instruction stream, such as a null pointer dereference or an integer divide by zero.
An exception, the term used alone, usually refers to a programming language mechanism for detecting and handling synchronous errors in the current thread. These can involve a software exception (e.g. be triggered by the CPU for certain instructions), or can be detected by an explicit test inserted into the executing code by a compiler or JIT, or by user or library software that explicitly throws.
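To make the last two cases concrete, here is a small Python sketch. It is only an illustration: `checked_divide` and `describe` are names I made up, and as far as I know CPython raises `ZeroDivisionError` from a test inside the interpreter rather than from a CPU trap, so the first call stands in for the "test inserted by the runtime" case and the second for "library code that explicitly throws".

```python
import operator


def checked_divide(a, b):
    # Explicit test written by user/library code, followed by an explicit throw.
    if b == 0:
        raise ValueError("divisor must be non-zero")
    return a / b


def describe(fn, *args):
    # Both kinds of error arrive through the same language-level mechanism
    # and are handled synchronously in the current thread.
    try:
        return f"ok: {fn(*args)}"
    except ZeroDivisionError as exc:
        return f"runtime-detected: {exc}"
    except ValueError as exc:
        return f"explicitly thrown: {exc}"


print(describe(operator.truediv, 1, 0))  # detected by the runtime's own check
print(describe(checked_divide, 1, 0))    # detected by our explicit test
```

Either way, the handler sees an ordinary exception object; where the condition was first detected is invisible at the `except` site.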
As @Tymski says, these terms are often interchanged, sometimes erroneously, but also sometimes due to context.
This almost always fails for at least one of your callers, for whom this behaviour is incredibly irritating. Don't assume you know best. Tell your users what's happening, not what you assume they should do about it. In many cases it's already clear what a sane course of action should be (and, if it's not, make a suggestion in your user manual).
For example, even the exceptions given in your question demonstrate your broken assumption: a ServiceTemporaryUnavailable equates to "try again later", and RateLimitExceeded equates to "woah there chill out, maybe adjust your timer parameters, and try again in a few minutes". But the user may well want to raise some sort of alarm on ServiceTemporaryUnavailable (which indicates a server problem), and not for RateLimitExceeded (which doesn't).
Give them the choice.
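A rough sketch of what that choice could look like in Python. The two exception names come from the question; everything else (`call_service`, its `state` argument, `caller_with_alarm`) is a hypothetical stand-in I invented for illustration, not a real API.

```python
class ServiceTemporaryUnavailable(Exception):
    """Server-side problem; the request may succeed later."""


class RateLimitExceeded(Exception):
    """The caller sent too many requests; the server itself is fine."""


def call_service(state):
    # Hypothetical client call: reports what happened, not what to do about it.
    if state == "down":
        raise ServiceTemporaryUnavailable("backend unavailable")
    if state == "throttled":
        raise RateLimitExceeded("too many requests")
    return "response"


def caller_with_alarm(state, alarms):
    # One possible caller policy: alarm on server trouble, stay quiet on throttling.
    try:
        return call_service(state)
    except ServiceTemporaryUnavailable as exc:
        alarms.append(str(exc))
        return None
    except RateLimitExceeded:
        return None
```

Another caller is free to treat both as "retry later" with a single `except` clause; the point is that the library leaves that decision to them.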