C++ Exceptions – Using Exceptions as Asserts or Errors


I'm a professional C programmer and a hobbyist Obj-C programmer (OS X). Recently I've been tempted to expand into C++, because of its very rich syntax.

So far I haven't dealt much with exceptions in my coding. Objective-C has them, but Apple's policy is quite strict:

Important You should reserve the use of exceptions for programming or unexpected runtime errors such as out-of-bounds collection access, attempts to mutate immutable objects, sending an invalid message, and losing the connection to the window server.

C++ seems to prefer using exceptions more often. For example, the Wikipedia example on RAII throws an exception if a file can't be opened, where Objective-C would return nil with an error reported through an out parameter. Notably, it seems std::ofstream can be configured either way.

Here on programmers I've found several answers either proclaiming to use exceptions instead of error codes or to not use exceptions at all. The former seem more prevalent.

I haven't found anyone doing an objective study for C++. It seems to me that since pointers are rare, I'd have to go with internal error flags if I choose to avoid exceptions. Will it be too much bother to handle, or will it perhaps work even better than exceptions? A comparison of both cases would be the best answer.

Edit: Though not completely relevant, I probably should clarify what nil is. Technically it's the same as NULL, but the thing is, it's ok to send a message to nil. So you can do something like

NSError *err = nil;
id obj = [NSFileHandle fileHandleForReadingFromURL:myurl error:&err];

[obj retain];

even if the first call returned nil. And as you never do *obj in Obj-C, there's no risk of a NULL pointer dereference.

Best Answer

C++ seems to prefer using exceptions more often.

I would suggest actually less than in Objective-C in some respects, because the C++ standard library generally does not throw on programmer errors like out-of-bounds access of a random-access sequence in its most common design form (operator[], e.g.) or dereferencing an invalid iterator. The language itself doesn't throw on accessing an array out of bounds, dereferencing a null pointer, or anything of that sort.

Taking programmer mistakes largely out of the exception-handling equation removes a very large category of errors that other languages often respond to by throwing. In such cases C++ tends to assert (which is compiled out of release/production builds and only kept in debug builds) or just glitch out (often crashing). This is probably in part because the language doesn't want to impose the cost of the runtime checks required to detect such programmer mistakes unless the programmer specifically opts to pay for them by writing those checks themselves.

Sutter even encourages avoiding exceptions in such cases in C++ Coding Standards:

The primary disadvantage of using an exception to report a programming error is that you don't really want stack unwinding to occur when you want the debugger to launch on the exact line where the violation was detected, with the line's state intact. In sum: There are errors that you know might happen (see Items 69 to 75). For everything else that shouldn't, and it's the programmer's fault if it does, there is assert.

That rule isn't necessarily set in stone. In some more mission-critical cases, it might be preferable to use, say, wrappers and a coding standard that uniformly logs where programmer mistakes occur and throws in the presence of mistakes like trying to dereference something invalid or access it out of bounds, because it might be too costly to fail to recover in those cases if the software has a chance to. But overall, the more common use of the language tends to favor not throwing in the face of programmer mistakes.

External Exceptions

Where I see exceptions encouraged most often in C++ (by the standards committee, for example) is for "external exceptions": unexpected results from some external source outside the program. An example is failing to allocate memory. Another is failing to open a critical file required for the software to run. Another is failing to connect to a required server. Another is a user jamming an abort button to cancel an operation whose common case execution path expects to succeed absent this external interruption. All of these things are outside the control of the immediate software and the programmers who wrote it. They're unexpected results from external sources that prevent the operation (which should really be thought of as an indivisible transaction, in my book) from being able to succeed.

Transactions

I often encourage looking at a try block as a "transaction" because transactions should succeed as a whole or fail as a whole. If we're trying to do something and it fails halfway through, then any side effects/mutations made to the program state generally need to be rolled back to put the system back into a valid state as though the transaction was never executed at all, just as an RDBMS which fails to process a query halfway through should not compromise the integrity of the database. If you mutate program state directly in said transaction, then you must "unmutate" it on encountering an error (and here scope guards can be useful with RAII).

The much simpler alternative is not to mutate the original program state at all: mutate a copy of it, and if the operation succeeds, swap the copy with the original (ensuring the swap cannot throw). If it fails, discard the copy. This applies even if you don't use exceptions for error handling in general. A "transactional" mindset is key to proper recovery if program-state mutations have occurred prior to encountering an error. The operation either succeeds as a whole or fails as a whole; it does not halfway succeed in making its mutations.

This is bizarrely one of the least frequently discussed topics when I see programmers asking about how to properly do error or exception handling, yet it is the most difficult of them all to get right in any software that wants to directly mutate program state in many of its operations. Purity and immutability can help here to achieve exception-safety just as much as they help with thread-safety, as a mutation/external side effect which does not occur need not be rolled back.

Performance

Another guiding factor in whether or not to use exceptions is performance, and I don't mean in some obsessive, penny-pinching, counter-productive way. A lot of C++ compilers implement what's called "Zero-Cost Exception Handling".

It offers zero runtime overhead for an error-free execution, which surpasses even that of C return-value error handling. As a trade-off, the propagation of an exception has a large overhead.

According to what I've read about it, it makes your common case execution paths require no overhead (not even the overhead that normally accompanies C-style error code handling and propagation), in exchange for heavily skewing the costs towards the exceptional paths (which means throwing is now more expensive than ever).

"Expensive" is a bit hard to quantify but, for starters, you probably don't want to be throwing a million times in some tight loop. This kind of design assumes that exceptions aren't occurring left and right all the time.

Non-Errors

And that performance point brings me to non-errors, a category that is surprisingly fuzzy if we look across other languages. But I would say, given the zero-cost EH design mentioned above, that you almost certainly do not want to throw in response to a key not being found in a set. Not only is that arguably a non-error (the person searching for the key might have built the set and expect to search for keys that don't always exist), but it would be enormously expensive in that context.

For example, a set intersection function might want to loop through two sets and search for keys they have in common. If failing to find a key threw, you'd be looping through and might be encountering exceptions in half or more of the iterations:

Set<int> set_intersection(const Set<int>& a, const Set<int>& b)
{
    Set<int> intersection;
    for (int key: a)
    {
        try
        {
            b.find(key); // throws KeyNotFoundException if absent
            intersection.insert(key);
        }
        catch (const KeyNotFoundException&)
        {
            // Do nothing.
        }
    }
    return intersection;
}

That above example is absolutely ridiculous and exaggerated, but I have seen, in production code, people coming from other languages using exceptions in C++ somewhat like this, and I think it's fair to say this is not an appropriate use of exceptions in C++ whatsoever. Another hint above is that the catch block has absolutely nothing to do and is written only to forcibly ignore such exceptions; that's usually a sign (though not a guarantee) that exceptions are not being used very appropriately in C++.

For those types of cases, some type of return value indicating failure (anything from returning false to an invalid iterator or nullptr or whatever makes sense in the context) is usually far more appropriate, and also often more practical and productive since a non-error type of case usually doesn't call for some stack unwinding process to reach the analogical catch site.

Questions

I'd have to go with internal error flags if I choose to avoid exceptions. Will it be too much bother to handle, or will it perhaps work even better than exceptions? A comparison of both cases would be the best answer.

Avoiding exceptions outright in C++ seems extremely counter-productive to me, unless you're working in some embedded system or a particular type of case which forbids their use (in which case you'd also have to go out of your way to avoid all library and language functionality that would otherwise throw, like strictly using nothrow new).

If you absolutely have to avoid exceptions for whatever reason (ex: working across C API boundaries of a module whose C API you export), many might disagree with me but I'd actually suggest using a global error handler/status like OpenGL with glGetError(). You can make it use thread-local storage to have a unique error status per thread.

My rationale for that is that, unfortunately, I'm not used to seeing teams in production environments thoroughly check for all possible errors when error codes are returned. Some C APIs can encounter an error on just about every single call, and thorough checking would require something like:

if ((err = ApiCall(...)) != success)
{
     // Handle error
}

... with almost every single line of code invoking the API requiring such checks. Yet I've not had the fortune of working with teams that thorough. They often ignore such errors half, sometimes even most, of the time. That's the biggest appeal of exceptions to me: if we wrap this API and make it uniformly throw on encountering an error, the exception cannot possibly be ignored, and in my view and experience, that is where the superiority of exceptions lies.

But if exceptions cannot be used, then the global, per-thread error status at least has the advantage (a huge one compared to returning error codes, to me) that a sloppy codebase has a chance to catch a former error somewhat later than when it occurred, instead of missing it outright and leaving us completely oblivious to what happened. The error might have occurred a few lines earlier, or in a previous function call, but provided the software hasn't crashed yet, we can start working our way backwards to figure out where and why it occurred.

It seems to me that since pointers are rare, I'd have to go with internal error flags if I choose to avoid exceptions.

I wouldn't necessarily say pointers are rare. There are even methods now in C++11 and onwards to get at the underlying data pointers of containers, and a new nullptr keyword. It's generally considered unwise to use raw pointers to own/manage memory if you can use something like unique_ptr instead given how critical it is to be RAII-conforming in the presence of exceptions. But raw pointers that don't own/manage memory aren't necessarily considered so bad (even from people like Sutter and Stroustrup) and sometimes very practical as a way to point to things (along with indices that point to things).

They're arguably no less safe than the standard container iterators (at least in release, absent checked iterators) which will not detect if you try to dereference them after they're invalidated. C++ is still unashamedly a bit of a dangerous language, I'd say, unless your specific use of it wants to wrap everything and hide even non-owning raw pointers away. It is almost critical with exceptions that resources conform to RAII (which generally comes at no runtime cost), but other than that it's not necessarily trying to be the safest language to use in favor of avoiding costs that a developer doesn't explicitly want in exchange for something else. The recommended use isn't trying to protect you from things like dangling pointers and invalidated iterators, so to speak (otherwise we'd be encouraged to use shared_ptr all over the place, which Stroustrup vehemently opposes). It's trying to protect you from failing to properly free/release/destroy/unlock/clean up a resource when something throws.