Exception Handling – Failure Atomicity in Desktop Applications

Tags: desktop application, exceptions, java, swing

When it comes to exception handling, there are many guidelines and best practices on the web. One of them is "throw early, catch late", or even "don't catch". Under this approach, when an exception occurs, the current use case is abandoned: the exception bubbles up until some global exception handler catches it, logs it and shows an error to the user.

I like this approach and I would like to refactor our application according to it. However, it is a desktop application (Java Swing), and I am not sure whether the approach is more applicable to web applications and less to desktop applications. For example, when Victor Rentea says the following, he is talking specifically about web apps:

In web apps today, when handling exceptions we don't recover, we die!

He doesn't give any reason why he talks only about web apps.
My own understanding is that for web applications, it is easier to achieve failure atomicity, i.e. leaving the application in a consistent state in case of failures.

Processing a request on the server typically does not alter the state, except in a DB, and that can be easily made atomic using transactions. When encountering an exception, the transaction is rolled back, and an error response indicating the failure is sent to the client. The server remains in the same state as it was before the request.

In a desktop application, things can be very different. There can be long-running background tasks constantly updating the UI to show intermediate progress. Or there could be some low-level "drawing" logic, placing individual shapes like circles and rectangles on the screen (e.g. paint() in Java Swing). Aborting such a workflow somewhere in the middle very likely leaves the application or the UI in an inconsistent state.

Maybe it is possible to improve failure atomicity by applying the Functional Core, Imperative Shell pattern, but it is hard to enforce in frameworks which are not designed for functional programming such as Java Swing.

So what are some best practices for exception handling that are applicable to desktop applications? Is it still "throw early, catch late", only that "late" means as late as possible but no later, which is earlier than in web applications? How can one find the "latest" places where exceptions must be caught, and enforce that they are indeed caught there?

Or are there any other best practices specific to desktop applications? As I said, I am using Java Swing, but that shouldn't be relevant.

Best Answer

Different applications have different boundaries where failures can be contained. In a web app backend this is typically fairly simple because each request can be handled separately. If there's an exception while handling one request, that request can fail and the server can continue with the next request. There is no state kept from one request to the next.

GUI applications are typically far more complicated and often feature complex object graphs that make it difficult to find such clear boundaries. But it's not necessarily impossible. For example, a failure within a button click event handler can perhaps be caught within that handler, so that the application can continue running and handle future button clicks just fine.
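A minimal sketch of such a handler-level boundary in Swing might look like the following. The `ContainedActions` helper and its `contain` method are hypothetical names introduced here for illustration; only `ActionListener` and `ActionEvent` come from the JDK. Catching only `RuntimeException` is a deliberate assumption: serious `Error`s such as `OutOfMemoryError` still propagate.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.util.function.Consumer;

// Hypothetical helper: wraps a listener so that a failure in one action
// is reported to a callback instead of escaping the handler.
final class ContainedActions {
    private ContainedActions() {}

    static ActionListener contain(ActionListener delegate,
                                  Consumer<RuntimeException> onFailure) {
        return (ActionEvent event) -> {
            try {
                delegate.actionPerformed(event);
            } catch (RuntimeException e) {
                // Abandon only this action; later clicks are handled normally.
                onFailure.accept(e);
            }
        };
    }
}
```

It could be used as `button.addActionListener(ContainedActions.contain(e -> save(), err -> showError(err)))`, where `save` and `showError` are placeholders for application code. This only keeps the application consistent if the wrapped action itself leaves the model in a valid state when it fails.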

There are some architecture-level ideas that can help find boundaries at which errors can be meaningfully contained. In Domain-Driven Design, we have the idea of a bounded context. You shouldn't share data models across a context boundary, so that each context can have its own models. At the boundaries, the data is typically translated/transferred via data transfer objects. Exceptions are hidden return values, and should be handled similarly: translated or handled at the boundary rather than leaking across it. In a GUI application, the GUI part and the business logic parts are ideally decoupled. This is commonly achieved through architectures such as MVC, MVP, or MVVM. Whatever the component that mediates between the GUI/View and the business logic/model is called, this component can potentially handle errors and let the user continue.

Not all errors can be handled, and sometimes crashing the application is the only reasonable result. For example, a GUI application might have errors not from the model/business logic but from the GUI/view components. Or a web server might have errors not from the request handler but from the accept loop. Those cannot be contained as easily. A GUI might be more flexible because it could retry reloading an entire screen, but that's far more invasive than just failing a single action or event handler.
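For errors that escape every boundary, a last-resort handler like the one described in the question can at least log and report them. The sketch below uses the JDK's `Thread.setDefaultUncaughtExceptionHandler`; the `LastResortHandler` class and its `install` method are hypothetical names. As far as I know, exceptions that escape an event handler on the Swing event dispatch thread also reach this handler in current JDKs, but treat that as an assumption to verify for your runtime.

```java
import java.util.function.Consumer;

// Hypothetical helper: installs a process-wide handler of last resort.
final class LastResortHandler {
    private LastResortHandler() {}

    static void install(Consumer<Throwable> report) {
        // Called for any thread that terminates due to an uncaught exception
        // and has no more specific handler of its own.
        Thread.setDefaultUncaughtExceptionHandler(
                (thread, error) -> report.accept(error));
    }
}
```

The `report` callback would typically log the error and show a dialog. Whether the application should keep running afterwards depends on whether the failed operation could have left state half-modified, which this handler has no way of knowing.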

You are completely right that errors can only be ignorable/retryable if the underlying logic works in an “atomic”/transactional manner and does not leave the data model in a partly-modified, potentially illegal state. Object-oriented systems are a partial solution to this problem. Ideally, each object ensures that it cannot enter illegal states and always represents a valid state. Especially in C++, it is common to think about different levels of exception safety guaranteed by an object. At the least, each object should ensure basic safety, meaning that it won't enter unwanted states or leak resources. This is fairly straightforward to achieve, e.g. using RAII/try-with-resources. Some objects offer strong exception safety, which means that operations either succeed entirely or leave the object in its original state. This typically requires explicit care by the programmer as soon as multiple variables are involved.

As a super trivial example, this change to the current object might not be exception-safe:

this.a = foo();
this.b = bar(); // might throw, having updated "a" but not "b"

In contrast, deferring the assignments until all fallible operations are done might provide strong exception safety:

var a = foo();
var b = bar();
this.a = a;
this.b = b;
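
To make the difference observable, here is a small runnable sketch of the same idea. The `Settings` class and its methods are made up for illustration, and `Integer.parseInt` stands in for any operation that can fail.

```java
// Hypothetical class, for illustration only.
final class Settings {
    private String name = "default";
    private int size = 10;

    // Not strongly exception-safe: if parsing fails, name is already overwritten.
    void updateUnsafe(String newName, String newSize) {
        this.name = newName;
        this.size = Integer.parseInt(newSize); // may throw NumberFormatException
    }

    // Strong exception safety: do everything that can fail first, then assign.
    void updateSafe(String newName, String newSize) {
        int parsed = Integer.parseInt(newSize); // may throw; nothing modified yet
        this.name = newName;                    // plain assignments cannot throw
        this.size = parsed;
    }

    String name() { return name; }
    int size() { return size; }
}
```

After a failed `updateUnsafe("x", "abc")` the object holds the new name with the old size, whereas a failed `updateSafe("x", "abc")` leaves both fields as they were.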

But objects do not live in isolation. They form a complex object graph. Changes to this graph are difficult to undo. That is where a “functional core” design can be very valuable. In such a design, the business logic prepares changes or new states, but does not apply them. An outer layer of the system is responsible for updating the state and performing necessary I/O. In a sense, this turns all interactions in the system into a request/response model that has clear failure boundaries, as shown with web servers. In practice, I find this very difficult to pull off properly when there are complex data flows.
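
As a rough sketch of that split, assuming a toy shopping-cart model (all class and method names below are hypothetical), the core computes a new immutable state or throws, and the shell swaps in the result with a single assignment:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical immutable state object; the core never mutates it.
final class Cart {
    final List<String> items;
    final int totalCents;

    Cart(List<String> items, int totalCents) {
        this.items = List.copyOf(items); // defensive, unmodifiable copy
        this.totalCents = totalCents;
    }
}

// Functional core: pure functions that return a new state or throw.
final class CartCore {
    private CartCore() {}

    static Cart addItem(Cart cart, String item, int priceCents) {
        if (priceCents < 0) {
            throw new IllegalArgumentException("negative price: " + priceCents);
        }
        List<String> items = new ArrayList<>(cart.items);
        items.add(item);
        // addExact throws ArithmeticException on overflow instead of wrapping.
        return new Cart(items, Math.addExact(cart.totalCents, priceCents));
    }
}

// Imperative shell: the only place where the application state changes.
final class CartShell {
    private Cart current = new Cart(List.of(), 0);

    boolean tryAddItem(String item, int priceCents) {
        try {
            // Single reference assignment, reached only if the core succeeded.
            current = CartCore.addItem(current, item, priceCents);
            return true;
        } catch (RuntimeException e) {
            // current still refers to the previous, valid state.
            return false;
        }
    }

    Cart current() { return current; }
}
```

Because the core never touches `current`, a failure anywhere inside `addItem` cannot leave a half-updated cart behind. In a real application the shell would also report the error and update the view, and that is where the complex data flows make this harder than the sketch suggests.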

The “functional core” part of the functional core/imperative shell design only applies to the core business logic (compare the Onion Architecture). This fits reasonably well with applications that already follow an MVC pattern or similar. Thus, the usage of GUI frameworks like Swing or the concept of background tasks does not pre-empt such an architecture.

Of course, many real-world applications do not keep a strong distinction between business logic and the GUI, which makes it difficult to insert a “functional core”. But such applications are already a tangled mess, and there's no chance of creating error boundaries in any systematic way. At most, individual operations that are known to be safe to fail can be handled.
