Object-Oriented Architecture – APIs and Functional Programming

Tags: architecture, functional programming, object-oriented

From my (admittedly limited) exposure to functional programming languages, such as Clojure, it seems that encapsulation of data has a less important role. Usually various native types such as maps or sets are the preferred currency of representing data, over objects. Furthermore, that data is generally immutable.

For example, here's one of the more famous quotes from Rich Hickey of Clojure fame, in an interview about the matter:

Fogus: Following that idea—some people are surprised by the fact that Clojure does not engage in data-hiding encapsulation on its types. Why did you decide to forgo data-hiding?

Hickey: Let’s be clear that Clojure strongly emphasizes programming to abstractions. At some point though, someone is going to need to have access to the data. And if you have a notion of “private”, you need corresponding notions of privilege and trust. And that adds a whole ton of complexity and little value, creates rigidity in a system, and often forces things to live in places they shouldn’t. This is in addition to the other losing that occurs when simple information is put into classes. To the extent the data is immutable, there is little harm that can come of providing access, other than that someone could come to depend upon something that might change. Well, okay, people do that all the time in real life, and when things change, they adapt. And if they are rational, they know when they make a decision based upon something that can change that they might in the future need to adapt. So, it’s a risk management decision, one I think programmers should be free to make. If people don’t have the sensibilities to desire to program to abstractions and to be wary of marrying implementation details, then they are never going to be good programmers.

Coming from the OO world, this seems to complicate some of the enshrined principles I've learned over the years. These include Information Hiding, the Law of Demeter, and the Uniform Access Principle, to name a few. The common thread is that encapsulation lets us define an API that tells others what they should and shouldn't touch. In essence, it creates a contract that allows the maintainer of some code to freely make changes and refactorings without worrying about introducing bugs into the consumer's code (Open/Closed principle). It also provides a clean, curated interface that tells other programmers which tools they can use to get at or build upon that data.

When the data is allowed to be directly accessed, that API contract is broken and all those encapsulation benefits seem to go away. Also, strictly immutable data seems to make passing around domain-specific structures (objects, structs, records) much less useful in the sense of representing a state and the set of actions that can be performed on that state.

How do functional codebases address these issues, which seem to come up once a codebase grows so large that APIs need to be defined and many developers are involved in working on specific parts of the system? Are there examples available that demonstrate how this is handled in these types of codebases?

Best Answer

First of all, I'm going to second Sebastian's comments on what is functional programming proper and what is dynamic typing. More generally, Clojure is one flavor of functional language and community, and you shouldn't generalize too much based on it. I'll make some remarks from more of an ML/Haskell perspective.

As Basile mentions, the concept of access control does exist in ML/Haskell, and is often used. The "factoring" is a bit different from conventional OOP languages; in OOP the concept of a class plays simultaneously the role of type and module, whereas functional (and traditional procedural) languages treat these orthogonally.
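To make the type/module split concrete, here is a minimal Haskell sketch. The module name `Percent` and the functions `mkPercent` and `percentValue` are hypothetical, invented for illustration. Access control lives in the module's export list rather than in the type definition: the type is exported, but its constructor is not.

```haskell
-- Hypothetical module for illustration. Only the names in the export list
-- are visible to client code.
module Percent
  ( Percent       -- the type is exported WITHOUT its data constructor
  , mkPercent     -- the only way for clients to build a Percent
  , percentValue  -- read-only accessor
  ) where

-- The constructor `Percent` is usable inside this module only.
newtype Percent = Percent Int
  deriving (Eq, Ord, Show)

-- Smart constructor: rejects values outside 0..100.
mkPercent :: Int -> Maybe Percent
mkPercent n
  | n >= 0 && n <= 100 = Just (Percent n)
  | otherwise          = Nothing

-- Unwraps the validated value.
percentValue :: Percent -> Int
percentValue (Percent n) = n
```

A client that imports this module can name the type `Percent` and pass its values around, but can only obtain one through `mkPercent`, so an out-of-range value cannot be constructed outside the module.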

Another point is that ML/Haskell are very heavy on generics with type erasure, and that this can be used to provide a different flavor of "information hiding" than OOP encapsulation. When a component only knows the type of a data item as a type parameter, that component can be safely handed values of that type, and yet it will be prevented from doing much with them because it doesn't know and cannot know their concrete type (there's no universal instanceof or runtime casting in these languages). This blog entry is one of my favorite introductory examples to these techniques.
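As a small illustration of this kind of hiding, consider the following Haskell sketch. The function names `swapPair` and `runSteps` are made up for the example. Both functions receive values whose type they know only as a type parameter, so all they can do is move those values around or apply the operations the caller hands them.

```haskell
-- Polymorphic in `a` and `b`: this function can rearrange its inputs,
-- but it has no way to inspect, modify, or forge them.
swapPair :: (a, b) -> (b, a)
swapPair (x, y) = (y, x)

-- Threads an opaque state of type `s` through `n` steps. The only operation
-- available on `s` is the `step` function supplied by the caller. Without a
-- class constraint (such as `Show` or `Typeable`) in the signature, the
-- function cannot learn anything about the concrete type of `s`.
runSteps :: Int -> (s -> s) -> s -> s
runSteps n step s
  | n <= 0    = s
  | otherwise = runSteps (n - 1) step (step s)
```

The caller keeps full knowledge of the concrete type; the generic component works only against the type variable, which is a form of information hiding that needs no `private` keyword.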

Next: in the FP world it's very common to use transparent data structures as interfaces to opaque/encapsulated components. For example, interpreter patterns are very common in FP, where data structures are used as syntax trees that describe logic and are fed to code that "executes" them. State, properly speaking, then exists only ephemerally, while the interpreter that consumes the data structures is running. The interpreter's implementation can also change freely, as long as it still communicates with its clients in terms of the same data types.
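A minimal sketch of the interpreter pattern in Haskell might look like the following. The `Expr` type and the `eval` function are hypothetical names chosen for illustration. The syntax tree is plain, transparent data that clients build freely; the interpreter is the component whose internals can change.

```haskell
-- Transparent data: a tiny arithmetic syntax tree that clients construct
-- directly. It describes a computation but does not perform it.
data Expr
  = Lit Int
  | Add Expr Expr
  | Mul Expr Expr
  | Div Expr Expr
  deriving (Eq, Show)

-- The interpreter "executes" the tree. Its implementation may change freely
-- as long as it keeps accepting `Expr` and returning `Either String Int`.
-- Invalid combinations (here, division by zero) are rejected explicitly.
eval :: Expr -> Either String Int
eval (Lit n)   = Right n
eval (Add a b) = (+) <$> eval a <*> eval b
eval (Mul a b) = (*) <$> eval a <*> eval b
eval (Div a b) = do
  x <- eval a
  y <- eval b
  if y == 0
    then Left "division by zero"
    else Right (x `div` y)
```

For instance, `eval (Add (Lit 1) (Mul (Lit 2) (Lit 3)))` yields `Right 7`, while `eval (Div (Lit 1) (Lit 0))` yields `Left "division by zero"`. Any intermediate state exists only while `eval` runs.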

Last and longest: encapsulation/information hiding is a technique, not an end. Let's think a bit about what it provides. Encapsulation is a technique for reconciling the contract and the implementation of a software unit. The typical situation is this: the system's implementation admits of values or states that, according to its contract, should not exist.

Seen this way, FP provides, alongside encapsulation, a number of other tools that can be used to the same end:

  1. Immutability as the pervasive default. You can hand transparent data values to third-party code, which cannot modify them and so cannot put them into invalid states. (Karl's answer makes this point.)
  2. Sophisticated type systems with algebraic data types that allow you to finely control the structure of your types, without writing lots of code. By judiciously using these facilities you can often design types where "bad states" are just impossible. (Slogan: "Make illegal states unrepresentable.") Instead of using encapsulation to indirectly control the set of admissible states of a class, I'd rather just tell the compiler what those are and have it guarantee them for me!
  3. Interpreter pattern, as mentioned already. One key to designing a good abstract syntax tree type is to:
    • Try and design the abstract syntax tree data type so that all values are "valid."
    • Failing that, make the interpreter explicitly detect invalid combinations and cleanly reject them.
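As a rough sketch of point 2, suppose a contact must have an email address, a postal address, or both, but never neither. All type and function names below are hypothetical. A record with two optional fields would admit the illegal "neither" state; an algebraic data type with one constructor per legal case does not.

```haskell
newtype EmailAddress  = EmailAddress String  deriving (Eq, Show)
newtype PostalAddress = PostalAddress String deriving (Eq, Show)

-- One constructor per legal state. There is no way to write down a
-- ContactInfo value that has neither an email nor a postal address.
data ContactInfo
  = EmailOnly EmailAddress
  | PostOnly PostalAddress
  | EmailAndPost EmailAddress PostalAddress
  deriving (Eq, Show)

-- Total function: the three cases below are the only ones that exist.
emailOf :: ContactInfo -> Maybe EmailAddress
emailOf (EmailOnly e)      = Just e
emailOf (PostOnly _)       = Nothing
emailOf (EmailAndPost e _) = Just e

-- Renders a short description of the available contact methods.
describe :: ContactInfo -> String
describe (EmailOnly (EmailAddress e))  = "email " ++ e
describe (PostOnly (PostalAddress p))  = "post " ++ p
describe (EmailAndPost (EmailAddress e) (PostalAddress p)) =
  "email " ++ e ++ " and post " ++ p
```

Nothing here is hidden from clients, yet the invariant holds everywhere, because the compiler only accepts values built from one of the three constructors.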

This F# "Designing with types" series makes for pretty decent reading on some of these topics, particularly #2. (It's where the "make illegal states unrepresentable" link from above comes from.) If you look closely, you'll note that in the second part they demonstrate how to use encapsulation to hide constructors and prevent clients from constructing invalid instances. As I said above, it is part of the toolkit!