Encapsulation – Is It Necessary If Immutability Is Ensured?

encapsulation, functional programming, immutability, object-oriented

Encapsulation

In object-oriented programming (OOP), encapsulation refers to the
bundling of data with the methods that operate on that data, or the
restricting of direct access to some of an object's components.
Encapsulation is used to hide the values or state of a structured data
object inside a class, preventing unauthorized parties' direct access
to them.
Wikipedia – Encapsulation (Computer Programming)

Immutability

In object-oriented and functional programming, an immutable object (unchangeable object) is an object whose state cannot be modified after it is created.
Wikipedia – Immutable object

If you can guarantee immutability, do you need to think about encapsulation?

I have seen these concepts being used in explaining ideas in object-oriented programming (OOP) and functional programming (FP).

I tried to research encapsulation, immutability, and how they relate to one another. I couldn't find a post that explicitly asked whether encapsulation is guaranteed if you have immutability.

Please correct me if I have misunderstood anything about encapsulation or immutability. I wish to understand these concepts better. Also, please point me to any other posts on the topic that answer the question above.

Best Answer

The question

Casting your question to real life:

Is it okay for your doctor to post your private medical records publicly to Facebook, provided no one (other than you) is able to change them?

Is it okay for me to let strangers into your house, provided they can't steal or damage anything?

It's asking the same thing. The core assumption of your question is that the only concern with exposing data is that it can be changed. As per your reference material:

Encapsulation is used to hide the values or state of a structured data object inside a class, preventing unauthorized parties' direct access to them.

The ability to change values or state is definitely the biggest concern, but it's not the only concern. "Direct access" entails more than just write access. Read access can be a source of weakness as well.

A simple example here is that you are generally advised not to show stack traces to an end user. Not just because errors shouldn't occur, but because stack traces sometimes reveal specific implementations or libraries, which tells an attacker about the internal structure of your system.

The exception stack trace is read-only, but it can be of use to those who wish to attack your system.
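
As a rough sketch of that advice (the ErrorReporter class and its ToUserMessage method are hypothetical names, not part of any library), the full exception can go to an internal log while the end user only sees a generic message:

```csharp
using System;

public static class ErrorReporter
{
    // Hypothetical helper: returns the only text an end user gets to see.
    public static string ToUserMessage(Exception ex, Action<string> log)
    {
        // Full details (type, message, stack trace) go to the internal log only.
        log(ex.ToString());

        // The user-facing message reveals nothing about the internals.
        return "Something went wrong. Please try again later.";
    }
}
```

The log callback is an assumption made to keep the sketch self-contained; in a real application it would be whatever logging facility you already use.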

Edit
Due to confusion raised in the comments: the examples given here are not intended to suggest that encapsulation is used for data protection (such as medical records).
This part of the answer has so far only addressed the core assumption that your question is built upon, i.e. that read access without write access is harmless. I believe that assumption to be incorrect, hence the simplified counterexamples.


Encapsulation as a safety guard

Additionally, in order to prevent write access, you would need to have immutability all the way to the bottom. Take this example:

public class Level1
{
    public string MyValue { get; set; }
}

public class Level2 // immutable
{
    public readonly Level1 Level1;

    public Level2(Level1 level1) { Level1 = level1; }
}

public class Level3 // immutable
{
    public readonly Level2 Level2;

    public Level3(Level2 level2) { Level2 = level2; }
}

We've let Level2 and Level3 expose their readonly fields, which is doing what your question is asserting to be safe: read access, no write access.

And yet, as a consumer of a Level3 object, I can do this:

// fetch the object - this is allowed behavior
var myLevel3 = ...; 

// but this wasn't the intention!
myLevel3.Level2.Level1.MyValue = "SECRET HACK ATTACK!";

This code compiles and runs perfectly fine, because read access on a field (e.g. myLevel3.Level2) gives you access to an object (Level2) that in turn exposes read access to another object (Level1), which in turn exposes read and write access to its MyValue property.

And this is the danger of brazenly making everything immutably public. Any mistake will be visible and become an open door for unwanted behavior. By needlessly exposing some things that could easily have been hidden, you have opened them up to scrutiny and abuse of weakness if any exists.

Edit
Caleth mentioned that a class is not immutable if it exposes something that itself is not immutable. I think that this is a semantic argument. Level2's fields are readonly, which ostensibly makes it immutable.

To be fair, if the Law of Demeter had been followed in my example, the issue wouldn't have been as glaring, since Level2 wouldn't expose direct access to Level1 (but that would have hidden the issue I was trying to highlight). The point of the matter is that it's a fool's errand to try to ensure the immutability of an entire codebase. If someone makes one adjustment in a single class (that a lot of other classes depend on in some way), that could lead to an entire assembly's worth of classes becoming mutable without anyone noticing.

You can argue that this issue is caused by a lack of encapsulation or by not following the Law of Demeter. Both contribute to it. But regardless of what you attribute it to, the fact remains that this is unmistakably a problem in the codebase.
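
For contrast, here is a minimal sketch of the same three classes with the inner objects encapsulated. The forwarding MyValue properties on Level2 and Level3 are my own addition for illustration:

```csharp
public class Level1
{
    public string MyValue { get; set; }
}

public class Level2
{
    // Private: consumers can no longer reach the mutable Level1 object.
    private readonly Level1 _level1;

    public Level2(Level1 level1) { _level1 = level1; }

    // Only the value is exposed, and only for reading.
    public string MyValue => _level1.MyValue;
}

public class Level3
{
    private readonly Level2 _level2;

    public Level3(Level2 level2) { _level2 = level2; }

    public string MyValue => _level2.MyValue;
}
```

A consumer of Level3 can still read myLevel3.MyValue, but assigning to it no longer compiles, because the property has no setter. Note that whoever created the Level1 object and still holds a reference to it can keep mutating it. If that matters, Level2 would have to copy the value in its constructor rather than keep the reference.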


Encapsulation for clean code

But that's not all you use encapsulation for.

Suppose my application wants to know the time, so I make a Calendar which tells me the date. Currently, I read this date as a string from a file (let's assume there is a good reason for this).

public class Calendar
{
    public readonly string fileContent; // e.g. "2020-01-28"

    public DateTime Date => DateTime.Parse(fileContent);

    public Calendar()
    {
        fileContent = File.ReadAllText("C:\\Temp\\calendar.txt");
    }
}

fileContent should have been an encapsulated field, but I've opened it up because of your suggestion. Let's see where that takes us.

Our developers have been using this calendar. Let's look at Bob's library and John's library:

public class BobsLibrary
{
    // ...

    public void WriteToFile(string content)
    {
        var filename = _calendar.fileContent + ".txt"; // timestamp in filename
        var filePath = $"C:\\Temp\\{filename}";

        File.WriteAllText(filePath, content);
    }
}

Bob has used Calendar.fileContent, the field that should've been encapsulated, but wasn't. But his code works and the field was public after all, so there's no issue right now.

public class JohnsLibrary
{
    // ...

    public void WriteToFile(string content)
    {
        var filename = _calendar.Date.ToString("yyyy-MM-dd") + ".txt"; // timestamp in filename
        var filePath = $"C:\\Temp\\{filename}";

        File.WriteAllText(filePath, content);
    }
}

John has used Calendar.Date, the property that should always be exposed. At first glance, you'd think John is doing unnecessary work by converting the string to a DateTime and back to a string. But his code does work, so no issue is raised.

Today, we have learned something that will save us a lot of money: you can get the current date from the internet! We no longer have to hire an intern to update our calendar file every midnight. Let's change our Calendar class accordingly:

public class Calendar
{
    public DateTime Date { get; }

    public Calendar()
    {
        Date = GetDateFromTheInternet("http://www.whatistodaysdate.com");
    }
}

Bob's code has broken! He no longer has access to the fileContent, since we're no longer parsing our date from a string.

John's code, however, has kept working and does not need to be updated. John used Date, the intended public contract for the calendar. John did not build his code to rely on implementation details (i.e. the fileContent from which we parsed the date in the past), and therefore his code can effortlessly handle changes to the implementation.

This is why encapsulation matters. It allows you to decouple your consumers (Bob, John) from your implementation (the calendar file) by having an intermediary interface (the DateTime Date property). As long as the intermediary interface is untouched, you can change the implementation without affecting the consumers.

My example is a bit simplified; you'd more likely use an interface here and swap out the concrete class that implements it for another class that implements the same interface. But the issue I pointed out remains the same.
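
A minimal sketch of that interface-based variant might look like the following. ICalendar, TextCalendar, FixedCalendar and FileNames are hypothetical names I made up for illustration. To keep the sketch self-contained, the text-based calendar takes its content through the constructor instead of reading a file:

```csharp
using System;
using System.Globalization;

// The public contract: consumers depend on this and nothing else.
public interface ICalendar
{
    DateTime Date { get; }
}

// Text-backed implementation; the raw content stays private.
public class TextCalendar : ICalendar
{
    private readonly string _content; // e.g. "2020-01-28"

    public TextCalendar(string content) { _content = content; }

    public DateTime Date =>
        DateTime.ParseExact(_content, "yyyy-MM-dd", CultureInfo.InvariantCulture);
}

// Replacement implementation, e.g. fed by a web service or the system clock.
public class FixedCalendar : ICalendar
{
    public FixedCalendar(DateTime date) { Date = date; }

    public DateTime Date { get; }
}

public static class FileNames
{
    // John-style consumer: works with any ICalendar implementation.
    public static string ForToday(ICalendar calendar) =>
        calendar.Date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture) + ".txt";
}
```

FileNames.ForToday gives the same file name whether it receives new TextCalendar("2020-01-28") or new FixedCalendar(new DateTime(2020, 1, 28)), so swapping the implementation does not touch the consumer. Because the raw text is private, a Bob-style dependency on it cannot be written in the first place.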