To maintain SOLID, should data preparation, conversion, and pre-computation for saving an object be separate from the data persistence layer?

Tags: layers, object-oriented, PHP, separation-of-concerns, solid

I am facing a common situation where I am saving some values from a business object into a database. I am using a relational database, and usually I only need to save a few items that are part of the object.

To clarify, my business object is composed of other sub-objects and some core variables (float, string, int). It contains user input, results of various computations, data read from the database, and some formatted data.

Before saving parts of the object to database, I first need to prepare the data of the object. Preparing can mean:

  • adding the weights of the sub-components of an assembly (e.g. the physical counterparts can be motor and pump weights)
  • converting floats to ints
  • formatting values and adding units, e.g. 8.0292542 => 8.03 mm
  • running more involved computations, which could live in other service classes, and extracting variables from the returned result

I call the above computations, conversions, formatting, etc. the "Service Layer", and the raw database SQL code the "Repository Layer".

My question is … should those layers always remain separate, or are there cases where they can be merged?

Arguments for merging Service code into the Repository

One high-level argument for merging them is that I am saving the business object into the database. I can call a single Repository layer method, and in that method I can both prepare the data and run the SQL insert. The preparation is needed only for inserting the data into the database and for no other purpose, hence it should be part of the Repository Layer.

Arguments for keeping Service code and Repository code separate

The Repository layer should concern itself with SQL-level and database-level concerns. It doesn't care, and shouldn't care, how some high-level business object is converted, formatted, computed, and formed. Hence, any conversion or manipulation of the object before saving it to the database should live in a separate class, in a separate Service Layer.

Then the final results can be passed to the Repository layer, e.g. in array form, ready for the Repository layer to bind them directly to the prepared SQL statement.
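To make that hand-off concrete, here is a minimal sketch of the separated variant. The Assembly class, its fields, the AssemblyRowPreparer and AssemblyRepository names, and the table and column names are all hypothetical, chosen only to mirror the preparation steps listed earlier. The repository only builds the statement text here; a real one would prepare and execute it against the database.

```php
<?php
declare(strict_types=1);

// Hypothetical business object; the field names are assumptions for illustration.
final class Assembly
{
    private float $motorWeight;
    private float $pumpWeight;
    private float $length;

    public function __construct(float $motorWeight, float $pumpWeight, float $length)
    {
        $this->motorWeight = $motorWeight;
        $this->pumpWeight = $pumpWeight;
        $this->length = $length;
    }

    public function getMotorWeight(): float { return $this->motorWeight; }
    public function getPumpWeight(): float { return $this->pumpWeight; }
    public function getLength(): float { return $this->length; }
}

// Service layer: knows how to turn the business object into plain values.
final class AssemblyRowPreparer
{
    /** @return array<string, int|string> */
    public function prepare(Assembly $assembly): array
    {
        return [
            // Add the sub-component weights, then convert the float to an int.
            'total_weight' => (int) round($assembly->getMotorWeight() + $assembly->getPumpWeight()),
            // Format the value and append the unit, e.g. 8.0292542 => "8.03 mm".
            'length_label' => number_format($assembly->getLength(), 2, '.', '') . ' mm',
        ];
    }
}

// Repository layer: only knows column names and SQL, not how the values were derived.
final class AssemblyRepository
{
    /**
     * @param array<string, int|string> $row
     * @return array{0: string, 1: array<string, int|string>}
     */
    public function buildInsert(array $row): array
    {
        $columns = array_keys($row);
        $placeholders = array_map(fn (string $column): string => ':' . $column, $columns);
        $sql = sprintf(
            'INSERT INTO assembly (%s) VALUES (%s)',
            implode(', ', $columns),
            implode(', ', $placeholders)
        );

        // A real repository would prepare $sql and execute it with $row as bound parameters.
        return [$sql, $row];
    }
}

$row = (new AssemblyRowPreparer())->prepare(new Assembly(12.4, 7.3, 8.0292542));
[$sql, $params] = (new AssemblyRepository())->buildInsert($row);

print $sql . PHP_EOL;
print implode(', ', $params) . PHP_EOL;
```

In this shape the repository never sees an Assembly, so a change to how weights are summed or how lengths are formatted touches only the preparer.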

Gray Areas

I could also make a hazy argument that "it depends". Are there service-related tasks that are so removed from preparation of object for insertion into database, that they indeed must be in a separate layer? And are there service-related tasks that are so close that they must be included into the Repository layer? I think here is where I need some help figuring that out as well.

Sample Code

This is a crude example, as it doesn't capture the complexity of the objects I work with, but it illustrates the above. First, suppose you have some business entity class, e.g. Point:

class Point
{
    private float $x;
    
    private float $y;
    
    public function __construct(float $x, float $y)
    {
        $this->x = $x;
        $this->y = $y;
    }
    
    public function getX(): float
    {
        return $this->x;
    }
    
    public function getY(): float
    {
        return $this->y;
    }
}

Exhibit 1: Separate Service and Repository layers

class Service
{
    public function getDistance(Point $a, Point $b): float
    {
        return sqrt(pow($b->getX() - $a->getX(), 2) + pow($b->getY() - $a->getY(), 2));
    }
}

class Repository
{
    public function saveDistance(float $distance): void
    {
        print "insert $distance into table" . PHP_EOL;
    }
}

$service = new Service();
$repository = new Repository();

$a = new Point(1, 1);
$b = new Point(2, 2);

// Here we have separate layers, and the distance is passed to the Repository layer to be saved.
$distance = $service->getDistance($a, $b);
// round() keeps the value a float; number_format() returns a string, which saveDistance(float) should not receive.
$rounded_distance = round($distance, 2);
$repository->saveDistance($rounded_distance);

Exhibit 2: Repository layer includes any Service related code

class Repository
{
    public function saveDistance(Point $a, Point $b): void
    {
        $distance = sqrt(pow($b->getX() - $a->getX(), 2) + pow($b->getY() - $a->getY(), 2));
        // Alternatively I can reuse the service as part of Repository layer
        // $distance = $this->service->getDistance($a, $b);
        $formatted_distance = number_format($distance, 2);
        print "insert $formatted_distance into table" . PHP_EOL;
    }
}

$a = new Point(1, 1);
$b = new Point(2, 2);

// here we pass business object(s) to Repository layer and do any pre-computation, conversion there
$repository = new Repository();
$repository->saveDistance($a, $b);

Here $a and $b are different objects, but in reality both points could be part of one larger object, so any interaction between the two points can be seen as an interaction within that larger object.

Another way to ask my question could be "Is it okay for the Repository Layer to depend on the Service Layer?", e.g. $repository = new Repository(new Service()); where any service-related tasks are taken care of inside the repository layer.
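For reference, that dependency could look roughly like the sketch below. It assumes a compact copy of the Point class from above and replaces the SQL insert with an in-memory array; the getSavedDistances method is hypothetical and exists only so the result can be inspected.

```php
<?php
declare(strict_types=1);

// Compact stand-in for the Point class from the question.
final class Point
{
    private float $x;
    private float $y;

    public function __construct(float $x, float $y)
    {
        $this->x = $x;
        $this->y = $y;
    }

    public function getX(): float { return $this->x; }
    public function getY(): float { return $this->y; }
}

final class Service
{
    public function getDistance(Point $a, Point $b): float
    {
        return sqrt(($b->getX() - $a->getX()) ** 2 + ($b->getY() - $a->getY()) ** 2);
    }
}

final class Repository
{
    private Service $service;

    /** @var float[] Hypothetical stand-in for the database table. */
    private array $savedDistances = [];

    // The repository receives its service from outside instead of creating it.
    public function __construct(Service $service)
    {
        $this->service = $service;
    }

    public function saveDistance(Point $a, Point $b): void
    {
        // Service-related work happens inside the repository call.
        $distance = $this->service->getDistance($a, $b);
        $this->savedDistances[] = round($distance, 2);
    }

    /** @return float[] */
    public function getSavedDistances(): array
    {
        return $this->savedDistances;
    }
}

$repository = new Repository(new Service());
$repository->saveDistance(new Point(1, 1), new Point(2, 2));

print implode(', ', $repository->getSavedDistances()) . PHP_EOL;
```

With this arrangement the Service stays usable on its own, but the Repository can no longer be constructed or tested without a Service (or a stand-in for it).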

Best Answer

Whether your repo and your service layer should be separated is not something you can decide just by looking at the code and what it does. You decide it by asking yourself whether the layers are ever going to be maintained, tested, evolved, used, or generated separately, or whether that is quite unlikely to happen.

Some examples:

  • when you expect the service functions to be maintained by one team, whilst the repo functions are maintained by another, or

  • when your repository functions can be generated from some high-level metadata, but your service layer must be written manually, or

  • when you expect different repositories for different types of storage, all of which will use the same service functions, or

  • when your services are so complex that it makes sense to test them in isolation, without the repo involved, or vice versa, or

  • when you want to use or reuse service functions (for example, getDistance) without creating a dependency on the repository functions,

then it is better to separate the services and repos into different layers. If not, you can most probably save the hassle and leave everything in one layer. Note this is not necessarily an either-or decision: after looking at the list above, you may decide to put some of your current service functions into a separate layer and others into the repository, because those are required exclusively by the persistence mechanism and are tightly coupled to it.
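As a rough illustration of two of those cases (several kinds of storage sharing one service function, and a service that can be exercised without any repository), consider the sketch below. The DistanceStore interface, both store classes, and the coordinate-based getDistance signature are assumptions made for this example, not something taken from the question's code.

```php
<?php
declare(strict_types=1);

// Hypothetical storage abstraction; the names are assumptions for illustration.
interface DistanceStore
{
    public function saveDistance(float $distance): void;
}

// Stand-in for a relational table.
final class InMemoryDistanceStore implements DistanceStore
{
    /** @var float[] */
    public array $rows = [];

    public function saveDistance(float $distance): void
    {
        $this->rows[] = $distance;
    }
}

// Stand-in for a second kind of storage, e.g. a line-based export.
final class TextLineDistanceStore implements DistanceStore
{
    /** @var string[] */
    public array $lines = [];

    public function saveDistance(float $distance): void
    {
        $this->lines[] = 'distance=' . number_format($distance, 2, '.', '');
    }
}

// Service with no knowledge of any store, so it can be tested and reused alone.
final class DistanceService
{
    public function getDistance(float $ax, float $ay, float $bx, float $by): float
    {
        return sqrt(($bx - $ax) ** 2 + ($by - $ay) ** 2);
    }
}

$service = new DistanceService();
$distance = $service->getDistance(0.0, 0.0, 3.0, 4.0); // sides 3 and 4 of a right triangle

// The same computed value goes to every store; none of them knows how it was derived.
$stores = [new InMemoryDistanceStore(), new TextLineDistanceStore()];
foreach ($stores as $store) {
    $store->saveDistance($distance);
}
```

If none of this flexibility is ever needed, the interface and the second class are pure overhead, which is the trade-off described above.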

Bob Martin himself once wrote:

"And this gets to the crux of the Single Responsibility Principle. This principle is about people."

This is essentially what I am trying to say here: separating responsibilities (for example, by layering) is not an end in itself, it is a means to an end. You separate two things if you need to maintain or evolve them separately. But if the functionality is so tightly coupled that separation makes changes harder and more error-prone, then it will probably be better not to separate.