Repository Data Return – Best Practices in ASP.NET

Tags: asp.net, dto, onion-architecture, repository

I have a simple project where the controller calls the service, and the service calls the repository in order to get the needed data.

Assuming that we have these domain models:

// this model has a RepositoryA built for it
class A {
   public string Name { get; set; }
   public int B_Id { get; set; }
}

// this model has a RepositoryB built for it
class B {
   public string Age { get; set; }
}

And this DTO:

class MyDto {
   public string Name { get; set; }
   public string Age { get; set; }
}

Knowing (correct me if I'm wrong) that a repository shouldn't return a DTO, and that the query must be done in RepositoryA, where we have to select both models A and B to build a result that mixes their properties (a complex request with multiple joins on different tables in the real project): what model should the repository return, if not the DTO, given that the DTO is already prepared to receive that mix of properties?

Best Answer

"And knowing (correct me if I'm wrong) that a repository shouldn't return a DTO"

Theoretically, every layer (= project in your solution) should have its own DTO objects. In that sense, your repositories should return a DTO, but this is not the same DTO as the "business logic DTO".

However, in reality, we don't need that much separation. The benefits do not outweigh the effort. In most cases, it suffices to have a single entity-DTO conversion, which tends to happen in the business layer (BLL), which is one layer above the repository layer (DAL).
In that sense, your repositories should not return a DTO. But your business classes should.
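A minimal sketch of that single conversion point, using the A, B and MyDto classes from the question. The repository and service classes, the in-memory lists standing in for a database, and the `Id` property on B are all assumptions made for illustration. Note that it fetches B once per A, which is exactly the per-repository cost discussed in the rest of this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Entities from the question. B.Id is an assumption: A.B_Id needs a key to point at.
public class A { public string Name { get; set; } public int B_Id { get; set; } }
public class B { public int Id { get; set; } public string Age { get; set; } }

// The DTO from the question.
public class MyDto { public string Name { get; set; } public string Age { get; set; } }

// Hypothetical repositories (DAL): they hand back entities, never DTOs.
public class ARepository
{
    private readonly List<A> _rows;
    public ARepository(List<A> rows) { _rows = rows; }
    public IEnumerable<A> GetAll() => _rows;
}

public class BRepository
{
    private readonly List<B> _rows;
    public BRepository(List<B> rows) { _rows = rows; }
    public B GetById(int id) => _rows.FirstOrDefault(b => b.Id == id);
}

// Hypothetical business class (BLL): the single place where entities become DTOs.
public class MyService
{
    private readonly ARepository _aRepository;
    private readonly BRepository _bRepository;

    public MyService(ARepository aRepository, BRepository bRepository)
    {
        _aRepository = aRepository;
        _bRepository = bRepository;
    }

    public List<MyDto> GetAll() =>
        _aRepository.GetAll()
            .Select(a => new MyDto
            {
                Name = a.Name,
                // One lookup per A: simple to read, but not efficient against a real database.
                Age = _bRepository.GetById(a.B_Id)?.Age
            })
            .ToList();
}
```

The controller only ever sees MyDto, and the repositories only ever see A and B, so the mapping code has exactly one home.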

"and that the query must be done in a RepositoryA where we have to select both models A & B in order to build a return that contains a mix of properties (complex request with multiple joins on different tables in the real project)"

The issue here is one of realism. The theory simply can't be put into practice.

Theoretically, repositories are designed to be entity-type-specific data providers. In other words, you get A from the ARepository and B from the BRepository.

However, when dealing with external storage resources (database server), the effort needed to retrieve data is non-negligible. This creates an issue for us. For every entity type we wish to retrieve, we need a separate call via its own repository.

Sidenote: If you're either not dealing with an external data resource or you're not trying to run single queries which will return objects of multiple types, the continuation of this answer is irrelevant to you, as you'll be perfectly happy with getting each type from its own repository.

Once you're dealing with more than one entity type in a request, separating these into separate retrieval queries is counterproductive. While it does create a clean code structure, it dramatically impacts performance. This becomes doubly egregious when you realize that a SQL database server is specifically optimized for joining (by using indexes), but the code that calls the database (repository) is somehow incapable of implementing that same graceful data mixing approach.

Using repositories has become an antipattern here. The obstacles you encounter are caused by us deciding to use repositories. And a "solution" that creates a bigger obstacle than it aims to solve is not a solution.

A unit of work solves a lot of the problems that repositories introduce in terms of transactional safety (having all repositories use the same context).

However, a unit of work does not help with deciding where to write a complex data query (ARepository? BRepository? ...).
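To make the "same context" idea concrete, here is a rough sketch of a unit of work. Everything in it is hypothetical: the `InMemoryContext` class stands in for a real database context or session, and the `Car` and `Person` entities are placeholders. The point is only that every repository writes into one shared context and nothing is persisted until a single commit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a database context shared by all repositories.
public class InMemoryContext
{
    public List<object> Saved { get; } = new List<object>();    // "committed" rows
    public List<object> Pending { get; } = new List<object>();  // staged, not yet committed
}

// Generic repository that only stages changes in the shared context.
public class Repository<T> where T : class
{
    private readonly InMemoryContext _context;
    public Repository(InMemoryContext context) { _context = context; }
    public void Add(T entity) => _context.Pending.Add(entity);
}

public class Car { public string Plate { get; set; } }
public class Person { public string Name { get; set; } }

// The unit of work owns the context and hands the same instance to every repository.
public class UnitOfWork
{
    private readonly InMemoryContext _context = new InMemoryContext();

    public Repository<Car> Cars { get; }
    public Repository<Person> People { get; }

    public UnitOfWork()
    {
        Cars = new Repository<Car>(_context);
        People = new Repository<Person>(_context);
    }

    public int SavedCount => _context.Saved.Count;

    // Persists everything staged by any repository in one step; returns how many items were saved.
    public int Commit()
    {
        int count = _context.Pending.Count;
        _context.Saved.AddRange(_context.Pending);
        _context.Pending.Clear();
        return count;
    }

    // Discards everything staged since the last commit.
    public void Rollback() => _context.Pending.Clear();
}
```

Saving a car and its owner then becomes `uow.Cars.Add(...)`, `uow.People.Add(...)`, `uow.Commit()`: either both are persisted or neither is.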

This is a tough choice. In a way, repositories are really nice, especially in one-entity-type contexts (e.g. simple CRUD functionality). On the other hand, they massively complicate complex data retrieval.

I haven't encountered a universally agreed upon solution for this. But I have worked on several projects where one (or more) agreements were struck to at least keep the location of the code (especially for complex data queries) somewhat sensible.

It's always good to implement a unit of work.

This is just a list of agreements I've encountered over several projects I've worked on. I can't really put one over the other.

1. Complex data queries are put into the repository of their main entity type.
   - A query which retrieves a list of cars and includes their owners belongs to the CarRepository.
   - A query which retrieves a list of people and includes the cars they own belongs to the PersonRepository.
   - Pro: You can keep the old repository structure but at least make it less of a guessing game where to put the code.
   - Con: There are fringe cases where there is no clear "main" entity type.
2. Complex data queries are put into their own repository. "Old" repositories that focus on a single entity type should only be used in CRUD operations.
   - If you have a method that saves A and B objects, you should have an ABRepository which internally uses ARepository and BRepository (a unit of work is incredibly important here!).
   - Pro: It separates the CRUD logic from the reporting logic.
   - Con: If you have many combinations (AB, AC, ABC, ACD, ...), the list of repositories is going to grow out of bounds.
   - The suggestion to prevent the "con" is to name these repositories after their function (YearlyReportsRepository) and not just their aggregate list of entity types (PersonCarRepository).

"What model should the repository return if not the DTO?"

If you choose option 1, that question has an easy answer.

"A query which retrieves a list of cars and includes their owners belongs to the CarRepository."

In other words, it returns an IEnumerable<Car> and therefore still follows the idea that "a CarRepository returns cars" (and possibly some related entities, but they are not explicitly part of the return type).

"A query which retrieves a list of people and includes the cars they own belongs to the PersonRepository."

In other words, it returns an IEnumerable<Person> and therefore still follows the idea that "a PersonRepository returns people" (and possibly some related entities, but they are not explicitly part of the return type).
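As a rough illustration of option 1, the sketch below has a CarRepository that still returns cars, with the owner attached as a navigation property. The `Car` and `Person` shapes, the `GetCarsWithOwners` method name and the in-memory lists are assumptions; with an ORM such as Entity Framework, the join would typically be expressed as an eager-loading `Include` instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person { public int Id { get; set; } public string Name { get; set; } }

public class Car
{
    public string Plate { get; set; }
    public int OwnerId { get; set; }
    public Person Owner { get; set; }  // related entity, filled in by the repository
}

// Hypothetical repository; the two lists stand in for database tables.
public class CarRepository
{
    private readonly List<Car> _cars;
    private readonly List<Person> _people;

    public CarRepository(List<Car> cars, List<Person> people)
    {
        _cars = cars;
        _people = people;
    }

    // The return type is still "cars"; owners ride along inside each Car.
    public IEnumerable<Car> GetCarsWithOwners() =>
        _cars.Join(
                _people,
                car => car.OwnerId,
                person => person.Id,
                (car, person) => new Car { Plate = car.Plate, OwnerId = car.OwnerId, Owner = person })
            .ToList();
}
```

The caller gets both entity types from one query, yet the signature still says nothing more than "this repository returns cars".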

If you choose option 2, then you're implicitly arguing that a complex data report is something different from a simple CRUD operation. This means that you'd probably end up returning custom classes (CarOwnerResult) from the custom repositories (= not bound to a single entity type).

Note that, if you want to, simple repositories (bound to a single entity type) can still be expected to return only their bound entity type, just like before.
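For option 2, a sketch might look like the following. `CarOwnerResult` is the custom class mentioned above; its properties, the `CarOwnershipReportRepository` name and the in-memory lists are invented for the example. The repository is named after what it reports, not after the entity types it touches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person { public int Id { get; set; } public string Name { get; set; } }
public class Car { public string Plate { get; set; } public int OwnerId { get; set; } }

// Custom result class: a flat row shaped for the report, not an entity.
public class CarOwnerResult
{
    public string Plate { get; set; }
    public string OwnerName { get; set; }
}

// Hypothetical function-named repository, not bound to a single entity type.
public class CarOwnershipReportRepository
{
    private readonly List<Car> _cars;
    private readonly List<Person> _people;

    public CarOwnershipReportRepository(List<Car> cars, List<Person> people)
    {
        _cars = cars;
        _people = people;
    }

    // One joined query, returning the mixed projection directly.
    public IEnumerable<CarOwnerResult> GetCarOwners() =>
        (from car in _cars
         join person in _people on car.OwnerId equals person.Id
         select new CarOwnerResult { Plate = car.Plate, OwnerName = person.Name })
        .ToList();
}
```

Applied to the question, this is the variant where the A/B join lives in its own repository and returns a result class of its own, which the business layer can then map to MyDto.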

Related Solutions

C# – Confused on how to properly employ a Repository Pattern with Service/Business Layer on top

How large is your application? There's a good chance you're overthinking this. Unless the application is a large, enterprise-grade application, the high degree of loose coupling you are advocating is probably unnecessary.

If you decide that this level of loose coupling is still necessary, create a service layer that returns "service" objects. This fulfills a similar function to View Models in MVC. You will then write code in your service layer that maps objects from your domain model to your service model.

Note that part of your struggle may be due to the fact that, while your Data Repository is returning CRUD objects, your Service Layer should be returning the result of actions. For example:

public InvoiceView GetInvoice(int invoiceID);

returns an InvoiceView object containing whatever data you wish to expose to the public, including Name, Address, Line Items and so forth, from several different tables/objects in your domain.

public class InvoiceView
{
    public int InvoiceID;
    public Address ShippingAddress;
    public Address BillingAddress;
    public List<LineItem> LineItems;
    // ...
}

Similarly, there will be Service Layer methods that simply perform an action and return an object indicating the result of the action:

public TransactionResult Transfer(
    int sourceAccountID, int targetAccountID, Money amount, ValidationToken token);

IMPORTANT: You'll never be able to completely decouple from your data. Your consumer will always have to have some knowledge of the data, even if it's just an object ID or UUID.

Repository Pattern – Best Data Type for Gateway Return to Avoid Refactoring

Ideally, you only want your Gateway class to know what persistence back-end it is talking to, and all the other parts should be agnostic to the back-end. Unfortunately, you are finding that, given this particular implementation, the Factory class cannot be agnostic because it needs to know what type of object it is getting.

You mentioned that you can't just return a DataTable, etc. because a non-database persistence back-end wouldn't use these structures. Can we instead pass an agnostic representation out of the Gateway? Let's look at some options:

- Return a DataTable. If you are only going to use database-backed persistence, just do this. It handles a lot of the details for you while being generic across database engines. You don't want to do this because you are worried that a non-database back-end doesn't use this.
- Return a simple, custom DTO object from the Gateway. For the IBlogGateway, it might return a BlogDTO object with just data members. This means more classes, but you can return an object with all the correct types already in place. This also creates a dependency on the DTO class in all of the other classes, but this dependency is reasonable since they all already deal with Blog and BlogDTO will mirror that dependency.
- Return a generic container that can be created from any type of persistence. For example, we can convert the DataTable to an IEnumerable<Dictionary<string, object>>, where the enumerable holds the database rows (or web service results, etc.) represented as dictionaries mapping keys to values. We lose some type information since all the values are objects, but DataRow has a similar limitation that seems otherwise acceptable. This is comparable with the way dynamic languages return results (such as PHP returning an array of associative arrays).
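The third option can be sketched in a few lines. The `GatewayResults` class and `ToRows` method are hypothetical names; the conversion itself only uses the standard `DataTable`, `DataRow` and `DataColumn` types from `System.Data`.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper a database-backed gateway could use before returning results.
public static class GatewayResults
{
    // Converts each DataRow into a column-name -> value dictionary.
    // Database NULLs are kept as DBNull.Value; this sketch does not translate them.
    public static IEnumerable<Dictionary<string, object>> ToRows(DataTable table)
    {
        var columns = table.Columns.Cast<DataColumn>().ToList();

        return table.Rows.Cast<DataRow>()
            .Select(row => columns.ToDictionary(column => column.ColumnName, column => row[column]))
            .ToList();
    }
}
```

A web-service-backed gateway could build the same list of dictionaries from its own response format, so the Factory only ever depends on the dictionary shape and casts values as it reads them.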

I want to also point out (as @Ewan noted in a comment to the question) that the Gateway, Factory, and Repository don't strictly need to be distinct classes/interfaces. A repository can fill the role of all three, especially when starting out. Later, we can refactor out parts as needed. Even if the gateway is factored out, often the factory and repository can be the same class since they have closely related responsibilities (creating objects of a type).

Conversely, separating the classes out provides the opportunity for composition, allowing (for example) the gateway to vary in isolation from the rest of the code. Different gateway implementations, corresponding to different back-ends, can be composed in at runtime (and test time) for greater flexibility.
