How large is your application? There's a good chance you're overthinking this. Unless the application is a large, enterprise-grade application, the high degree of loose coupling you are advocating is probably unnecessary.
If you decide that this level of loose coupling is still necessary, create a service layer that returns "service" objects. This fulfills a similar function to View Models in MVC. You will then write code in your service layer that maps objects from your domain model to your service model.
Note that part of your struggle may be due to the fact that, while your Data Repository is returning CRUD objects, your Service Layer should be returning the result of actions. For example:
public InvoiceView GetInvoice(int invoiceID);
returns an InvoiceView object containing whatever data you wish to expose to the public, including Name, Address, Line Items and so forth, from several different tables/objects in your domain.
public class InvoiceView
{
    public int InvoiceID;
    public Address ShippingAddress;
    public Address BillingAddress;
    public List<LineItem> LineItems;
    ...
}
Similarly, there will be Service Layer methods that simply perform an action and return an object indicating the result of the action:
public TransactionResult Transfer(
    int sourceAccountID, int targetAccountID, Money amount, ValidationToken token);
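To make the mapping step concrete, here is a minimal sketch of a service method that builds an InvoiceView from domain objects. Everything except InvoiceView, GetInvoice, Address and LineItem is an assumption made up for illustration (the Invoice and Customer entities, IInvoiceRepository, InvoiceService, and all field names on the entities), and the members elided with "..." above are left out.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain entities; the real ones come from your own domain model.
public class Address { public string Street; public string City; }
public class LineItem { public string Description; public decimal Price; }
public class Customer { public Address ShippingAddress; public Address BillingAddress; }
public class Invoice { public int ID; public Customer Customer; public List<LineItem> Items; }

// The service model exposed to consumers (subset of the class shown above).
public class InvoiceView
{
    public int InvoiceID;
    public Address ShippingAddress;
    public Address BillingAddress;
    public List<LineItem> LineItems;
}

// Hypothetical repository interface that returns CRUD-style domain objects.
public interface IInvoiceRepository
{
    Invoice GetByID(int invoiceID);
}

public class InvoiceService
{
    private readonly IInvoiceRepository _invoices;

    public InvoiceService(IInvoiceRepository invoices)
    {
        _invoices = invoices;
    }

    // Maps the domain model onto the service model.
    // Consumers only ever see InvoiceView, never Invoice or Customer.
    public InvoiceView GetInvoice(int invoiceID)
    {
        Invoice invoice = _invoices.GetByID(invoiceID);
        return new InvoiceView
        {
            InvoiceID = invoice.ID,
            ShippingAddress = invoice.Customer.ShippingAddress,
            BillingAddress = invoice.Customer.BillingAddress,
            LineItems = invoice.Items.ToList() // copy, so callers cannot mutate the entity's list
        };
    }
}
```

The point of the sketch is only that the mapping lives in one place: if the domain model changes, GetInvoice changes, and consumers of InvoiceView do not.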
IMPORTANT: You'll never be able to completely decouple from your data. Your consumer will always have to have some knowledge of the data, even if it's just an object ID or UUID.
Ideally, you only want your Gateway class to know what persistence back-end it is talking to, and all the other parts should be agnostic to the back-end. Unfortunately, you are finding that, given this particular implementation, the Factory class cannot be agnostic because it needs to know what type of object it is getting.
You mentioned that you can't just return a DataTable, etc. because a non-database persistence back-end wouldn't use these structures. Can we instead pass an agnostic representation out of the Gateway? Let's look at some options:
Return a DataTable. If you are only going to use database-backed persistence, just do this. It handles a lot of the details for you while being generic across database engines. Your concern, as stated, is that a non-database back-end would not use this structure.
Return a simple, custom DTO object from the Gateway. For the IBlogGateway, it might return a BlogDTO object with just data members. This means more classes, but you can return an object with all the correct types already in place. This also creates a dependency on the DTO class in all of the other classes, but this dependency is reasonable since they all already deal with Blog, and BlogDTO will mirror that dependency.
Return a generic container that can be created from any type of persistence. For example, we can convert the DataTable to an IEnumerable<Dictionary<string, object>>, where the enumerable holds the database rows (or web service results, etc.) represented as Dictionary instances mapping keys to values. We lose some type information since all the values are typed as object, but DataRow has a similar limitation that seems otherwise acceptable. This is comparable with the way dynamic languages return results (such as PHP returning an array of associative arrays).
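For the generic-container option, the conversion could look roughly like the sketch below. The extension class DataTableExtensions and the method name ToRows are made up for illustration; DataTable, DataRow and DataColumn are the standard System.Data types.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper; the class and method names are not from the question.
public static class DataTableExtensions
{
    // Turns each DataRow into a dictionary keyed by column name.
    // Values stay typed as object, and database NULLs arrive as DBNull.Value.
    public static IEnumerable<Dictionary<string, object>> ToRows(this DataTable table)
    {
        foreach (DataRow row in table.Rows)
        {
            yield return table.Columns
                .Cast<DataColumn>()
                .ToDictionary(column => column.ColumnName, column => row[column]);
        }
    }
}
```

A database-backed gateway would call table.ToRows() before returning, while a web-service gateway could build the same dictionaries from its own response format, so the factory only ever sees the agnostic shape.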
I want to also point out (as @Ewan noted in a comment to the question) that the Gateway, Factory, and Repository don't strictly need to be distinct classes/interfaces. A repository can fill the role of all three, especially when starting out. Later, we can refactor out parts as needed. Even if the gateway is factored out, often the factory and repository can be the same class since they have closely related responsibilities (creating objects of a type).
Conversely, separating the classes out provides the opportunity for composition, allowing (for example) the gateway to vary in isolation from the rest of the code. Different gateway implementations, corresponding to different back-ends, can be composed in at runtime (and test time) for greater flexibility.
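As a rough sketch of that composition, assuming the DTO option from above: only IBlogGateway, BlogDTO and Blog are names from the discussion. The members of Blog and BlogDTO, the LoadAll method, InMemoryBlogGateway and BlogRepository are all hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Plain data carrier returned by every gateway (members are assumed).
public class BlogDTO { public int ID; public string Title; }

// Domain object (members are assumed).
public class Blog
{
    public int ID { get; }
    public string Title { get; }
    public Blog(int id, string title) { ID = id; Title = title; }
}

// The only abstraction that knows about a persistence back-end.
public interface IBlogGateway
{
    IEnumerable<BlogDTO> LoadAll();
}

// One possible back-end: in-memory data, convenient at test time.
public class InMemoryBlogGateway : IBlogGateway
{
    private readonly List<BlogDTO> _rows;
    public InMemoryBlogGateway(IEnumerable<BlogDTO> rows) { _rows = rows.ToList(); }
    public IEnumerable<BlogDTO> LoadAll() => _rows;
}

// Repository that also plays the factory role: it turns DTOs into Blog objects
// and stays agnostic about which gateway was composed in.
public class BlogRepository
{
    private readonly IBlogGateway _gateway;
    public BlogRepository(IBlogGateway gateway) { _gateway = gateway; }

    public IReadOnlyList<Blog> GetAll() =>
        _gateway.LoadAll().Select(dto => new Blog(dto.ID, dto.Title)).ToList();
}
```

Swapping the back-end then means passing a different IBlogGateway to the BlogRepository constructor, for example a database-backed one in production and InMemoryBlogGateway in tests.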
Best Answer
Theoretically, every layer (= project in your solution) should have its own DTO objects. In that sense, your repositories should return a DTO, but this is not the same DTO as the "business logic DTO".
However, in reality, we don't need that much separation. The benefits do not outweigh the effort. In most cases, it suffices to have a single entity-DTO conversion, which tends to happen in the business layer (BLL), which is one layer above the repository layer (DAL).
In that sense, your repositories should not return a DTO. But your business classes should.
The issue here is one of realism. The theory simply can't be put into practice.
Theoretically, repositories are designed to be entity-type-specific data providers. In other words, you get A from the ARepository and B from the BRepository.
However, when dealing with external storage resources (database server), the effort needed to retrieve data is non-negligible. This creates an issue for us. For every entity type we wish to retrieve, we need a separate call via its own repository.
Sidenote: If you're either not dealing with an external data resource or you're not trying to run single queries which will return objects of multiple types, the continuation of this answer is irrelevant to you, as you'll be perfectly happy with getting each type from its own repository.
Once you're dealing with more than one entity type in a request, separating these into separate retrieval queries is counterproductive. While it does create a clean code structure, it dramatically impacts performance. This becomes doubly egregious when you realize that a SQL database server is specifically optimized for joining (by using indexes), but the code that calls the database (repository) is somehow incapable of implementing that same graceful data mixing approach.
Using repositories has become an antipattern here. The obstacles you encounter are caused by us deciding to use repositories. And a "solution" that creates a bigger obstacle than it aims to solve is not a solution.
A unit of work solves a lot of the problems that repositories introduce in terms of transactional safety (having all repositories use the same context).
However, a unit of work does not help with deciding where to write a complex data query (ARepository? BRepository? ...).
This is a tough choice. In a way, repositories are really nice, especially in one-entity-type contexts (e.g. simple CRUD functionality). On the other hand, they massively complicate complex data retrieval.
I haven't encountered a universally agreed upon solution for this. But I have worked on several projects where one (or more) agreements were struck to at least keep the location of the code (especially for complex data queries) somewhat sensible.
It's always good to implement a unit of work.
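For readers who haven't used one, here is a minimal in-memory sketch of the idea that all repositories share one context and commit together. DataContext, the string-based change list, and the A/B property names are stand-ins invented for illustration. In a real project the context would typically be a database connection, transaction, or ORM context.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a shared database context.
public class DataContext
{
    public List<string> PendingChanges { get; } = new List<string>();
    public List<string> Committed { get; } = new List<string>();
}

public class ARepository
{
    private readonly DataContext _context;
    public ARepository(DataContext context) { _context = context; }

    // Stages a change; nothing is persisted until the unit of work commits.
    public void Add(string name) => _context.PendingChanges.Add("A:" + name);
}

public class BRepository
{
    private readonly DataContext _context;
    public BRepository(DataContext context) { _context = context; }

    public void Add(string name) => _context.PendingChanges.Add("B:" + name);
}

public class UnitOfWork
{
    private readonly DataContext _context = new DataContext();

    public ARepository A { get; }
    public BRepository B { get; }

    public UnitOfWork()
    {
        // Both repositories receive the same context instance.
        A = new ARepository(_context);
        B = new BRepository(_context);
    }

    // Persists everything staged by either repository in one step
    // and returns the number of changes that were committed.
    public int Commit()
    {
        int count = _context.PendingChanges.Count;
        _context.Committed.AddRange(_context.PendingChanges);
        _context.PendingChanges.Clear();
        return count;
    }
}
```

Calling A.Add and B.Add followed by a single Commit is what gives the "all or nothing" behaviour across repositories. A real implementation would wrap the commit in a database transaction.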
This is just a list of agreements I've encountered over several projects I've worked on. I can't really put one over the other.
Option 1: Complex data queries are put into the repository of their main entity type.
A query whose main entity type is Car goes in the CarRepository.
A query whose main entity type is Person goes in the PersonRepository.
Pro: You can keep the old repository structure but at least make it less of a guessing game where to put the code.
Con: There are fringe cases where there is no clear "main" entity type.
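Option 2: Complex data queries are put into custom repositories that are not bound to a single entity type.
If a query needs both A and B objects, you should have an ABRepository which internally uses ARepository and BRepository (a unit of work is incredibly important here!).
With many combinations (AB, AC, ABC, ACD, ...) the list of repositories is going to grow out of bounds.
Such repositories are better named after their purpose (YearlyReportsRepository) and not just their aggregate list of entity types (PersonCarRepository).
If you choose option 1, the question of what a repository should return has an easy answer.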
A complex query placed in CarRepository returns an IEnumerable<Car> and therefore still follows the idea that "a CarRepository returns cars" (and possibly some related entities, but they are not explicitly part of the return type).
A complex query placed in PersonRepository returns an IEnumerable<Person> and therefore still follows the idea that "a PersonRepository returns people" (and possibly some related entities, but they are not explicitly part of the return type).
If you choose option 2, then you're implicitly arguing that a complex data report is something different from a simple CRUD operation. This means that you'd probably end up returning custom classes (CarOwnerResult) from the custom repositories (= not bound to a single entity type).
Note that, if you want to, simple repositories (bound to a single entity type) can still be expected to return only their bound entity type, just like before.
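The difference in return types between the two options can be sketched as follows. Car, Person, CarRepository and CarOwnerResult are names from the text above. Their members, the method names, and CarOwnerReportRepository are assumptions for illustration, and the in-memory list stands in for a real data source.

```csharp
using System.Collections.Generic;
using System.Linq;

// Entity members are assumed for the sake of the example.
public class Person { public string Name; }
public class Car { public string Plate; public Person Owner; }

// Option 1: the complex query lives in the main entity's repository.
// It still returns cars; owners come along as related entities.
public class CarRepository
{
    private readonly List<Car> _cars;
    public CarRepository(List<Car> cars) { _cars = cars; }

    // A real implementation would eager-load Owner in a single joined query.
    public IEnumerable<Car> GetCarsWithOwners() => _cars;
}

// Option 2: a custom result class returned by a purpose-named repository
// that is not bound to a single entity type.
public class CarOwnerResult { public string Plate; public string OwnerName; }

public class CarOwnerReportRepository
{
    private readonly List<Car> _cars;
    public CarOwnerReportRepository(List<Car> cars) { _cars = cars; }

    // Flattens Car and Person data into one row per car.
    // OwnerName is null for cars without an owner.
    public IEnumerable<CarOwnerResult> GetCarOwners() =>
        _cars.Select(car => new CarOwnerResult
        {
            Plate = car.Plate,
            OwnerName = car.Owner?.Name
        });
}
```

With option 1 the caller navigates car.Owner itself. With option 2 the caller gets exactly the shape the report needs, at the cost of one more class per report.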