R – Designing the Data Access Layer

Architecture, data-access-layer

I'm facing a design issue regarding how to structure my DAL.
As we all know, in its most basic definition, the DAL is the layer
responsible for communicating with some data repository (I'm not referring to the Repository pattern here),
usually a database. This is where the catch is.
Some of our business objects (BOs) have to get their data from the database, while others get theirs from other sources, e.g. web services.
A few members of the team suggested that the BOs should be smart enough to know whether to call a DAL (one that only knows how to talk to the database)
or to call the required web service. Others argued that this might not be optimal and that everything should pass through the DAL, which would contain, say, an adapter for each data retrieval method.

How would you architect a system with such data access needs?
Would either of the suggested solutions be good enough in the long run (the second one might take more time to develop),
or do we need to take a totally different approach? Perhaps there is a design pattern that suits this kind of problem.

Thanks,
Avi Shilon

Best Answer

I would strongly recommend the second approach. The business logic should not know anything about the source of data.

When the business logic is unaware of the data source, then in addition to the usual benefits (easier maintainability due to isolation, and a cleaner design), you also gain the flexibility (depending on how well your DAL is designed) to:

  • As your minimum requirement states, retrieve data from varied data sources

  • Retrieve data from a prioritized set of data sources to achieve fail-over.

    E.g. you get the latest price quotes from Reuters' real-time quote service, but when that breaks down due to WAN issues, you fall back on an alternate service, or on older prices cached in the database.

    The data sources are ordered by priority, so that quality is non-increasing and reliability is non-decreasing as you move down the list.

  • Retrieve data from a prioritized set of data sources to achieve caching

    E.g. retrieve a price from a local cache, if missing, retrieve from a local database, if missing, request from vendor's service.
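
To make the prioritized-sources idea concrete, here is a minimal Python sketch. Every name in it (`PriceDao`, `SourceUnavailable`, `realtime_feed`, `database_cache`, the `ACME` symbol and its price) is made up for illustration, and the two sources are stand-ins rather than real vendor or database calls.

```python
from typing import Callable, Optional, Sequence


class SourceUnavailable(Exception):
    """Hypothetical error a source raises when it cannot be reached."""


# A source takes a symbol and returns a price, returns None when it has
# no value for that symbol, or raises SourceUnavailable when it is down.
PriceSource = Callable[[str], Optional[float]]


class PriceDao:
    """Tries each source in priority order and returns the first hit."""

    def __init__(self, sources: Sequence[PriceSource]) -> None:
        self._sources = list(sources)

    def get_price(self, symbol: str) -> Optional[float]:
        for source in self._sources:
            try:
                price = source(symbol)
            except SourceUnavailable:
                continue  # fail-over: this source is down, try the next one
            if price is not None:
                return price  # first source that has a value wins
        return None  # no source had a value


def realtime_feed(symbol: str) -> Optional[float]:
    # Stand-in for a vendor quote service that is currently unreachable.
    raise SourceUnavailable("WAN link down")


def database_cache(symbol: str) -> Optional[float]:
    # Stand-in for older prices stored in the database.
    return {"ACME": 41.5}.get(symbol)


dao = PriceDao([realtime_feed, database_cache])
print(dao.get_price("ACME"))  # served by the database fallback
```

The same loop covers the caching case if the sources are listed as local cache, local database, vendor service: a miss returns `None` and the next source is asked. Writing a fetched value back into the faster sources is left out of this sketch.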

Also, to give a flesh-and-bone example of easier maintainability: if your data source changes from a vendor service queried in real time to an in-house gold copy database populated by a push feed from the vendor, you would only need to change the DAO instead of every one of the many BOs that need the data. That is easier to change, and much safer to test and deploy.
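
As a rough sketch of that swap, assuming a business object that depends only on a DAO interface: the names below (`QuoteDao`, `VendorServiceQuoteDao`, `GoldCopyQuoteDao`, `Position`) are hypothetical, and both DAOs return canned values where a real one would call the vendor service or query the database.

```python
from typing import Dict, Protocol


class QuoteDao(Protocol):
    """The only thing the business objects know about quote retrieval."""

    def latest_price(self, symbol: str) -> float: ...


class VendorServiceQuoteDao:
    """Stand-in for a DAO that queries the vendor service in real time."""

    def latest_price(self, symbol: str) -> float:
        return 100.0  # canned value; a real DAO would call the web service


class GoldCopyQuoteDao:
    """Stand-in for a DAO that reads the in-house gold copy database."""

    def __init__(self, rows: Dict[str, float]) -> None:
        self._rows = rows  # canned rows; a real DAO would run a query

    def latest_price(self, symbol: str) -> float:
        return self._rows[symbol]


class Position:
    """Business object: computes a value without knowing the data source."""

    def __init__(self, symbol: str, quantity: int, quotes: QuoteDao) -> None:
        self._symbol = symbol
        self._quantity = quantity
        self._quotes = quotes

    def market_value(self) -> float:
        return self._quantity * self._quotes.latest_price(self._symbol)


# Switching data sources only changes which DAO is passed in.
before = Position("ACME", 10, VendorServiceQuoteDao())
after = Position("ACME", 10, GoldCopyQuoteDao({"ACME": 42.0}))
print(before.market_value(), after.market_value())
```

`Position` is identical in both cases, so the change is confined to the wiring and the new DAO, which is also the only part that needs new tests.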
