This sounds suspiciously like you're trying to invent output caching. I wouldn't agree that partial page caching is hard to do in MVC; it's just that you need to use partial Views with their own Controller actions, as opposed to having the View itself call its sub-Views, which can be slightly counter-intuitive. (So in ASP.NET MVC you'd want to use Html.RenderAction rather than Html.RenderPartial.) There's a name for this particular pattern that is currently escaping my recollection.
I would suggest that the main flaw with your design is that Views will have to know things about the architecture of the site. So they'll have to know about where to get their data from AND how to get it, they'll have to know about where to cache it, when to cache it, when not to cache it, etc.
Realistically you should be trying to separate layers away from knowledge of other layers, as the less knowledge each layer has of another the easier it is to change a layer (i.e. switch DB, add a transparent data caching layer, move a DB call across to a web service, etc.).
I would suggest that if you're going to implement such an idea, and there isn't a native output caching system in your MVC framework of choice, you add the caching layer to the Controller actions. Unless you've got some VERY heavyweight Views that need to do large amounts of recursive rendering of Models (which is rare), the actual HTML generation time is minuscule compared to DB calls and client network latency, so take a more pragmatic approach and cache where you need to cache, i.e. in the application layer.
If you really need custom output caching then you probably want to just slip in a caching layer above your Controller actions with either a wrapping class (if you're using a dynamic language) or a different implementation of an interface (if in a static language) that can hijack the calls to the Action and react accordingly.
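As a rough illustration of that wrapping approach, here is a minimal sketch in Python (not tied to any particular MVC framework); `product_list` is a hypothetical controller action standing in for your own, and the TTL-based eviction is just one simple policy you might choose:

```python
import time
from functools import wraps

def cache_action(ttl_seconds=60):
    """Wrap a controller action so its output is cached, keyed by its arguments."""
    cache = {}  # key -> (expires_at, output)

    def decorator(action):
        @wraps(action)
        def wrapper(*args):
            entry = cache.get(args)
            if entry is not None and entry[0] > time.monotonic():
                return entry[1]               # cache hit: skip the real action
            output = action(*args)            # cache miss: run the action
            cache[args] = (time.monotonic() + ttl_seconds, output)
            return output
        return wrapper
    return decorator

@cache_action(ttl_seconds=30)
def product_list(page_number, page_size):
    # Hypothetical action: imagine a DB query plus view rendering here.
    return f"<ul>page {page_number}, size {page_size}</ul>"
```

The same idea works with a wrapping class or an alternative interface implementation in a static language; the essential point is that the action itself stays oblivious to the caching.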
As for caching that reacts to DB changes, you'd be better off with a caching layer that takes both loading and saving into account in your repository class, with each save call flushing the cache (or the relevant part of the cached set) and each load gracefully degrading from cache to DB when needed (i.e. use the cache when the cached data is available). That way you keep the database-driven behaviour close to the database.
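A minimal sketch of that repository-level caching, assuming a hypothetical backing store with `get`/`put` methods (here faked with an in-memory dictionary):

```python
class InMemoryDb:
    """Stand-in for a real database; anything with get/put would do."""
    def __init__(self):
        self.rows = {}

    def get(self, key):
        return self.rows[key]

    def put(self, key, value):
        self.rows[key] = value

class CachedRepository:
    """Repository wrapper: loads fall back to the DB, saves flush the cache."""
    def __init__(self, db):
        self.db = db
        self.cache = {}

    def load(self, key):
        if key in self.cache:            # use the cache when data is available
            return self.cache[key]
        value = self.db.get(key)         # graceful degradation to the DB
        self.cache[key] = value
        return value

    def save(self, key, value):
        self.db.put(key, value)
        self.cache.pop(key, None)        # flush the now-stale cached entry
```

Callers only ever see `load`/`save`, so swapping the DB or the eviction policy never touches the layers above.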
It seems what you need is a wrapper for all the parameters that define a page (say, pageNumber, pageSize, sortType, totalCount, etc.) and to use this DataRequest object as the key for your caching mechanism. From this point you have a number of options to handle the cache:
- Implement some sort of timeout mechanism to refresh the cache (based on how often the data changes).
- Have a listener that checks for database changes and updates the cache based on the above parameters.
- If the changes are done by the same process, you can always mark the cache as outdated with every change and check this flag when a page is requested.
The first two might involve a scheduler mechanism triggered on some interval or by an event. The last one might be the simplest if you have a single data access point.
Lastly, as @DanPichelman mentioned, it can quickly become an overly complicated algorithm that outweighs the benefits, so be sure the gain in performance justify the complexity of the algorithm.
Best Answer
This kind of memory management, telling the CPU in advance what content is frequently accessed, is really hard to do for a wide array of programming problems where data structures involve pointers and the like.
Yet it is (by comparison) easier to do for certain parallel sliced algorithms, such as found in the graphics domain. In the graphics domain, you're dealing with large chunks of contiguous (numeric) data and vastly fewer pointers.
So, modern CPUs opt to do their cache management automatically, using multi-level caches that ultimately end with disk-based memory. Each level of the cache notices how often some cached portion of memory is used, and uses that information when it decides to evict something from that level. Each level has a different "page" size (called line size at the upper levels).
So, there's virtually no way for a programmer to inform the CPU of what to keep and what to evict, because of the combination of multiple levels and varying cache line/page sizes at each level. That's bad enough, but now throw in that the same program wants to run on multiple different CPUs of different performance (where much of that performance difference comes from increasing cache sizes, number of cache levels, etc.), and this becomes an intractable problem for the programmer dealing with general-purpose algorithms and data structures.
What the programmer can do, then, instead of informing the CPU what to keep or evict, is attempt to co-locate related items (e.g. A and B) so that across all the possible variations of CPUs and multi-level caches, if A is in the cache, then so is B. (There are other things programmers can do to keep programs cache friendly; you can google "cache-friendly" data structures or algorithms.)
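As an illustration of co-location only (Python abstracts memory layout away, but the idea carries over directly to languages with manual layout): interleaved, "array of structs" storage keeps each point's x and y adjacent, so in a language like C, touching point i would pull both fields in on the same cache line:

```python
def make_interleaved(points):
    """[(x0, y0), (x1, y1), ...] -> flat [x0, y0, x1, y1, ...]

    A and B (here, a point's x and y) end up adjacent in one contiguous
    array, so whichever cache line holds one almost certainly holds the other.
    """
    flat = []
    for x, y in points:
        flat.extend((x, y))
    return flat

def point_at(flat, i):
    # Both fields of point i sit side by side in the flat array.
    return flat[2 * i], flat[2 * i + 1]
```

The opposite layout, separate x and y arrays ("struct of arrays"), co-locates the same field across points instead, which is the better choice when an algorithm sweeps over only one field at a time.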
Another difference is that GPU memory is separate from CPU memory, so programming the GPU necessarily involves moving memory back and forth. Whereas the CPU has cache misses and page faults that automatically load memory that is not close to the CPU, the GPU (historically) doesn't have these mechanisms, and GPU programmers have to constantly instruct the GPU to copy memory between GPU memory and CPU memory. This has been, and increasingly is, a problem as we use GPUs for more kinds of problem solving, so eventually we'll see more and more hardware breaking down the barrier between CPU memory and GPU memory, resulting in unification at the higher levels of the cache hierarchy.