Java Distributed Computing – Synchronizing Local and Remote Cache

caching, distributed computing, java

With a distributed cache, a subset of the cache is kept locally while the rest is held remotely.

  • In a get operation, if the entry is not available locally, it is fetched from the remote cache and added to the local cache.
  • In a put operation, both the local cache and the remote cache are updated. Other nodes in the cluster must also be notified so that they invalidate their local copies.
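The two operations above can be sketched as a small two-tier cache. This is only a minimal illustration: `RemoteCache`, `InMemoryRemoteCache`, and `TwoTierCache` are hypothetical names, and the remote cache is replaced by an in-memory map so the example runs without a network.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction over the shared remote cache (not a real library API).
interface RemoteCache<K, V> {
    V get(K key);
    void put(K key, V value);
}

// In-memory stand-in for the remote cache, used only to keep the sketch runnable.
class InMemoryRemoteCache<K, V> implements RemoteCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    public V get(K key) { return store.get(key); }
    public void put(K key, V value) { store.put(key, value); }
}

class TwoTierCache<K, V> {
    private final Map<K, V> local = new ConcurrentHashMap<>();
    private final RemoteCache<K, V> remote;

    TwoTierCache(RemoteCache<K, V> remote) { this.remote = remote; }

    // Get: try the local cache first, fall back to the remote cache
    // and keep a local copy of whatever was found there.
    V get(K key) {
        V value = local.get(key);
        if (value == null) {
            value = remote.get(key);
            if (value != null) {
                local.put(key, value);
            }
        }
        return value;
    }

    // Put: update the remote cache and the local cache.
    void put(K key, V value) {
        remote.put(key, value);
        local.put(key, value);
    }

    // To be called when this node learns that another node changed the key.
    void invalidate(K key) { local.remove(key); }

    boolean isCachedLocally(K key) { return local.containsKey(key); }
}
```

Note that nothing here calls `invalidate` on the other nodes; how and when that happens is exactly the open part of the question.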

What's the simplest way to achieve this if you implement it yourself, assuming that the nodes are not aware of each other?

Edit
My current implementation goes like this:

  • Each cache entry contains a timestamp.
  • A put operation updates both the local cache and the remote cache.
  • A get operation tries the local cache first, then the remote cache.
  • A background thread on each node periodically checks the remote cache for each entry in the local cache. If the timestamp on the remote entry is newer, the local entry is overwritten. If the entry is not found in the remote cache, it is deleted from the local cache.
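A rough sketch of the implementation described above, under some assumptions: `Stamped` and `PollingCache` are made-up names, the remote cache is modeled as a shared in-memory map, and the timestamp is passed in by the caller to keep the example deterministic. A real version would need to decide where timestamps come from, since clocks on different nodes may disagree.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// A value together with the time of its last write (hypothetical type).
final class Stamped<V> {
    final V value;
    final long timestamp;
    Stamped(V value, long timestamp) { this.value = value; this.timestamp = timestamp; }
}

class PollingCache<K, V> {
    private final Map<K, Stamped<V>> local = new ConcurrentHashMap<>();
    private final Map<K, Stamped<V>> remote; // stand-in for the remote cache
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "cache-refresh");
                t.setDaemon(true); // do not keep the JVM alive for the refresher
                return t;
            });

    PollingCache(Map<K, Stamped<V>> remote) { this.remote = remote; }

    void put(K key, V value, long timestamp) {
        Stamped<V> entry = new Stamped<>(value, timestamp);
        remote.put(key, entry);
        local.put(key, entry);
    }

    V get(K key) {
        Stamped<V> entry = local.get(key);
        if (entry == null) {
            entry = remote.get(key);
            if (entry != null) local.put(key, entry);
        }
        return entry == null ? null : entry.value;
    }

    // One reconciliation pass over every locally cached entry.
    void refreshOnce() {
        for (Map.Entry<K, Stamped<V>> e : local.entrySet()) {
            Stamped<V> current = e.getValue();
            Stamped<V> remoteEntry = remote.get(e.getKey());
            if (remoteEntry == null) {
                local.remove(e.getKey(), current);               // gone remotely
            } else if (remoteEntry.timestamp > current.timestamp) {
                local.replace(e.getKey(), current, remoteEntry); // remote is newer
            }
        }
    }

    // Run the reconciliation pass periodically on the background thread.
    void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(this::refreshOnce,
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() { scheduler.shutdownNow(); }
}
```

With this design a node can serve a stale value for up to one polling period, and each pass costs one remote lookup per locally cached entry.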

Best Answer

The problem to focus on is when to send the local caches the messages that carry updates made to the remote cache. At one extreme, you could send a message to all other caches for each changed item. This ensures timely updates, but it can generate a lot of update messages. At the other extreme, a local cache can ask the remote cache whether an item is still valid just before using it (re-checking only once a given time has passed since the previous check for that item). As a third option, you could periodically send one update covering several changed items.
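As a hedged illustration of the first and third options, the sketch below lets the remote cache notify subscribers about changed keys, either immediately per item or in batches when `flush()` is called. All names (`NotifyingRemoteCache`, `NodeCache`, `flush`) are hypothetical, and everything runs in one process; in a real system the notification would travel over the network. The nodes only know the remote cache, not each other.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical remote cache that tells subscribers which keys have changed.
class NotifyingRemoteCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final List<Consumer<Set<K>>> subscribers = new CopyOnWriteArrayList<>();
    private final Set<K> pending = new LinkedHashSet<>();
    private final boolean batch;

    // batch == false: one message per changed item.
    // batch == true: changes accumulate until flush() is called.
    NotifyingRemoteCache(boolean batch) { this.batch = batch; }

    void subscribe(Consumer<Set<K>> listener) { subscribers.add(listener); }

    V get(K key) { return store.get(key); }

    synchronized void put(K key, V value) {
        store.put(key, value);
        pending.add(key);
        if (!batch) flush();
    }

    // Send one message containing every key changed since the last flush.
    synchronized void flush() {
        if (pending.isEmpty()) return;
        Set<K> changed = new LinkedHashSet<>(pending);
        pending.clear();
        for (Consumer<Set<K>> s : subscribers) s.accept(changed);
    }
}

class NodeCache<K, V> {
    private final Map<K, V> local = new ConcurrentHashMap<>();
    private final NotifyingRemoteCache<K, V> remote;
    final AtomicInteger messagesReceived = new AtomicInteger();

    NodeCache(NotifyingRemoteCache<K, V> remote) {
        this.remote = remote;
        remote.subscribe(this::onChanged);
    }

    // Drop local copies of the changed keys; the next get re-fetches them.
    // The writing node is notified too, which only costs it a re-fetch.
    private void onChanged(Set<K> keys) {
        messagesReceived.incrementAndGet();
        local.keySet().removeAll(keys);
    }

    V get(K key) {
        V value = local.get(key);
        if (value == null) {
            value = remote.get(key);
            if (value != null) local.put(key, value);
        }
        return value;
    }

    void put(K key, V value) {
        remote.put(key, value);
        local.put(key, value);
    }
}
```

In batch mode `flush()` would typically be driven by a timer; a longer interval means fewer messages but a longer window in which nodes may read stale values.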

The best strategy depends on the system you are going to build. It will usually be a balance of several factors, such as the frequency of updates, the traffic generated by update messages, the overhead of applying updates, and how critical a missed update is.

Do the items in the cache change frequently relative to how often they are accessed? How many local caches do you have? Do the changes come from each local cache, from a few of them, from all of them, or from the remote cache?
