Caching query results vs Querying each time (finding a middle ground)

caching, performance, sql server

This is kind of an expansion on a previously asked question.

Some background info:

We query over 5,000 items from a SQL database, and they eventually get
sorted into a list depending on who the user is. One problem is that users
are allowed to have customized lists, which means that filtering these
5,000+ items in a uniform way is probably impossible.

It is also important that the number of users (and of users wanting to
have custom item lists) is constantly growing, so there will be an
unpredictable number of ways to sort. Basically, we won't be paying
attention to filtering this growing list.

Two methods we've come up with so far:
-Option 1: Cache item list PER USER (and filter the items needed after) OR
-Option 2: Cache the list once for the whole site (every user accesses this one cached list)

The issue with Option 1 is that there will be duplicated lists in
cache. Although we do not yet have enough users to strain our
server's memory, we obviously want our system to scale.

The issue with Option 2 is that there will be duplicated items in the
single, cached list. With the number of users we have, this can
quickly turn into a disorganized list full of duplicated items (i.e.
5,000 items can quickly turn into 10,000 based on a few users' custom
item preferences).

Is there a "middle ground" for dealing with this situation, one that combines some benefits of querying and caching?

Even if a response is a link to a potentially helpful source, I would appreciate it as I am kind of at a fork in the road.

Looking to open up new possible paths in my brainstorming.

Best Answer

Some may feel this is not an answer, but it is a way to derive a better answer.

Collect data on user behavior. Many users may prefer a default configuration. If so, it would make sense to cache this list. I've seen some apps that are highly configurable, but I just use them the way they came out of the box. I'm sure there are hundreds of settings for Microsoft Word that 90% of users never bother using.
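As a rough illustration of what "collect data" could mean here, the sketch below measures the share of users who still have the default list. The `user_configs` mapping and the convention that `None` means "no customization" are assumptions made for the example, not something taken from the question's schema.

```python
from collections import Counter


def default_share(user_configs):
    """Return the fraction of users whose list config is None (the default).

    user_configs is a hypothetical mapping of user id -> custom list config,
    where None stands for "this user never customized anything".
    """
    if not user_configs:
        return 0.0
    counts = Counter(
        "default" if cfg is None else "custom" for cfg in user_configs.values()
    )
    return counts["default"] / len(user_configs)


# Made-up sample data, only to show the call.
sample = {"alice": None, "bob": None, "carol": {"hidden": [3]}, "dave": None}
share = default_share(sample)
print(f"{share:.0%} of users are on the default list")
```

If that share turns out to be high, caching the one default list covers most requests; if it is low, the default-list cache buys little and the effort is better spent elsewhere.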

Otherwise, pull it from the database/disk. Depending on the size of the db and your particular RDBMS, most of your data may be in memory anyway (I know SQL Server does this). This is where you'll start seeing any bottlenecks. You can address those later.
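One way to picture this split is a single cached default list plus a plain database query for everyone else. This is only a minimal sketch: `ItemListService`, `load_default`, and `load_custom` are hypothetical names, and the two loader callables stand in for whatever queries the application actually runs.

```python
class ItemListService:
    """Serve the shared default list from memory; query per user otherwise."""

    def __init__(self, load_default, load_custom):
        # Both loaders are assumed to hit the database when called.
        self._load_default = load_default  # () -> list of items
        self._load_custom = load_custom    # (user_id) -> list of items
        self._default_cache = None

    def items_for(self, user_id, has_custom_list):
        if has_custom_list:
            # Customized users are not cached; the RDBMS may already
            # hold these rows in memory.
            return self._load_custom(user_id)
        if self._default_cache is None:
            # First default request fills the single shared cache entry.
            self._default_cache = self._load_default()
        return self._default_cache

    def invalidate(self):
        """Drop the cached default list, e.g. after the items change."""
        self._default_cache = None
```

Only one list is ever held in the cache, so memory use does not grow with the number of users, and the per-user queries show where the real bottlenecks are before any further caching is added.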

Don't be too anxious to solve a problem too soon. It may not be a problem at all, or you may end up solving the wrong problem or picking the wrong solution. As much as you may want some sort of "best practice" or standard use case, there probably isn't one that matches your situation.
