A multi tenant data architecture

clouddatabase-designnhibernatesharding

We are at a planning & redesign stage of our loyalty card application. As it is understood it will be a multi-tenant application. And I need to take your thoughts. Here are the some key points we should take in to account to redesign logical and physical data structure:

  1. Campaigns: We may organize cross tenant campaigns like if you buy this from tenant A you may buy that from tenant B with 10% discount.

  2. Tenant Hierarchy: From the security point of view branches of tenants can only see their transactions not the other branch or tenant.

  3. Customization for tenants will be rare to be ignored: We consider customization will be on card holder data only (at least we will not have a chance to make customizations per tenant base in next one or two years).

  4. Card holder (not a tenant user, Also we may login for administrating and reporting all tenants) may login to system to see the transaction she made.

In general i agree with the idea of separating tenant dbs or sharding at some level. Since we have #1 criteria, it requires us to query all dbs to decide whether her transaction is shooted a campaign or not. It may be a huge performance problem. Or what do you think about sharding, does it really make performance issues? Also what would you prefer if you have to create OLAP system you have to query on multiple-sharded db for reporting and decision making purposes?

On the other hand with separating db, more active tenants will take more performance gain from system especially while taking reports. Because their transactional data will reside in their own db and one query will be executed. We may allocate more resources on system and quote meaningful price for their usage of application. Sharding will also be helpful if we consider to place this app on cloud because sqlServer azure has limit of 10gb per db.

One idea is separating dbs but maintaining a master db for cross decisions like campaigns etc (like the one mentioned here at edit part of the question). But all reports will be taken from the dbs they owned. A second idea is separation of much more active tenants and handling the others by tenantID approach in master db. Lastly we may consider nosql graph db to process transactions on campaign server.

We will also be using NHibernate for ORM tool. But sharding project left undone and we should take into account to implement one for us to implement seperate db approach using sharding or master db approach. What do you think about using nhibernate at transaction handling service? Will it be a performance penalty using NHibernate in a real time service that should perfom high speed response? If so what would you suggest?

EDIT

@rae1n To be more understandable on my certain needs :

1 – According to key point i mentioned above what data architecture would you suggest ?

  • One db and TenantID per table, replicate the db when needed.

  • Using shards with master table.

  • Using mix of sharding and tenantID.

  • Your own suggestion.

2 – Is NHibernate suitable for you suggestion for #1 and how? Could you please suggest a sample app or link to a article ?

3 – Is NHibernate suitable for campaign assessment service. Please take into account that there may be very complex campaigns amoung 1 to n tenants and we should support these behaviours.

4- Would you suggest nosql graph db for campaign assessment and do you know a sample for that usage?

Best Answer

First, I am going to suggest that you may want to post follow-up questions to dba.se. Multi-tenant applications are tricky and different RDBMSs have different tools for solving some of the problems. Without knowing exactly what your environment is, specific answers are often impossible. So I am going to look at conceptual problems and sample answers here.

For customization of cardholder data, my recommendation generally is to store the extra fields in something like XML or JSON format because this gives you a basic framework for storage which is flexible and gives your application a way to handle the data without eav pains. Some db's may even be able to index xpath search results against xml records.

Multi-tenant security is a very different matter and it is hard. The traditional standby is to use a view which can be used to filter out rows only to those which are authorized. Oracle has some ways to bolt these on to the table directly. PostgreSQL has some ways to make sure that other functions applied don't leak information.

If this becomes large, some sort of division of storage is going to be needed. How this gets divided by partitioning tables will be something you will have to decide based on your observed workload. If you are on Oracle thinking about being able to move to RAC as needed may be helpful, and if you are on PostgreSQL, Postgres-XC is the technology to keep in mind. However your current requirements make a normal sharding scheme somewhat questionable and it isn't clear what your usage patterns are going to be. How many cross-tenant cardholders are there likely to be? How much tenant vs cardholder queries are there going to be? These are the sorts of decisions which really should be made when they become problems and premature optimization here will make thins hard.

Related Topic