I am asking myself the exact same question at the moment.
I am leaning towards the multi-instance, single-tenant solution but have not made a final decision yet. Let me share some of my thoughts:
The main historical advantage of the multi-tenant architecture is better use of infrastructure resources, through pooling (single OS, single database, single application layer) and better utilization of those resources (when one user is away, another can use the same resource).
It also greatly simplifies the software lifecycle: you deploy new versions to your one instance, and all customers are updated at the same time.
It seems, however, that recent advances in cloud technology make the first class of advantages largely available in a multi-instance (instance-per-customer) architecture as well (I am thinking specifically of a platform like Jelastic here, but I am sure others provide the same features):
- Container-based PaaS
- Provisioning and auto-scaling of containers (elastic containers)
So hardware and platform management is no longer the software provider's concern. Resources are pooled much more efficiently than before at the infrastructure and platform levels.
There will still be some overhead for multi-instance (some application and middleware components will be run N times instead of once), but it is much lower than with a separate (virtual) machine per instance. The database could be shared anyway (one schema per instance, several schemas per DB server).
Also:
- Automation of creation of new instances is possible via PaaS API
- Automation of deployment of new versions is possible via PaaS API, with zero downtime (takes some work to put in place)
- Scaling is always out, never up. We don't have to worry about huge datasets at the instance level.
Of course, we would need some kind of central service that manages all this automatically (e.g. creating an instance when a new user signs up). It would also handle payment and licensing, interaction between instances, etc. This central service might be quite complex and hard to develop, but the good thing is that we don't have to implement it upfront (while resources are scarce), whereas multi-tenancy would need to be baked into the app from the start.
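To make the shape of that central service concrete, here is a minimal sketch of the provisioning and rolling-upgrade flow. The PaaS client, its `create_environment`/`deploy` methods, and the registry are all hypothetical placeholders; a real implementation would wrap your provider's actual REST API (Jelastic or otherwise).

```python
# Hypothetical sketch of the central "instance manager" described above.
# The PaaS API methods are assumptions, not a real provider's API.
class InstanceManager:
    def __init__(self, paas_api, registry):
        self.paas = paas_api          # wrapper around the PaaS REST API
        self.registry = registry      # maps customer -> instance metadata

    def provision(self, customer_id, app_version):
        # One container environment per customer.
        env = self.paas.create_environment(name=f"app-{customer_id}")
        self.paas.deploy(env, version=app_version)
        self.registry[customer_id] = {"env": env, "version": app_version}
        return env

    def upgrade_all(self, new_version):
        # Rolling upgrade: redeploy every customer environment in turn.
        for meta in self.registry.values():
            self.paas.deploy(meta["env"], version=new_version)
            meta["version"] = new_version


# A fake PaaS client is enough to exercise the control flow.
class FakePaas:
    def __init__(self):
        self.deployments = []
    def create_environment(self, name):
        return name
    def deploy(self, env, version):
        self.deployments.append((env, version))

mgr = InstanceManager(FakePaas(), {})
mgr.provision("acme", "1.0")
mgr.upgrade_all("1.1")
```

The point of the sketch is that the per-customer loop lives in one place; payment, licensing, and cross-instance features would hang off the same registry.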
Which brings me to the final advantages of developing single-tenant for a very early stage (pre-investment) startup project:
- The same (or almost the same) version of the app can be deployed on-premises, either as a virtual appliance, a Docker container, or even on a customer-managed machine (some companies are still reluctant about the cloud, and it may help an early stage startup not to turn away important early adopters)
- It is faster to get a product out with limited resources (the application layer and database schema are considerably less complex). You can ship a "dumb" single-instance, single-tenant product first (an MVP) for early adopters and to show the business value of the app to potential investors, and add all the cloud automation later on
- It can be seen as a selling point for customers worried about data security: the data is better encapsulated, since every customer has its own schema or even its own database. There is much less risk of "spillage"
NB: I am obviously thinking here of a business app whose customers are businesses (each with multiple individual users), not individuals. It would not make sense to run a separate instance of an app for each individual user (or would it?)
The primary concerns of this design are security and size. But before I get there, I want to clear up a misunderstanding:
As wrong as it seems, would it be better to sacrifice normalization and include the franchise ID directly in some or all of these child tables?
I see why you might think this: if you consider franchise ID as an attribute then including it in every table violates third normal form.
But here's an alternative way to look at it: in the logical design, the row key includes the franchise ID: (FRANCHISE_ID, TABLE_ID).
But, you say, TABLE_ID is an identity column! To which I answer: yes, but that's a physical detail, not a logical one. And logically, tables are allowed to have multiple "candidate" keys (I turn to C. J. Date as my authority for this statement).
And once you accept this logical design, you'll get a lot of physical benefits. First, you don't need joins to access data; while, logically, joins take no time, physically they do. Plus, if your queries tend to retrieve multiple rows for the same franchise, you can also benefit by using a clustered index to collocate rows.
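As a small illustration of that composite logical key and the collocation benefit, here is a sketch using SQLite (table and column names are illustrative, not from the original schema). In SQLite, a WITHOUT ROWID table is physically organized by its primary key, playing roughly the role of SQL Server's clustered index: rows for the same franchise are stored together.

```python
import sqlite3

# Illustrative table with the composite candidate key (FRANCHISE_ID, ORDER_ID).
# WITHOUT ROWID makes the primary key the physical storage order, so a
# range scan over one franchise touches a contiguous slice of the table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        franchise_id INTEGER NOT NULL,
        order_id     INTEGER NOT NULL,
        total        REAL,
        PRIMARY KEY (franchise_id, order_id)   -- composite candidate key
    ) WITHOUT ROWID
""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 1, 9.99), (2, 1, 5.00), (1, 2, 3.50)])

# No join needed: the franchise check is part of the key lookup.
rows = conn.execute(
    "SELECT order_id, total FROM orders WHERE franchise_id = 1 "
    "ORDER BY order_id").fetchall()
```

In SQL Server the equivalent would be a clustered index whose leading column is the franchise ID.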
OK, now on to the main topics.
Security
From a corporate management perspective, this is probably your most important issue. Clearly, you can't allow one franchisee to see data that belongs to another. But there are many ways to accomplish this, imposing different levels of load on the system and its developers. I'm just going to throw out some ideas here for you to consider.
Predicate applied to individual queries
This is the simplest approach, but it puts the heaviest load on the developers. Every one of your protected queries will have to include a check against the franchise ID. Forgetting even one could have economic consequences for your company (i.e., lawsuits).
However, I think you can probably overcome this with a combination of code review, static analysis, and integration testing. You need the discipline to ensure that all queries go through a data access layer that's rigorously verified.
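One way to make that discipline mechanical is to funnel all queries through a data access layer that appends the franchise predicate itself. This is a minimal sketch using SQLite; the class, table, and column names are illustrative.

```python
import sqlite3

# Sketch of a data access layer that adds the franchise check in one
# place, so individual queries can't forget it.
class FranchiseDAL:
    def __init__(self, conn, franchise_id):
        self.conn = conn
        self.franchise_id = franchise_id

    def select(self, table, where="1=1", params=()):
        # `table` and `where` must come from trusted code, never user
        # input -- this sketch does no SQL-injection hardening.
        sql = (f"SELECT * FROM {table} "
               f"WHERE ({where}) AND franchise_id = ?")
        return self.conn.execute(sql, (*params, self.franchise_id)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (franchise_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob")])

dal = FranchiseDAL(conn, franchise_id=1)
visible = dal.select("customers")   # only franchise 1's rows
```

Code review and static analysis then only need to verify that no query bypasses the DAL, rather than auditing every WHERE clause.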
Views
To ensure that all queries include a check for franchise ID, you can hide your tables behind views, and ensure that each view includes a franchise check. Each franchisee will have their own set of views, stored in a different schema.
An additional benefit of this approach is that you'll be able to expose data directly to the franchisees. It also allows your physical tables to change without affecting the exposed data.
However, there are several significant drawbacks. First, your developers will have to ensure that they use the correct set of views for each query (maybe not that bad, depending on how you manage connections). Second, you will have a long-term maintenance cost, as changes have to be propagated to all of the schemas that hold a particular view (although this should be easily automatable).
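That propagation step is indeed easy to automate. Here is a sketch of generating one filtered view per franchise, again using SQLite; since SQLite has no schemas, a name prefix stands in for the per-franchisee schema that SQL Server would use, and all names are illustrative.

```python
import sqlite3

# Generate one franchise-filtered view per franchisee. In SQL Server,
# each view would live in the franchisee's own schema (e.g.
# franchise1.orders); here a name prefix plays that role.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (franchise_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (1, 5.0)])

franchise_ids = [1, 2]
for fid in franchise_ids:
    conn.execute(f"""
        CREATE VIEW franchise{fid}_orders AS
        SELECT total FROM orders WHERE franchise_id = {fid}
    """)

# Queries against the view can't see other franchises' rows.
totals = conn.execute("SELECT SUM(total) FROM franchise1_orders").fetchone()
```

The same loop, driven off a franchise table, would re-create every view after a physical schema change.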
Row-level Security
I'm not familiar with SQL Server, but my understanding of row-level security is that it's based on database users, so you'll need (at least) one user per franchisee. That means (at least) one connection per franchisee, which may cause undue load on your database (or, alternatively, constant creation and destruction of connections). I'm also guessing that your developers will have to code queries that include franchise conditions or suffer runtime errors. And you'll have to manage all of those users.
All-in-all, this seems like the most painful route, but it is the one that guarantees security, so I'm guessing it's the way you'll go.
Size
From the development perspective this is going to be the bigger pain point -- especially when your users complain about slow response times.
Your overriding goal should be to touch as few data blocks as possible per query. Here are a few techniques that I've used successfully in the past:
Buy as much RAM as you can afford
Your goal should be to keep the entire database in memory. Really. It doesn't matter that SSDs are blindingly fast; they still need time to read and write data blocks.
In a perfect world, you would read the entire database into memory at startup, and the only IO would be writes.
Reduce the "active" size of tables
You mentioned one 91MM-row table. How much of that table is accessed by a typical query? Can you partition the table so that infrequently accessed data is stored in another table? (I'm assuming that SQL Server supports declarative partitioning, but if not, you can manually move or duplicate rows.)
Large tables necessarily mean that queries have to access a lot of rows, and indexes only help so much, because those indexes will also be large.
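The manual move/duplicate approach can be as simple as an archive table plus a transactional move. Here is a sketch in SQLite; the table names and the cutoff year are illustrative.

```python
import sqlite3

# "Poor man's partitioning": move rows older than a cutoff into an
# archive table, so the hot table (and its indexes) stay small.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txn         (id INTEGER, franchise_id INTEGER, yr INTEGER);
    CREATE TABLE txn_archive (id INTEGER, franchise_id INTEGER, yr INTEGER);
""")
conn.executemany("INSERT INTO txn VALUES (?, ?, ?)",
                 [(1, 1, 2015), (2, 1, 2023), (3, 2, 2014)])

cutoff = 2020
with conn:  # one transaction, so a failure can't lose rows mid-move
    conn.execute("INSERT INTO txn_archive SELECT * FROM txn WHERE yr < ?",
                 (cutoff,))
    conn.execute("DELETE FROM txn WHERE yr < ?", (cutoff,))

active = conn.execute("SELECT COUNT(*) FROM txn").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM txn_archive").fetchone()[0]
```

Operational queries then hit only the small active table; the rare historical query pays the cost of reading the archive.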
Collocate data
By default, databases store rows wherever they can find the space. Which means that data that is typically accessed together, such as the transactions for a user, might be spread all over the disk.
However, you generally have some level of control over this, using either clustered indexes (discussed above) or covering indexes. Leverage these to their fullest.
Use read replicas
A typical application executes selects far more frequently than updates, and tends to select multiple rows while updates affect a single row.
By separating these two operations, you get a couple of benefits. First, you can scale capacity independently: if you have a lot of reads you can buy more or bigger machines. And second, you can reduce contention: a long-running select won't block an update (personally, I think this is less of an issue today than, say, 20 years ago, but it's still worth considering).
The downside of read replicas is that there's a lag between the time a row is updated on the server and on the replica. This may or may not be a problem for you (and in my experience, lag is caused by undersized machines; more money solves that problem).
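The read/write split usually lives in the application's connection routing. Here is a toy sketch where two in-memory SQLite databases stand in for the primary and the replica, and replication is simulated by replaying logged writes; the lag described above is the window between `write()` and `replicate()`. All names are illustrative.

```python
import sqlite3

# Toy read/write router: writes go to the primary, reads to the
# replica. Replication is faked by replaying the write log.
class Router:
    def __init__(self, primary, replica):
        self.primary, self.replica = primary, replica
        self.log = []                      # writes not yet on the replica

    def write(self, sql, params=()):
        self.primary.execute(sql, params)
        self.log.append((sql, params))     # shipped to the replica later

    def read(self, sql, params=()):
        return self.replica.execute(sql, params).fetchall()

    def replicate(self):
        for sql, params in self.log:
            self.replica.execute(sql, params)
        self.log.clear()

primary, replica = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
for db in (primary, replica):
    db.execute("CREATE TABLE t (v INTEGER)")

r = Router(primary, replica)
r.write("INSERT INTO t VALUES (1)")
stale = r.read("SELECT v FROM t")   # replica hasn't caught up yet
r.replicate()
fresh = r.read("SELECT v FROM t")
```

A real setup would use the DBMS's own replication, of course; the sketch only shows where the staleness window appears from the application's point of view.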
Offload reporting to a data warehouse
True "reporting" queries tend to be very different from operational queries. For example, an operational query might retrieve the most recent order for a single user, while a reporting query might find all users that bought a particular product. As a result, attempting to support both operational and reporting queries with the same physical design is a recipe for failure.
At the very least, shift reporting to a dedicated read replica, one that is indexed appropriately. Better is to make use of a completely different DBMS, one whose storage and query characteristics more closely match your reporting needs. Something like Amazon Redshift, Google BigQuery, or Azure SQL Data Warehouse. Or maybe a locally-hosted option like Apache Cassandra.
And now for something completely different
Don't do this.
The time taken to develop a multi-tenant solution is time not available to add features that may be more relevant to your franchisees, or to improve the current code and processes.
If the issue is maintenance, or per-franchise capital expenses, look to alternatives that enable central management. For example, use Azure or another cloud provider with an ops person at the corporate office. You should be able to deploy a cloud-based solution for a per-franchise cost of a few hundred dollars per month (if that), cutting both capital and operational expenses for the franchisees.
If the issue is reporting, focus on more efficient data acquisition and transformation. Again, cloud-based solutions can help with this.
Update
The idea of moving to a cloud provider -- and Azure is only one option, which I picked because you seem to be a Microsoft shop -- is to eliminate the problems caused by franchisees who aren't trained computer operators.
In the simplest form, you would create a database server for each franchisee in the cloud, and their existing applications would point to that server rather than the local database. The database would always be up, so that you could retrieve data at any time. And, typically, the cloud provider does regular backups and provides other options for fault-tolerance and recovery.
Pricing in the cloud is largely dependent on the features that you want. For example, looking at the Azure Cloud SQL pricing page, the base price for a "Standard" database service is $0.0202/hour, or $15/month. I have no idea what this actually provides you in terms of database performance; in my experience, $100/month is more likely.
There is nothing that prevents you from using cloud hosting as a first step, and then moving on to a true multi-tenant solution. And if you have hundreds or thousands of franchisees, that makes sense to manage cost. But it seems like your real problem is one of operations management.
Best Answer
In the multi-tenant systems I have worked on as a DBA and developer, we used one database per client. That database was completely self-contained and did not rely on any shared database, not even for things like the states in the US.
To make standing up a new client easy, a model database was created with things like the States prepopulated.
The reason we did not store addresses or other common data in a shared database is that it allowed us to move the databases around several different servers to balance the load across the servers we had. If a database needed a higher SLA due to a client upgrade, we could just take a backup and restore the client database on the appropriate server, without worrying whether "the address for Stark Industries" existed on the new server.
The obvious question is: what happens when client A updates the phone number or address for a customer? Should client B see that change? Wouldn't that lead to "duplicated" data if everything is stored in separate databases?
What if "Stark Industries" wants your client A using a different phone number than your client B? Then you will have to have a way to represent that, either by storing the different numbers in each client database or by moving client A and client B to different instances. I am sure that as you look through your shared database, you will find more instances of this problem of different clients wanting to use data differently.
For updates to the common tables, a script that loops through each database is rather easy to write; a cursor on sys.databases and some dynamic SQL takes care of that part.
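To illustrate the dynamic-SQL part, here is a sketch that builds the per-database script such a loop would run. The database names and the UPDATE statement are illustrative, and actually executing the script against a server (e.g. via a driver like pyodbc) is left out.

```python
# Build the dynamic T-SQL that a sys.databases cursor would execute:
# one USE + UPDATE pair per client database.
def build_update_script(database_names, update_sql):
    statements = []
    for db in database_names:
        # Brackets guard against unusual characters in database names.
        statements.append(f"USE [{db}];\n{update_sql};")
    return "\n".join(statements)

# In T-SQL itself, the names would come from something like:
#   SELECT name FROM sys.databases WHERE name LIKE 'client_%'
script = build_update_script(
    ["client_a", "client_b"],
    "UPDATE States SET Abbrev = 'PR' WHERE Name = 'Puerto Rico'")
```

The same generator works for any change that has to hit every client database: reference-data updates, schema migrations, and so on.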
Overall, while a shared database may look good from a normalization standpoint, once you start to look at the different ways clients use the data, it may not make sense any more. If you are going to go the route of separate databases per client, go full bore and keep them completely separate, with no shared data.