In multi-tenant systems what is the correct practice for segregating each tenants data from each other

azurecloudcloud computingdatabase-designmultitenancy

I am a first-time azure developer and I am taking over a single-tenant cloud system and have been tasked with converting the system to multi-tenant. What is the correct practice to separate each tenant's sql server data?

Its a data collection & reporting system. Each tenant will want to build and run their own custom reports on the data. So in multi-tenant, how are tenants prevented from seeing each others data?

Best Answer

There is no one "correct" way to do anything in programming, even with a cloud-computing specific architecture such as Windows Azure.

A cursory search of the MSDN documentation reveals that Azure does in fact have some useful functionality to help you separate your client's data. It appears as if you could decide to give each client their own SQL server instance, or you could instead use a common instance that you either partition directly or leverage some built-in functionality to partition for you.

Each of those solutions have their own pros and cons, which you will need to evaluation on your own. Some conceptual guidance as to the selection:

  • If you use a separate instance for each client you will have built-in separation for your clients' data, but some reduced ability to examine data from multiple clients. Microsoft may also charge more for the additional overhead.
  • If you use the same instance but partition the data manually in a single database, you'll have to manage and adjust the various keys and indexes yourself. You obviously don't want to have a large client's 50,000,000 records slow down the query a smaller client's 500 records.
  • If you give each client a distinct schema or database in your single server instance, query management becomes easier, but you may need to worry about server-locks. (On the plus side, a seperate-db-per-client may let you make the transition easier.)
  • If you use Microsoft's bulit-in partition tools, they may very well wind up doing exactly what you want them to do. Or you may get 70% of the way throuh implementation and discover a need their tools do not cover, requiring you to abandon much of your work. (Research and proper models are important when using anyone else's toolchain.)

As for the question of "how do you move data from single-instance to multi-instance", I imagine it would largely be a matter of constructing your new architecture and then importing the old data. You'll want to know how to do that for future clients, anyway.

Related Topic