Designing a large database with multiple sources

database-designefficiencysql server

I have been tasked with redesigning, or at worst optimising the structure of a database for a data warehouse.

Currently, the database has 4 other source databases (which is due to expand to X number of others), all of which have their own data structures, naming conventions etc. At the moment an overnight SSIS package pulls the data from the various source and then for each source coverts the data into a standardised, usable format. These tables are then appended to each other creating a 60m row, 40 column beast!.

This table is then used in a variety of ways from an OLAP cube to a web front end.

The structure has been in place for a very long time, and the work I have been able to prove the advantages of normalisation, and this is the way I would like to go. The problem for me is that the overnight process takes so long I don't then want to spend additional time normalising the last table into something usable.

Can anyone offer any insight or ideas into the best way to restructure or optimise the database efficiently?

Edit:

All the databases are MS SQL Server 2008 R2

Thanks in advance

CM

Best Answer

For OLAP databases, normalization is often not the best approach - this is completely different from classical OLTP databases. The structure of your tables should be optimized for the queries you are going to run. I recommend the Wikipedia articles about star schema or snowflake schema, those are patterns for a good OLAP database design.

Here is a book about the topic I can recommend:

http://www.amazon.com/The-Data-Warehouse-Toolkit-Dimensional/dp/0471153370

Something you did not write (but really ask yourself) is why you actually want to restructure the system. Just because it is denormalized and you think this is not "best practice"? Or do you suffer from real performance or storage problems? If it is only the first reason, you should first read something more about good OLAP db design before changing the system.

Related Topic