Database – the best approach for database design with lots of columns

databaseMySQLsql

I am writing a query based financial application. It lets the user to write complicated equations (much like WHERE part of an SQL query) and find companies matching those criteria.

For the above, I currently have more than 500 columns in the database table (each column representing a financial field).

Example of Columns are:
company_name, sales_annual_00, sales_annual_01, sales_annual_02, sales_annual_03, sales_annual_04, protit_annual_00, profit_annual1…(over 500 such columns).

The number of rows is around 5000.

Going forward, I would like to further increase the number of columns/financial-fields.

For the above I would like to get help regarding:

1) What is the best database design approach? Is it ok to have these many number of columns?

2) How can it be normalized? (User can use any of these fields in search criteria).

3) Is it ok to stick with MySQL, or modern document based databases like MongoDB should be better for it?

P.S. (Update): I have been using MySQL till now and a running example of the usage is at: http://screener.in/companies/89/Formula–
In above there around 500 fields/columns to create your query on, however, I seek to increase that number to much more in future.

Best Answer

If this site is going to be used for ad-hoc reporting, and you anticipate a large number of rows, you should design the database as a Data Warehouse. This shifts the focus from normalization to performance and query efficiency, which sounds appropriate for the application you've described.

To learn more about this, look into Dimensional Modeling. Those tables with large numbers of columns representing numeric data would most likely be "Fact" tables and the smaller, more descriptive tables would be "Dimension" tables.

Ralph Kimball has published lots and lots of good information about designing, implementing, and maintaining data warehouses. Read his stuff!