Windows – how to speed up a database – hardware only

database, hardware, postgresql, windows

Simple question: what is the best way to improve a database's performance using hardware only?

In this situation, a few very large tables are being queried very regularly by 1-4 processes at a time.

We are executing thousands of queries, many of which take 10+ seconds to return, and all of which return only a small quantity of data. This suggests to me that HDD seek time is the bottleneck.
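
A rough way to confirm this kind of suspicion in Postgres is to time a representative query cold and then warm; the table and column names below are placeholders, not from any real schema:

    -- Run a representative slow query and look at where the time goes.
    -- "big_table" and "lookup_col" are illustrative names only.
    EXPLAIN ANALYZE
    SELECT *
    FROM big_table
    WHERE lookup_col = 12345;

    -- Run the same statement again straight away. If the second run is
    -- dramatically faster, the first run was waiting on disk reads
    -- (cold cache), which points at seek time rather than CPU.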

As part of this process, we also need to create summary tables from our raw data tables. One of these queries may take hours to run.

Please assume that all software/database optimisation has been done already.

Assume this because we have already spent some time on code/DB optimisation and are ready to spend some of the budget on hardware. I understand that more software/DB optimisation is possible, but that is not the current focus.

We do not run out of RAM at the moment, but could possibly allocate more to the DB.
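
For reference, if more RAM is added, Postgres only benefits once its memory settings are raised in postgresql.conf; the figures below are examples to test, not recommendations, and on Windows builds of 8.4 very large shared_buffers values have historically given diminishing returns:

    -- Check the current memory-related settings (the 8.4 defaults are
    -- conservative).
    SHOW shared_buffers;        -- Postgres's own buffer cache
    SHOW effective_cache_size;  -- planner's estimate of total cache (OS + Postgres)
    SHOW work_mem;              -- per-sort/per-hash memory for each query

    -- Raising shared_buffers is done in postgresql.conf and needs a
    -- restart, e.g. "shared_buffers = 1GB" on an 8 GB machine; treat any
    -- such figure as an illustrative starting point only.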

The current platform is Windows; this may change depending on the hardware solution.

The database is Postgres 8.4.

Thanks.

Best Answer

What's the performance bottleneck? The usual culprits are:

  • If the system is I/O bound, adding more CPUs won't help. Adding more memory might increase the portion of the database that can be cached in memory, but the best way to improve performance will be to increase the I/O bandwidth. (A quick way to gauge how much of your data is already served from cache is sketched just after this list.)
  • If the system is CPU bound, then adding more cores will help if the DBMS software can make good use of them. Otherwise, increasing the speed of the CPUs will help.
  • If the system is memory bound - the bottleneck is getting the data from RAM to the CPU - then improving the memory bandwidth will improve the performance.
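
In Postgres, one rough indicator is the buffer cache hit ratio from the statistics collector; a ratio well below 1.0 on the busy database is consistent with being I/O bound. Windows Performance Monitor counters (average disk queue length, % processor time) tell a similar story at the OS level.

    -- Approximate cache hit ratio per database since the last stats reset.
    -- Values near 1.0 mean most reads are served from cache; low values
    -- suggest the workload is going to disk.
    SELECT datname,
           blks_hit::float / NULLIF(blks_hit + blks_read, 0) AS cache_hit_ratio
    FROM pg_stat_database;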

Note that when you manage to relieve one performance bottleneck, one of the others becomes the new performance bottleneck.

On most systems where a database scans seriously large data volumes, the system is I/O bound. Adding more disk controllers and spreading the data across them increases the parallelism available for disk I/O, usually improving performance.

Don't forget that the most dramatic improvements may be algorithmic - if you can answer the question two ways, one scanning 1 TB of data and one scanning just 1 KB, the second is going to outperform the first, regardless of what you do with the hardware.
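
As a concrete (if simplified) illustration of that point in SQL, with made-up table and column names:

    -- Without a suitable index, this has to scan the whole table:
    SELECT count(*) FROM raw_events WHERE event_date = DATE '2011-01-01';

    -- With an index, only the pages holding matching rows are read:
    CREATE INDEX raw_events_event_date_idx ON raw_events (event_date);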


Elaborating on disk controllers - as requested.

A given disk controller can only transfer so much data from disk to memory per second, typically using DMA. To increase the bandwidth, you can spread your data over multiple disks attached to separate controllers. Then, if the system architecture allows it, more data per second can flow from the disks into main memory, speeding up operations. Note that extra controllers do not help if all the data in the database sits on a single disk, or if the DBMS cannot read from the separate disks in parallel. So whether the extra controllers will help depends on your hardware, your software, and the organization of the data in the database.
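
In Postgres terms, the usual way to spread data over separate disks (and hence separate controllers) is tablespaces; the directory path and object names below are hypothetical:

    -- Create a tablespace on a directory that lives on a second disk
    -- (run as a superuser; the directory must already exist and be
    -- writable by the account the Postgres service runs under).
    CREATE TABLESPACE fast_disk LOCATION 'D:/pgdata_fast';

    -- Move a large table and one of its indexes onto it. This physically
    -- rewrites the data, so it can take a long time on big tables.
    ALTER TABLE raw_events SET TABLESPACE fast_disk;
    ALTER INDEX raw_events_event_date_idx SET TABLESPACE fast_disk;

Putting the write-ahead log (the pg_xlog directory) on its own disk is another common way to separate I/O streams.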
