Postgresql – best RAID configuration for postgres

hardware-raidpostgresqlraid

I'm purchasing a server with 8 SAS disks to perform database intensive procedures. Currently the main bottleneck is is large index scans in postgres.

I'm currently deciding between 8x300Gb 10k disks or 8x140Gb 15k disks as it would be more convenient to have 200Gb+ Logical space.

The spec sheet for the RAID controller states: "Integrated Hardware RAID-0, -1, -1E, optional RAID-5, -6, -10, -50, -60"

What would be the best RAID configuration, and what choice in disks would be most suitable?

I'm new to configuring RAID and postgres and appreciate the advice.

Best Answer

Integrated Hardware RAID-0, -1, -1E, optional RAID-5, -6, -10, -50, -60

This sounds a little worrisome to me, it sounds like a low-end RAID controller. You want a good RAID controller that can keep up with 8 fast HDDs (that's actually not a given). If you have a fair amount of writes to your DB, then you really want a Battery Backup Unit, and to enable battery-protected write caching on the RAID controller.

As for RAID disk layout, there are 2 common schools of thought:

  1. 2 disks in mirror for OS, 2 disks in mirror for DB transaction log, 4 disks in RAID 10 for main DB files.
  2. One big RAID 10 array using all disks, and all OS + log + datastore files on this array (see reasoning here, mirrored by BAARF).

I would rather not take sides on the RAID volume design, it tends to become a bit of a fact-light discussion. Ideally you should experiment with different storage layouts and benchmark them for your specific workload. My gut feel is that all disks in RAID10 is faster and more robust over multiple workloads.

One last thing, to make sure that OS partitions and RAID stripe boundaries are aligned (see here, Windows centric, but the principle is general). You can do this when you create the partitions.