I've wrestled with this question for a while. There are a number of factors determining how many disks should go into a RAID5 array. I don't know the HP 2012i, so here is my generic advice for RAID5:
- Non-recoverable read error rate: When a non-recoverable read error occurs, that read fails. For a healthy RAID5 array this is no problem, since the missed read can be reconstructed from the parity information. If one happens during a rebuild, when the entire RAID5 set is read in order to regenerate the parity info, it can cause the entire RAID5 array to be lost. This rate is quoted like this: "1 per 10^14 bits", and is found in the detailed tech specs for drives. You do not want your RAID5 array to be any more than half that size. Enterprise drives (10K RPM SAS qualifies) can go larger than desktop drives (SATA).
- Performance degradation during rebuilds: If performance noticeably sucks during rebuilds, you want to make sure your array can rebuild quickly. In my experience write performance tends to suck a lot worse during rebuilds than reads. Know your I/O. Your tolerance for bad I/O performance will put an upper limit on how large your RAID5 array can get.
- Performance degradation during other array actions: Adding disks, creating LUNs, changing stripe widths, changing RAID levels. All of these can impact performance. Some controllers are very good about isolating the performance hit. Others aren't so good. Do some testing to see how bad it gets during these operations. Find out if restriping a 2nd RAID5 array impacts performance on the first RAID5 array.
- Frequency of expand/restripe operations: Adding disks, or on some controllers creating new LUNs, can also cause the entire array to redo parity. If you plan on active expansion, then you'll be running into this kind of performance degradation much more often than simple disk failure-rate would suggest.
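To make the first point concrete, here's a back-of-envelope shell calculation of the odds of hitting at least one URE while reading an entire array during a rebuild. The 12 TB array size and the 1-per-10^14 rate are illustrative assumptions, and it treats read errors as independent:

```shell
# Rough probability of >=1 URE while reading the whole array in a rebuild.
# The array size and URE rate below are assumptions -- plug in your own.
array_tb=12          # data that must be read during the rebuild, in TB
ure_rate=1e14        # desktop-class drives: roughly 1 URE per 10^14 bits read
pct=$(awk -v tb="$array_tb" -v rate="$ure_rate" 'BEGIN {
    bits = tb * 1e12 * 8                       # TB -> bits
    printf "%.0f", (1 - exp(-bits / rate)) * 100
}')
echo "P(>=1 URE during a ${array_tb} TB rebuild) ~= ${pct}%"
```

That's why halving the array size relative to the URE spec matters: the exposure scales directly with the number of bits read.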
RAID6 (double parity) is a way to get around the non-recoverable read error rate problem. It does increase controller overhead though, so be aware of the CPU limits on your controllers if you go there. You'll hit I/O bottlenecks faster using RAID6. If you do want to try RAID6, do some testing to see if it'll behave the way you need it to. It's a parity RAID, so it has the same performance penalties for rebuilds, expansions, and restripes as RAID5, it just lets you grow larger in a safer way.
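The capacity cost of that extra parity is easy to quantify; the disk count and size below are made-up examples:

```shell
# Usable capacity for an n-disk array of equal-size drives:
# RAID5 gives up one disk to parity, RAID6 gives up two.
disks=8
disk_tb=2
raid5_tb=$(( (disks - 1) * disk_tb ))
raid6_tb=$(( (disks - 2) * disk_tb ))
echo "8x2TB -> RAID5: ${raid5_tb} TB usable, RAID6: ${raid6_tb} TB usable"
```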
There are up to 3 levels of alignment you need to keep in mind - 1). volume manager, 2). volume partitioning, 3). file system. If you are not using LVM then 1 is irrelevant. If you are not partitioning your volumes with fdisk, then 2 is irrelevant as well. The most important alignment for performance is 3. With proper alignment you may see up to a 15% boost in performance.
For cases 1 and 2 a good general rule is to align to megabyte boundaries.
1). LVM usually does a good job by a) placing its metadata at the end of the volume and b) giving you the option of specifying the metadata size (for example "pvcreate -M2 --metadatasize 2048K --metadatacopies 2 ")
2). If you need to partition any of these volumes with fdisk then again, try to stick to MB boundaries. Modern Linux fdisk versions have this option, as do recent versions of gparted.
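As a sketch, recent parted can do the same thing: start the first partition on a 1 MiB boundary (sector 2048 on 512-byte-sector disks). The device name is a placeholder:

```shell
# Create a GPT label and one partition starting at 1 MiB.
# /dev/sdX is a placeholder -- double-check the device before running this.
parted -s /dev/sdX mklabel gpt
parted -s /dev/sdX mkpart primary 1MiB 100%
parted -s /dev/sdX align-check optimal 1   # verify the partition is aligned
```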
3). Aligning the file system is the most important of all. I have experience with aligning xfs and ext3 (ext4 should be similar to ext3); you will need to do some math here and then specify the right parameters when creating the file system. Look at the documentation for the specific parameters, namely something called "stripe width". Be careful with the interpretation, though - depending on the fs type it is expressed either in 512B blocks or in bytes, so you will need to do your calculations accordingly. The interpretation also depends on the number of drives in the RAID array and the RAID level. You may also find some useful info in this thread.
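To make the math concrete, here's a sketch for a hypothetical RAID5 of 6 drives with a 64 KiB chunk (so 5 data drives per stripe). The geometry and device names are assumptions - substitute your own controller's values:

```shell
chunk_kb=64       # RAID chunk (per-disk stripe unit), in KiB -- assumed
data_disks=5      # 6-drive RAID5 -> 5 data disks per stripe
block_kb=4        # ext3/ext4 filesystem block size, in KiB

# ext3/ext4 express these in filesystem blocks:
stride=$(( chunk_kb / block_kb ))            # chunk in fs blocks
stripe_width=$(( stride * data_disks ))      # full data stripe in fs blocks
echo "mkfs.ext4 -b 4096 -E stride=${stride},stripe-width=${stripe_width} /dev/sdX1"

# xfs takes the chunk size and data-disk count directly:
echo "mkfs.xfs -d su=${chunk_kb}k,sw=${data_disks} /dev/sdX1"
```

Note the unit difference the paragraph above warns about: ext's numbers are in filesystem blocks, while xfs's su= is a byte/KiB size and sw= is a disk count.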
Also, you can specify parameters when mounting a file system that may improve performance even further. Here are the parameters I use with my 18TB xfs file system: "noatime,attr2,nobarrier,logbufs=8,logbsize=256k". But be careful: these are not universal rules, and if used incorrectly they may compromise the reliability of your system (especially "nobarrier").
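Those options would normally live in /etc/fstab; a sketch, with a placeholder device and mount point (and the same "nobarrier" caveat applies):

```shell
# Example /etc/fstab line -- device and mount point are illustrative
/dev/vg0/bigdata  /srv/data  xfs  noatime,attr2,nobarrier,logbufs=8,logbsize=256k  0 0
```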
Another thing to keep in mind: if you're planning future expansion of any of these RAID arrays, you should take it into account when you create the file systems, since expansion will inevitably affect your perfect alignment ;-)
I hope this points you in the right direction. Have fun :-)
Disk Subsystem: Here's an article from Microsoft re: partition alignment in SQL Server 2008: http://msdn.microsoft.com/en-us/library/dd758814.aspx
The theory explained in the article is why I'm giving you the link, not 'cuz I think you'll be running SQL Server. The workload of a file server is less apt to be as touchy about partition alignment as SQL Server, but every little bit helps.
NTFS:
You can disable last access time stamping in NTFS with:
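The command snippet didn't survive here; on Windows Server 2003/2008 it's typically this fsutil invocation:

```shell
fsutil behavior set disablelastaccess 1
```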
You can disable short filename creation (if you have no apps that need it) with:
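Again the snippet is missing; the usual command is:

```shell
fsutil behavior set disable8dot3 1
```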
Think about the best NTFS cluster size for the kinds of files you're going to be putting on the box. In general, you want to have as large a cluster size as you can get away with, balancing that against wasted space for sub-cluster-sized files. You also want to try and match your cluster size to your RAID stripe size (and, as was said above, have your stripes aligned to your clusters).
There's a theory that most reads are sequential, so the stripe size (which is typically the minimum read of the RAID controller) should be a multiple of the cluster size. Whether that holds depends on the specific workload of the server, and you'd need to measure it to know for sure. I'd keep them the same.
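For instance, you might format with a 64 KiB cluster and estimate the slack cost for small files. The file count and cluster size below are made-up numbers for illustration:

```shell
# The Windows-side format command would look like (volume letter is a placeholder):
#   format E: /FS:NTFS /A:64K /Q
# Rough slack estimate: each file wastes about half a cluster on average.
files=1000000
cluster_kb=64
slack_gb=$(( files * cluster_kb / 2 / 1024 / 1024 ))
echo "~${slack_gb} GB lost to slack for ${files} files at ${cluster_kb} KiB clusters"
```

Running the same numbers at a 4 KiB cluster would cost about 2 GB, which is the trade-off the paragraph above describes.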
If you're going to have a large number of small files, you may want to start with a larger reserve for the NTFS master file table (MFT) to prevent future MFT fragmentation. In addition to covering the fsutil command above, this document describes the "MFT zone" setting: http://technet.microsoft.com/en-us/library/cc785435(WS.10).aspx Basically, you want to reserve as much disk space for the MFT as you think you'll need, based on the predicted number of files on the volume, to try and prevent MFT fragmentation.
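As a sketch of how that setting is applied (check the linked document for your Windows version - the mechanisms below differ by release):

```shell
# On Server 2008 and later, fsutil exposes the MFT zone directly
# (values 1-4, each step reserving roughly an additional eighth of the volume):
fsutil behavior set mftzone 2
# On Server 2003 the equivalent is the NtfsMftZoneReservation registry value:
#   reg add HKLM\SYSTEM\CurrentControlSet\Control\FileSystem /v NtfsMftZoneReservation /t REG_DWORD /d 2
```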
A general guide from Microsoft on NTFS performance optimization is available here: http://technet.microsoft.com/en-us/library/cc767961.aspx It's an old document, but it gives some decent background nonetheless. Don't necessarily try any of the "tech stuff" it says to do, but get concepts out of it.
Layout:
You'll have religious arguments with people re: separating the OS and data. For this particular application, I'd probably pile everything into one partition. Someone will come along and tell you that I'm wrong. You can decide yourself. I see no logical reason to "make work" down the road when the OS partition fills up. Since they're not separate RAID volumes, there's no performance benefit to separating the OS and data into partitions. (It would be a different story if they were different spindles...)
Shadow Copies:
Shadow copy snapshots can be stored in the same volume, or on another volume. I don't have a lot of background on the performance concerns associated with shadow copies, so I'm going to stop there before I say something dumb.