RAID 5 30 TB file storage – filesystem and stripe size for large files

filesystems raid raid5

My storage is currently 6 TB, and as it will grow to 30 TB in a few months, I would like some tips/recommendations on filesystem and stripe element size so I do not run into problems in the future.
90% of files are 700 MB to 4 GB (mainly large video files and archives).

I am currently using ext4 and a 64 KB stripe size. Should I increase the stripe size to 128 KB or 256 KB? Would ZFS or XFS be better than ext4? Current usage is 85% read and 15% write. In the future, when the enclosure is full, it will be 100% read, and I would like the best possible throughput.

Best Answer

Try this: do not use RAID 5 on drives 2 TB or larger ;) For 30 TB I would even go with RAID 6 mirrored (i.e. two copies in software RAID) to make sure I keep the data in case of corruption.
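To make the trade-off concrete, here is the rough usable-capacity arithmetic for these layouts, assuming a hypothetical build of ten 4 TB drives (swap in your real counts):

```python
# Usable-capacity math for the layouts discussed; the ten 4 TB drives
# are an assumed example build, not the asker's actual hardware.
drives, size_tb = 10, 4

raid5_tb = (drives - 1) * size_tb        # one drive's worth of parity
raid6_tb = (drives - 2) * size_tb        # two drives' worth of parity
# "RAID 6 mirrored": two half-size RAID 6 sets kept as identical copies
mirrored_raid6_tb = (drives // 2 - 2) * size_tb

print(f"RAID 5:          {raid5_tb} TB usable")           # 36 TB
print(f"RAID 6:          {raid6_tb} TB usable")           # 32 TB
print(f"mirrored RAID 6: {mirrored_raid6_tb} TB usable")  # 12 TB
```

You pay heavily in capacity for each step up in redundancy, which is why the choice hinges on how bad data loss would be for you.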

> I am currently using ext4 and a 64 KB stripe size. Should I increase the stripe size to 128 KB or 256 KB?

Hardware or software RAID? Generally yes, go larger: it is a lot less work to read more data up front than to come back for it later. I am not a Linux guy, but SQL Server, for example, reads 64 KB extents while trying to keep table data in linear blocks so IO is reduced. A good large-file filesystem will try the same, which means an IO segment size larger than 64 KB is good.
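If this is Linux software RAID (md) under ext4, the filesystem can also be told about the array geometry so allocations line up with full stripes. A minimal sketch of the arithmetic, assuming a hypothetical ten-drive RAID 6 with a 256 KiB chunk (adjust to your own layout):

```python
# Sketch: ext4 alignment values for a given RAID chunk size.
# stride       = chunk size expressed in filesystem blocks
# stripe-width = stride * number of data-bearing disks
FS_BLOCK = 4096  # ext4 block size in bytes (the mkfs default)

def ext4_raid_params(chunk_kib, total_disks, parity_disks):
    """Values for mkfs.ext4 -E stride=...,stripe-width=... (in fs blocks)."""
    data_disks = total_disks - parity_disks
    stride = chunk_kib * 1024 // FS_BLOCK
    return stride, stride * data_disks

# Hypothetical example: 10 drives in RAID 6 (2 parity), 256 KiB chunk
stride, width = ext4_raid_params(256, 10, 2)
print(f"mkfs.ext4 -E stride={stride},stripe-width={width} /dev/md0")
# -> mkfs.ext4 -E stride=64,stripe-width=512 /dev/md0
```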

I remember an analysis of enterprise-level RAID controllers that showed an increase in throughput at 512 KB / 256 KB compared to smaller sizes, especially if you have enough caching to make it "stick" at the RAID controller level.
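I cannot dig up that analysis now, but you can get a rough feel for how request size affects sequential reads on your own array with a quick sanity check like the sketch below. The path is a placeholder, and a real tool like fio with direct IO is the better measurement:

```python
# Rough sequential-read sanity check at a few request sizes.
# Caveat: the Linux page cache will serve repeat reads from RAM,
# so run against a fresh file (or drop caches) for honest numbers.
import time

def seq_read_mbps(path, block_size, max_bytes=1 << 30):
    """Read up to max_bytes from path in block_size chunks; return MB/s."""
    done = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as f:
        while done < max_bytes:
            chunk = f.read(block_size)
            if not chunk:
                break
            done += len(chunk)
    return done / (time.monotonic() - start) / 1e6

for bs_kib in (64, 128, 256, 512):
    mbps = seq_read_mbps("/storage/bigfile.bin", bs_kib * 1024)  # placeholder path
    print(f"{bs_kib:>4} KiB requests: {mbps:8.1f} MB/s")
```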

A lot also depends on the read pattern. Large archives and files are mostly linear, non-random access, and that will fly. I have a smaller system, but we do redundant reading from nearly 200 processes across a larger number of machines, the machines on 1 Gb networking and the storage on 10 Gb, so it is HEAVILY random IO coming in, and I now use a RAID 6 of 8 VelociRaptors. That delivers half a gigabyte per second: 256 KB stripe, RAID 6, 1 GB cache on an Adaptec 71605Q. An SSD is available as cache but not active for that group ;)

But stay away from RAID 5 with drives that large. That is gambling with the data, unless you can live with losing the array (when the RAID blows up mid-rebuild because a second error surfaces while rebuilding after a drive failure) and have another backup source (like tapes). You can basically expect a problem with that many 4 TB drives, mathematically.
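That "mathematically" is easy to sketch. Consumer drives are often specced at one unrecoverable read error (URE) per 10^14 bits read, and a RAID 5 rebuild has to read every surviving bit, so over roughly 30 TB the odds of hitting at least one URE (which kills the rebuild) look like this:

```python
# Back-of-the-envelope URE risk during a RAID 5 rebuild.
# The 1e-14 error rate is a typical consumer-drive spec-sheet figure,
# an assumption here rather than a measured value for any given drive.
URE_PER_BIT = 1e-14

def rebuild_failure_probability(data_tb, ure_per_bit=URE_PER_BIT):
    """P(at least one URE) while reading data_tb terabytes of survivors."""
    bits = data_tb * 1e12 * 8
    return 1 - (1 - ure_per_bit) ** bits

print(f"{rebuild_failure_probability(30):.0%}")  # ~91% for a 30 TB rebuild
```

RAID 6 changes the picture because a URE during the rebuild is still covered by the second parity, which is the whole argument for it at this scale.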