As of GNU coreutils 7.5 (released in August 2009), sort accepts a -h option, which understands the human-readable numeric suffixes produced by du -h:
du -hs * | sort -h
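To see the difference this makes, here is a small sketch feeding sort -h a mix of suffixed sizes (the values are arbitrary examples):

```shell
# sort -h orders values by magnitude, understanding the K/M/G
# suffixes that du -h emits; a plain lexical sort would not.
printf '1G\n2K\n10M\n512\n' | sort -h
# prints:
# 512
# 2K
# 10M
# 1G
```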
If your sort does not support -h, you can install GNU coreutils. For example, on an older Mac OS X:
brew install coreutils
du -hs * | gsort -h
From the sort manual:
-h, --human-numeric-sort compare human readable numbers (e.g., 2K 1G)
The idea behind optimizing the stripe size is to arrange things so that, in your typical workload, most read requests can be fulfilled by (a multiple of) a single read, evenly divided over all data disks. For a three-disk RAID-5 set, the number of data disks is two.
For example, let's assume your typical workload makes I/O read requests that average 128 kB. With a three-disk RAID-5 set, you would then want chunks of 64 kB. Calculate it like this:
avg request size / number of data disks = chunk size
128kB / 2 = 64kB
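The calculation above can be scripted; the 128 kB request size is the assumed example figure, not a recommendation:

```shell
avg_request_kb=128   # assumed average read request size (example value)
data_disks=2         # a 3-disk RAID-5 set has 2 data disks per stripe
chunk_kb=$((avg_request_kb / data_disks))
echo "chunk size: ${chunk_kb}kB"
# prints: chunk size: 64kB
```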
This is the chunk size of your RAID set; we haven't arrived at the filesystem yet.
The next step is to make sure the filesystem is aligned with the RAID set's characteristics, so we want the filesystem to be aware of the RAID set's chunk size. It can then evenly distribute superblocks over the three disks.
For this, we need to tell mke2fs what the size of a chunk is or, more exactly, how many filesystem blocks fit in a chunk. This is called the 'stride' of the filesystem:
chunk size / size of filesystem block = stride size
64kB / 4kB = 16
You can then call mke2fs with the -E stride=16 option.
The page mentioned earlier also talks about the -E stripe-width option for mke2fs, but I have never used it myself, nor does the manpage of my version of mke2fs mention it. If we wanted to use it, though, we would pass 32: the stripe width is calculated by multiplying the stride by the number of data disks (two, in your case).
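Putting the stride and stripe-width arithmetic together, a sketch for the example numbers used above (/dev/md0 is a hypothetical device name; the command is only echoed, not run):

```shell
chunk_kb=64        # RAID chunk size from the earlier calculation
block_kb=4         # filesystem block size (4 kB, as in the example)
data_disks=2       # 3-disk RAID-5 set

stride=$((chunk_kb / block_kb))         # 64 / 4 = 16
stripe_width=$((stride * data_disks))   # 16 * 2 = 32

# /dev/md0 is a placeholder; echo the command instead of running it.
echo mke2fs -b 4096 -E stride=${stride},stripe-width=${stripe_width} /dev/md0
```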
Now to the core of the matter: what is the optimal chunk size? As described above, you need the average size of an I/O read request. You can get this value from the appropriate column in the output of iostat or sar. You will need to measure it on a system with a workload comparable to the one you are configuring, over a prolonged period.
Make sure that you know what kind of unit the value uses: sectors, kilobytes, bytes or blocks.
VZFS is a virtual file system used by OpenVZ (and Virtuozzo) to export a directory as the filesystem of a virtual machine. The maximum file size is therefore likely determined by the underlying filesystem that actually hosts that directory.