NetApp FAS 3020c, volume is full

netapp storage

We've noticed a problem with one of the volumes on our NetApp filer. The volume appears to be full: the filer reports, as a warning, that the volume is using or reserving 100% of its space (0% of inodes).
The problem is that it does not look that way.
The volume size is 190 GB. It is a flexible volume with a file space guarantee and no mirroring. Exactly two LUNs on the volume are mapped, one of 95 GB and one of 50 GB. Both are set to reserve 0% for snapshots, and both have space reservation enabled. In theory, there should still be plenty of space on the volume.
df -r shows:

Filesystem              kbytes       used      avail   reserved  Mounted on
/vol/BACKUP/   199229440  199229440          0  142799672  /vol/BACKUP/
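The raw kilobyte figures are easier to compare with the LUN sizes once converted. A minimal sketch, assuming the columns are in KB as the header says and appear in the order shown; `parse_df_row` is a hypothetical helper name, not a NetApp tool:

```python
# Hypothetical helper: convert one `df -r` data row (values in KB) to GiB.
KB_PER_GIB = 1024 * 1024


def parse_df_row(row: str) -> dict:
    """Return the numeric columns of a df -r row, converted to GiB."""
    fields = row.split()
    # Column order assumed from the header above:
    # filesystem, kbytes, used, avail, reserved, mount point
    names = ("kbytes", "used", "avail", "reserved")
    return {name: int(value) / KB_PER_GIB
            for name, value in zip(names, fields[1:5])}


row = "/vol/BACKUP/   199229440  199229440          0  142799672  /vol/BACKUP/"
sizes = parse_df_row(row)
for name, gib in sizes.items():
    print(f"{name:>8}: {gib:7.2f} GiB")
```

On this row the volume comes out at 190 GiB with nothing available, and the reserved column at roughly 136 GiB.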

There is also some free space left on the aggregate. Similar volumes with LUNs (same configuration) on the same aggregate are completely fine. We have a new shelf that we want to migrate some data to, and before installing it we want to make sure we have backups of all the data. However, the backup fails because of this one particular volume (no free space for the snapshot).

added:
If I check the space occupied on the production system where both LUNs are mapped, it's only 94 GB.

Best Answer

Take a look at man vol and read the bit about fractional reserve - this is the root of your problem.

Specifically - when LUNs run out of space, they break horribly and can cause chaos on the host. NetApp allows you to take snapshots of volumes - a snap uses space in proportion to the changed blocks on the volume. If your volume fills up and you cannot allocate new blocks because a snapshot is holding on to the old ones, your LUNs will all break.

This is where fractional reserve comes in. It says, in effect, 'whenever I take a snap, reserve volume space so that I don't risk running out'. When set to 100, each volume (once a snap exists) tries to reserve space equal to the sum total of the allocated LUN space, meaning the volume needs to be 200% of the combined LUN size in order to be sure you don't run out.
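The arithmetic can be sketched for the volume in the question. This is a simplified worst-case model that follows the description above (reserve = fractional reserve percentage × total allocated LUN size); it is not NetApp's exact accounting, and `required_volume_gb` is a hypothetical name used only for illustration:

```python
def required_volume_gb(lun_sizes_gb, fractional_reserve_pct):
    """Worst-case volume size: LUN space plus the overwrite reserve.

    Simplified model (an assumption): the reserve is taken as a
    percentage of the total allocated LUN size.
    """
    lun_total = sum(lun_sizes_gb)
    overwrite_reserve = lun_total * fractional_reserve_pct / 100
    return lun_total + overwrite_reserve


luns_gb = [95, 50]   # the two LUNs from the question
volume_gb = 190      # the volume size from the question

for pct in (100, 50, 0):
    needed = required_volume_gb(luns_gb, pct)
    verdict = "fits" if needed <= volume_gb else "does not fit"
    print(f"fractional reserve {pct:3d}%: need {needed:6.1f} GB "
          f"-> {verdict} in {volume_gb} GB")
```

Under this model, 145 GB of LUNs at a fractional reserve of 100 would want 290 GB, which is well beyond the 190 GB volume once a snapshot exists.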

Lowering fractional reserve is a risk, but not a big one if you don't routinely rewrite all the data in your LUNs. Just bear in mind that running out of space will mean write failures to the LUNs, and that's generally bad news. You could also adjust the volume guarantee options - file guarantee combined with fractional reserve 100 means your volume will need to be 200% of the size of the LUNs within it (plus some more if you have multiple snaps, although it won't be +100% per snap).
