Solaris ZFS volumes: workload not hitting L2ARC

solaris, solaris-11, zfs, zfs-l2arc

I've set up a Solaris Express 11 machine with some reasonably fast HDDs behind a RAID controller, created a zpool on the device with compression enabled, and added a mirrored log and two cache devices to it. The datasets are exposed as FC targets for use with ESX, and I've populated the pool with some data to play around with. The L2ARC partially filled up (and for some reason has stopped filling), but I hardly see any use of it. zpool iostat -v shows that not much has been read from the cache so far:

tank           222G  1.96T    189     84   994K  1.95M
  c7t0d0s0     222G  1.96T    189     82   994K  1.91M
  mirror      49.5M  5.51G      0      2      0  33.2K
    c8t2d0p1      -      -      0      2      0  33.3K
    c8t3d0p1      -      -      0      2      0  33.3K
cache             -      -      -      -      -      -
  c11d0p2     23.5G  60.4G      2      1  33.7K   113K
  c10d0p2     23.4G  60.4G      2      1  34.2K   113K

and the L2ARC-enabled arcstat.pl script shows 100% misses for L2ARC for the current workload:

./arcstat.pl -f read,hits,miss,hit%,l2read,l2hits,l2miss,l2hit%,arcsz,l2size 5
read  hits  miss  hit%  l2read  l2hits  l2miss  l2hit%  arcsz  l2size
[...]
 243   107   136    44     136       0     136       0   886M     39G
 282   144   137    51     137       0     137       0   886M     39G
 454   239   214    52     214       0     214       0   889M     39G
[...]

I first suspected that the recordsize might be too large, so that L2ARC treats everything as a streaming load, but the zpool contains nothing but ZFS volumes (I created them as "sparse" using zfs create -V 500G -s <datasetname>), which do not even have a recordsize parameter to change.

I have also found many mentions of L2ARC needing 200 bytes of RAM per record for its metadata, but have so far been unable to find out what L2ARC would consider a "record" for a volume dataset: a single 512-byte sector? Could it be suffering from a RAM shortage for metadata, and so far just be filled with junk that is never read again?

Edit: Adding 8 GB of RAM on top of the 2 GB already installed worked out nicely. The additional RAM is happily used even in a 32-bit installation, and the L2ARC has now grown and is getting hits:

    time  read  hit%  l2hit%  arcsz  l2size
21:43:38   340    97      13   6.4G     95G
21:43:48   185    97      18   6.4G     95G
21:43:58   655    91       2   6.4G     95G
21:44:08   432    98      16   6.4G     95G
21:44:18   778    92       9   6.4G     95G
21:44:28   910    99      19   6.4G     95G
21:44:38  4.6K    99      18   6.4G     95G

Thanks to ewwhite.

Best Answer

You should have more RAM in the system. Pointers to L2ARC need to be kept in RAM (ARC), so I think you'd need around 4GB or 6GB of RAM to better utilize the ~60GB of L2ARC you have available.
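To put a rough number on that, here is a minimal sketch of the arithmetic. It assumes the roughly 200 bytes of header per cached block mentioned in the question (the real figure depends on the ZFS version) and one L2ARC entry per volume block. The block sizes are illustrative; check the volblocksize property of your volumes for the actual value. The function name is made up for this example.

```python
# Rough estimate of the ARC memory consumed by L2ARC headers.
# Assumptions (not confirmed in this thread): about 200 bytes of header per
# cached block, and one L2ARC entry per volume block.

GIB = 1024 ** 3
KIB = 1024
HEADER_BYTES = 200  # per-entry figure quoted in the question; version dependent


def l2arc_header_bytes(l2arc_bytes, block_bytes, header_bytes=HEADER_BYTES):
    """Return the approximate RAM needed to index a completely full L2ARC."""
    entries = l2arc_bytes // block_bytes
    return entries * header_bytes


# Illustrative block sizes for a 60 GiB cache; not measured values.
for block_kib in (8, 64, 128):
    ram = l2arc_header_bytes(60 * GIB, block_kib * KIB)
    print(f"{block_kib:>3} KiB blocks: {ram / GIB:.2f} GiB of headers")
```

Under these assumptions, a full 60 GiB cache of 8 KiB blocks would need about 1.46 GiB just for headers, which leaves very little room for anything else on a 2 GB machine.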

This is from a recent thread on the ZFS list:

http://opensolaris.org/jive/thread.jspa?threadID=131296

L2ARC is "secondary" ARC. ZFS attempts to cache all reads in the ARC 
(Adaptive Replacement Cache) - should it find that it doesn't have enough space 
in the ARC (which is RAM-resident), it will evict some data over to the 
L2ARC (which in turn will simply dump the least-recently-used data when 
it runs out of space). Remember, however, every time something gets 
written to the L2ARC, a little bit of space is taken up in the ARC 
itself (a pointer to the L2ARC entry needs to be kept in ARC). So, it's 
not possible to have a giant L2ARC and tiny ARC. As a rule of thumb, I 
try not to have my L2ARC exceed my main RAM by more than 10-15x (with 
really bigMem machines, I'm a bit looser and allow 20-25x or so, but 
still...). So, if you are thinking of getting a 160GB SSD, it would be 
wise to go for at minimum 8GB of RAM. Once again, the amount of ARC 
space reserved for a L2ARC entry is fixed, and independent of the actual 
block size stored in L2ARC. The gist of this is that tiny files eat up 
a disproportionate amount of system resources for their size (smaller 
size = larger % overhead vis-a-vis large files).
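The rule of thumb from the quoted post can be checked against the numbers in the question. This is only a sketch: it takes the l2size column from the arcstat output as the L2ARC size and the installed RAM (2 GB before, 10 GB after the upgrade) as "main RAM", and it uses the upper end of the 10-15x range as the limit. The function names are made up for this example.

```python
# Check an L2ARC size against the "no more than 10-15x main RAM" rule of thumb.
# The sizes below are read off the arcstat output in the question; treating
# l2size as the L2ARC size is an assumption of this sketch.


def l2arc_to_ram_ratio(l2arc_gb, ram_gb):
    """Return how many times larger the L2ARC is than main RAM."""
    return l2arc_gb / ram_gb


def within_rule_of_thumb(l2arc_gb, ram_gb, max_ratio=15):
    """True if the L2ARC is at most max_ratio times the size of main RAM."""
    return l2arc_to_ram_ratio(l2arc_gb, ram_gb) <= max_ratio


for label, l2arc_gb, ram_gb in (("2 GB RAM", 39, 2), ("10 GB RAM", 95, 10)):
    ratio = l2arc_to_ram_ratio(l2arc_gb, ram_gb)
    verdict = "ok" if within_rule_of_thumb(l2arc_gb, ram_gb) else "too large"
    print(f"{label}: L2ARC is {ratio:.1f}x RAM ({verdict})")
```

By this measure the original 2 GB configuration was well outside the suggested range, while the upgraded machine falls inside it.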