OpenSolaris Server – Server Hangs When Writing Large Files After Upgrading Zpool

opensolaris raidz zfs

Yesterday I added new hard drives (four as a raidz1 and one as a hot spare) to an OpenSolaris server. Since extending the zpool, the server hangs when writing large files, but not when reading them (large files = more than 1 GiB).

The zpool configuration before the upgrade looked like this:

state: ONLINE

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
 raidz1 ONLINE 0 0 0
  c9t0d0 ONLINE 0 0 0
  c9t1d0 ONLINE 0 0 0
  c9t2d0 ONLINE 0 0 0
  c9t3d0 ONLINE 0 0 0

After the upgrade the zpool looks like this:

state: ONLINE

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
 raidz1 ONLINE 0 0 0
  c9t0d0 ONLINE 0 0 0
  c9t1d0 ONLINE 0 0 0
  c9t2d0 ONLINE 0 0 0
  c9t3d0 ONLINE 0 0 0
 raidz1 ONLINE 0 0 0
  c9t4d0 ONLINE 0 0 0
  c9t5d0 ONLINE 0 0 0
  c9t6d0 ONLINE 0 0 0
  c9t7d0 ONLINE 0 0 0
 spares
  c9t8d0 AVAIL
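
For readers who want to reproduce this layout: a pool that looks like the listing above would typically result from two `zpool add` calls, one for the new raidz1 vdev and one for the spare. The sketch below is a reconstruction from the status output, not necessarily the exact commands that were run; the `run` and `extend_pool` helpers and the `DRY_RUN` variable are hypothetical names used only for illustration. By default it only prints the commands.

```shell
# Hypothetical reconstruction of the pool extension.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

extend_pool() {
  # Add a second four-disk raidz1 top-level vdev to the pool "storage".
  run zpool add storage raidz1 c9t4d0 c9t5d0 c9t6d0 c9t7d0
  # Register the fifth new disk as a hot spare.
  run zpool add storage spare c9t8d0
}

extend_pool
```

Note that a top-level vdev added this way cannot simply be taken out again on this build, so a dry run is the safer default.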

As you can see, all drives are online, and even the 3Ware 9690SA-4I controller tells me that everything is okay:

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
----------------------------------------------------------------------------- -
u0 SINGLE OK - - - 1862.63 RiW ON
u1 SINGLE OK - - - 1862.63 RiW ON
u2 SINGLE OK - - - 1862.63 RiW ON
u3 SINGLE OK - - - 1862.63 RiW ON
u4 SINGLE OK - - - 1862.63 RiW ON
u5 SINGLE OK - - - 1862.63 RiW ON
u6 SINGLE OK - - - 1862.63 RiW ON
u7 SINGLE OK - - - 1862.63 RiW ON
u8 SINGLE OK - - - 1862.63 RiW ON

VPort Status Unit Size Type Phy Encl-Slot Model
----------------------------------------------------------------------------- -
p8 OK u0 1.82 TB SATA - /c9/e0/slt1 SAMSUNG HD203WI
p9 OK u1 1.82 TB SATA - /c9/e0/slt3 SAMSUNG HD203WI
p10 OK u2 1.82 TB SATA - /c9/e0/slt5 SAMSUNG HD203WI
p11 OK u4 1.82 TB SATA - /c9/e0/slt6 SAMSUNG HD203WI
p12 OK u5 1.82 TB SATA - /c9/e0/slt8 SAMSUNG HD203WI
p13 OK u3 1.82 TB SATA - /c9/e0/slt10 SAMSUNG HD203WI
p14 OK u6 1.82 TB SATA - /c9/e0/slt13 SAMSUNG HD203WI
p15 OK u7 1.82 TB SATA - /c9/e0/slt15 SAMSUNG HD203WI
p16 OK u8 1.82 TB SATA - /c9/e0/slt17 SAMSUNG HD203WI

But when I start writing files to this ZFS file system, the server hangs, sometimes during the write and sometimes just after the whole file has been written. Either way, the server always ends up hanging.
Reading large files (7-8 GiB), on the other hand, is no problem!

Thanks for your answers!

cu

Guido

edit:

fyi: The server runs snv_111b.

edit 2:

scrub: scrub completed after 6h20m with 0 errors on Thu Jul 22 00:33:29 2010

As you can see, there are no file system errors.

Best Answer

This is a bug in the ZFS ARC that is more than three years old and still persists!

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6522017

The ARC will also grow past the memory limits that a hypervisor sets for a VM!
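
A common way to keep the ARC from consuming too much memory is to cap it with the `zfs_arc_max` tunable in `/etc/system` and reboot. The following is only a sketch: the `arc_max_line` helper is a hypothetical name, the 4 GiB value is an arbitrary example rather than a recommendation for this server, and whether a cap avoids the hang described here is an assumption that would need testing.

```shell
# Print an /etc/system line that caps the ARC at the given number of GiB.
# arc_max_line is a hypothetical helper; the tunable itself is zfs:zfs_arc_max.
arc_max_line() {
  gib=$1
  bytes=$((gib * 1024 * 1024 * 1024))
  printf 'set zfs:zfs_arc_max = 0x%x\n' "$bytes"
}

# Example: a 4 GiB cap (illustrative value only).
arc_max_line 4

# On the server itself, the current ARC size and its ceiling can be read with:
#   kstat -p zfs:0:arcstats:size
#   kstat -p zfs:0:arcstats:c_max
```

The printed line would be appended to `/etc/system` by hand; the change takes effect after the next reboot.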