
Re: zfs update



-----Original Message-----
From: Scott Duensing <scott@jaegertech.com>
Reply-to: silug-discuss@silug.org
To: silug-discuss <silug-discuss@silug.org>
Subject: Re: zfs update
Date: Sat, 19 Apr 2014 22:22:08 -0500

It took you a week to move 2.3T of data?  That sounds very wrong.  I have a similar setup here (10 x 4TB in RAIDZ2) and migrated my 9TB+ of data in just a couple days across GigE.

Scott,

I attribute at least part of the slowness of the first pass to the 4096:512 sector emulation, which the ashift=12 parameter should have fixed for the return pass. Something -- possibly inadequate cache memory -- ran the load factor up over 8 at one point on the return pass. The pod has only 8GB of RAM and the smallest/slowest AMD 6000 series CPU. Another possibility is that the 1:5 SATA port expanders are inefficient.  By comparison, the main server with a 2TB ZFS array has a proper server motherboard with dual quad-core 2000 series AMD CPUs and 32GB of ECC registered RAM.
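
For reference, ashift can only be set when a pool is created, not changed afterward, so it has to go on the zpool create line. The general shape is something like this (pool name and device list below are just placeholders, not my actual layout):

  # check what an existing pool is using
  zdb -C tank | grep ashift

  # force 4K sectors when creating the new pool (ashift=12 -> 2^12 = 4096 bytes)
  zpool create -o ashift=12 tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf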

--Doc

It's going to take hours/days to do anything when top says your load factor is 35+. top is also reporting a spl_kmem_cache process consuming 99.5% of CPU cycles. I'm still digging into this, but a Google search tells me there's a fundamental conflict between the ways Solaris and Linux manage memory for ZFS filesystems. From what I've read so far, the spl_kmem_cache process consumes extremely large amounts of memory, which winds up thrashing to and from swap. I'm reading about strange new creatures called ARC and SPL. The gist seems to be that (a) terabyte-scale storage requires relatively large amounts of system RAM even when not doing deduplication, and (b) until the fundamental differences between Solaris and Linux memory management are resolved, I will have to throttle the size of SPL caching to values that don't result in swap thrashing. Stay tuned.
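
In the meantime, the knob that keeps coming up is zfs_arc_max, which caps the ARC and in turn keeps the SPL slab caches from ballooning. Something along these lines, apparently -- the 2GB figure is just a starting guess for an 8GB box, not a tested value:

  # /etc/modprobe.d/zfs.conf -- persistent cap, applied when the zfs module loads
  options zfs zfs_arc_max=2147483648

  # or adjust on the fly for testing (value in bytes; 2147483648 = 2GB)
  echo 2147483648 > /sys/module/zfs/parameters/zfs_arc_max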

--Doc