01-10-05 10:45 PM
Faeandar <mr_castalot@yahoo.com> writes:
...
> Rebuild times are definitely a concern at 250 GB. SATA is not as bad,
> but ATA drives that size can take 24 hours to rebuild; that's a long time
> to be vulnerable. Depending on vendor they may do some slick things
> like copy all the viable data from the failing drive first then
> reconstruct what's missing. This saves loads of time but is fairly
> uncommon still.
>
> Another problem with large drives like this is spindle performance;
> you want a lot of spindles but the size of the volume is overkill (in
> a lot of cases, maybe not yours).
>
> My personal preference would be to have multiple RAID 5 sets and use
> an LVM (Logical Volume Manager) to make them all seem like one volume.
> That way you get all the benefits of spindle performance, a larger
> volume size (though not as large as if it were one RAID 5 set), and
> extra protection against multi-drive failures.

I've been working with Linux software RAID on 400 GB ATA drives. The
Linux md driver supports up to 27 disks in one array. My co-worker
found out that the rebuild times using the Linux 2.6 kernel are higher
than they need to be.

He was building a RAID 5 on nine EtherDrive storage blades and found
that the per-blade I/O rate was only 1200 KB/s. At that rate, the RAID
initialization was going to take days. We know that 1200 KB/s is lower
than it should be, so he checked the 2.6 kernel sources and thought he
saw a problem with the way it determines whether devices are idle or
not. He did this ...

  echo 100000 > /proc/sys/dev/raid/speed_limit_max
  echo 100000 > /proc/sys/dev/raid/speed_limit_min

... and the per-blade throughput went up to about 5300 KB/s, meaning
that the array could fully initialize in about 18 hours.
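
As a rough check on those figures, the time to initialize one member
is just its size divided by the per-blade rate. The sketch below
assumes each blade holds a 400 GB drive (the capacity of these
particular blades is not stated above) and that KB means 1000 bytes;
resync_hours is a made-up helper name, not part of any tool.

```shell
# Estimate hours needed to sync one RAID member.
# $1 = member size in GB (assumed decimal, 10^9 bytes)
# $2 = per-member rate in KB/s (assumed 1000 bytes per KB)
resync_hours() {
    awk -v gb="$1" -v kbps="$2" \
        'BEGIN { printf "%.1f\n", (gb * 1000000) / kbps / 3600 }'
}

resync_hours 400 1200   # rate before raising the limits
resync_hours 400 5300   # rate after raising the limits
```

Under those assumptions the first call works out to almost four days
and the second to roughly 21 hours, which is in the same ballpark as
the 18 hours quoted above.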

I've been meaning to look at the md code myself, but I wonder if
anybody else has noticed this when initializing software RAIDs. It
might be that there's something different about the aoe block driver.
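
For anybody who wants to compare numbers on their own array, here is
a minimal sketch that prints the current limits and raises both to
one value, in the same order as the two echo lines above. The
function names and the RAID_SYSCTL_DIR variable are made up for
illustration; only the two /proc files come from the post. Writing
to them needs root.

```shell
# Directory holding the md speed limit files; overridable for testing.
RAID_SYSCTL_DIR="${RAID_SYSCTL_DIR:-/proc/sys/dev/raid}"

# Print the current minimum and maximum resync speed limits (KB/s).
show_limits() {
    for f in speed_limit_min speed_limit_max; do
        printf '%s=%s\n' "$f" "$(cat "$RAID_SYSCTL_DIR/$f")"
    done
}

# Set both limits to the value given in $1 (KB/s), max first.
raise_limits() {
    echo "$1" > "$RAID_SYSCTL_DIR/speed_limit_max"
    echo "$1" > "$RAID_SYSCTL_DIR/speed_limit_min"
}
```

Running show_limits before and after raise_limits 100000, and
watching /proc/mdstat while the array syncs, should show whether the
per-device rate changes the way it did here.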
--
Ed L Cashin <ecashin@coraid.com>