Data Storage - Linux sfw RAID rebuild times (was Re: Max number of drives in RAID5)


Author: Ed L Cashin

2005-01-10, 5:45 pm

Faeandar <mr_castalot@yahoo.com> writes:

...
> Rebuild times are definitely a concern at 250 GB. SATA is not as bad, but
> ATA drives that size can take 24 hours to rebuild, which is a long time
> to be vulnerable. Depending on the vendor, they may do some slick things
> like copy all the viable data from the failing drive first and then
> reconstruct what's missing. This saves loads of time but is still fairly
> uncommon.
>
> Another problem with large drives like this is spindle performance:
> you want a lot of spindles, but the size of the volume is overkill (in
> a lot of cases, maybe not yours).
>
> My personal preference would be to have multiple RAID 5 sets and use
> an LVM (Logical Volume Manager) to make them all seem like one volume.
> That way you get all the benefits of spindle performance, a larger
> volume size (though not as large as if it were one RAID 5 set), and
> extra protection against multi-drive failures.


I've been working with Linux Software RAID on 400 GB ATA drives. The
Linux md driver supports up to 27 disks in one array. My co-worker
found out that the rebuild times using the Linux 2.6 kernel are longer
than they need to be.

He was building a RAID 5 on nine EtherDrive storage blades and found
that the per-blade I/O rate was only 1200 KB/s. At that rate, the RAID
initialization was going to take days. We knew that 1200 KB/s was lower
than it should be, so he checked the 2.6 kernel sources and thought he
saw a problem with the way the md driver determines whether devices are
idle or not. He did this ...

echo 100000 > /proc/sys/dev/raid/speed_limit_max
echo 100000 > /proc/sys/dev/raid/speed_limit_min

... and the per-blade throughput went up to about 5300 KB/s, meaning
that the array could fully initialize in about 18 hours.
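
As a rough check on those figures, initialization time is approximately
the per-disk size divided by the per-disk rate. The sketch below assumes
400 GB disks (the post does not say what size the blades' disks were)
and treats 1 KB as 1000 bytes. The function name rebuild_hours is made
up for illustration.

```shell
# Rough estimate only: hours = per-disk size / per-disk rate.
# $1 = disk size in GB (assumed decimal, 10^9 bytes)
# $2 = per-disk rate in KB/s (assumed 1000 bytes per KB)
rebuild_hours() {
    # GB -> KB is a factor of 1,000,000; seconds -> hours is 3600.
    awk -v size_gb="$1" -v rate_kbs="$2" \
        'BEGIN { printf "%.1f\n", size_gb * 1000000 / rate_kbs / 3600 }'
}

rebuild_hours 400 1200   # rate seen before raising the limits
rebuild_hours 400 5300   # rate seen after raising the limits
```

Under those assumptions the slower rate works out to nearly four days
and the faster one to under a day, which is in the same ballpark as the
"days" and "about 18 hours" quoted above.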

I've been meaning to look at the md code myself, but I wonder if
anybody else has noticed this when initializing software RAIDs. It
might be that there's something different about the aoe block driver.
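
For anyone who wants to compare on their own system, here is a minimal
sketch for reading the current limits and the resync speed that md
reports. It assumes the usual "speed=...K/sec" field on the progress
line in /proc/mdstat. The function names md_limits and md_resync_speed
are hypothetical helpers, not part of any tool.

```shell
# Print the md speed limits (KB/s per device).
# $1 = directory holding the limit files (defaults to the real /proc path).
md_limits() {
    dir=${1:-/proc/sys/dev/raid}
    echo "min=$(cat "$dir/speed_limit_min") max=$(cat "$dir/speed_limit_max")"
}

# Print the resync/rebuild speed in K/sec for each array in progress.
# $1 = mdstat file to read (defaults to /proc/mdstat).
md_resync_speed() {
    mdstat=${1:-/proc/mdstat}
    # Keep only the number from fields like "speed=5300K/sec".
    sed -n 's/.*speed=\([0-9][0-9]*\)K\/sec.*/\1/p' "$mdstat"
}

# Only touch /proc if the md driver is actually loaded on this machine.
if [ -r /proc/sys/dev/raid/speed_limit_min ] && [ -r /proc/mdstat ]; then
    md_limits
    md_resync_speed
fi
```

If the reported speed sits right at speed_limit_min while the disks are
otherwise idle, that would be consistent with the behaviour described
above.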

--
Ed L Cashin <ecashin@coraid.com>