Data Storage - maximum number of physical disks in RAID 10


kaioptera@gmail.com

2006-09-26, 1:14 pm

Hi,

I was wondering if anyone knows why RAID configurations top out at a
certain number of physical disks. For instance, every controller I've
looked at only supports up to 16 physical drives in a RAID 10 array, or
up to 128 in a RAID 50 array, and I don't see any reason why you can't
have unlimited numbers of disks in any configuration. Is this a
practical, theoretical, or business limit?

Thanks.

Bill Todd

2006-09-26, 1:14 pm

kaioptera@gmail.com wrote:
> Hi,
>
> I was wondering if anyone knows why RAID configurations top out at a
> certain number of physical disks. For instance, every controller I've
> looked at only supports up to 16 physical drives in a RAID 10 array, or
> up to 128 in a RAID 50 array, and I don't see any reason why you can't
> have unlimited numbers of disks in any configuration. Is this a
> practical, theoretical, or business limit?


You *can* have as many disks as you want in any RAID configuration.
Whether you *should* want to have an arbitrarily large number is a
different question - e.g., given the size of individual disks these
days, keeping an individual RAID-5 array under 10 disks total is
prudent, in order to avoid a significant probability that if one of the
disks fails you'll encounter a bad sector on one of the survivors,
losing at least that much data (though scrubbing the disks in the array
can reduce this probability a lot).
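A rough sketch of the arithmetic behind this point. It assumes independent read errors and an unrecoverable-read-error rate of 1 per 1e14 bits, which is an illustrative figure and not something stated in the thread; the function name is hypothetical as well.

```python
import math

def p_bad_sector_during_rebuild(n_disks, disk_bytes, ure_per_bit=1e-14):
    """Chance of at least one unreadable sector on the survivors while
    rebuilding one failed disk in an n-disk RAID-5 set.

    ure_per_bit is an assumed error rate; check the drive's data sheet.
    """
    # A rebuild has to read every bit of every surviving disk.
    bits_read = (n_disks - 1) * disk_bytes * 8
    # 1 - (1 - r)^bits, computed in a numerically stable way.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

if __name__ == "__main__":
    for n in (4, 10, 16, 32):
        p = p_bad_sector_during_rebuild(n, 500e9)
        print(f"{n:3d} disks of 500 GB: {p:.1%}")
```

Under these assumptions the probability grows with every disk added to the set, which is the reason for keeping individual RAID-5 sets small. Scrubbing lowers the effective error rate and is not modeled here.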

How many disks you should handle with a single controller is also a
different question, involving things like controller bandwidth limits
and port count.

- bill

Faeandar

2006-09-27, 1:16 am

On Tue, 26 Sep 2006 14:11:07 -0400, Bill Todd <billtodd@metrocast.net>
wrote:

>kaioptera@gmail.com wrote:
>
>You *can* have as many disks as you want in any RAID configuration.
>Whether you *should* want to have an arbitrarily large number is a
>different question - e.g., given the size of individual disks these
>days, keeping an individual RAID-5 array under 10 disks total is
>prudent, in order to avoid a significant probability that if one of the
>disks fails you'll encounter a bad sector on one of the survivors,
>losing at least that much data (though scrubbing the disks in the array
>can reduce this probability a lot).
>
>How many disks you should handle with a single controller is also a
>different question, involving things like controller bandwidth limits
>and port count.
>
>- bill



A surprising number of arrays these days still use FCAL internally, so
there is a physical limit in those cases.

~F

Bill Todd

2006-09-27, 1:16 am

Faeandar wrote:
> On Tue, 26 Sep 2006 14:11:07 -0400, Bill Todd <billtodd@metrocast.net>
> wrote:
>
>
>
> A surprising number of arrays these days still use FCAL internally, so
> there is a physical limit in those cases.


Just another variation on the port limit that I mentioned above (if the
array used multiple FCALs internally, then the limit would increase
commensurately). But the *RAID configuration* does not have any
intrinsic limit, just various specific *implementations*.

- bill

kaioptera@gmail.com

2006-09-27, 1:14 pm

Thanks for getting back to me, this is very interesting. I can see how
this makes sense for a RAID-5, since adding more disks increases the
probability of a dual failure and data loss, but in a RAID-10, adding
additional pairs has no effect on this probability, right? If the
probability of a given drive failing in the amount of time required for
a rebuild is P, and you lose a drive in a RAID-10 and immediately start
rebuilding from a hot spare, then the probability that you lose the
unmirrored drive is just P, regardless of how many drives are in the
array, right? The probability that you lose *any* drive will increase
with the size of the array, but as long as you maintain a certain ratio
of hot spares to active drives (I have one per enclosure), your risk of
multiple failures in different mirrored pairs exceeding your supply of
hot spares and thus increasing the amount of time the array is degraded
is also independent of the total number of drives, right?

Thanks,
Seth

Bill Todd wrote:
> Faeandar wrote:
>
> Just another variation on the port limit that I mentioned above (if the
> array used multiple FCALs internally, then the limit would increase
> commensurately). But the *RAID configuration* does not have any
> intrinsic limit, just various specific *implementations*.
>
> - bill


Bill Todd

2006-09-27, 1:14 pm

kaioptera@gmail.com wrote:
> Thanks for getting back to me, this is very interesting. I can see how
> this makes sense for a RAID-5, since adding more disks increases the
> probability of a dual failure and data loss, but in a RAID-10, adding
> additional pairs has no effect on this probability, right? If the
> probability of a given drive failing in the amount of time required for
> a rebuild is P, and you lose a drive in a RAID-10 and immediately start
> rebuilding from a hot spare, then the probability that you lose the
> unmirrored drive is just P, regardless of how many drives are in the
> array, right? The probability that you lose *any* drive will increase
> with the size of the array, but as long as you maintain a certain ratio
> of hot spares to active drives (I have one per enclosure), your risk of
> multiple failures in different mirrored pairs exceeding your supply of
> hot spares and thus increasing the amount of time the array is degraded
> is also independent of the total number of drives, right?


No: for any given per-pair repair policy (e.g., the presence of hot
spares), the probability of data loss *somewhere* goes up linearly with
the number of pairs, since the probability of loss from any one pair
remains constant and is roughly multiplied by the number of pairs to
find the probability of loss in the entire system.
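To put numbers on this, here is a minimal sketch that assumes each mirrored pair fails independently with the same probability p_pair over some period. The value 1e-4 is made up for illustration, and the function name is hypothetical.

```python
def p_loss_somewhere(n_pairs, p_pair):
    """Probability that at least one of n_pairs mirrored pairs loses data,
    assuming independent pairs with identical loss probability p_pair."""
    return 1.0 - (1.0 - p_pair) ** n_pairs

if __name__ == "__main__":
    p_pair = 1e-4  # assumed per-pair loss probability, illustration only
    for n in (1, 8, 64, 512):
        exact = p_loss_somewhere(n, p_pair)
        print(f"{n:4d} pairs: exact {exact:.6f}, n*p approximation {n * p_pair:.6f}")
```

While n * p_pair stays small, the exact value and the linear approximation are nearly the same, which is the "roughly multiplied by the number of pairs" behaviour described above.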

Eventually, you need to add additional redundancy if you want to keep
the probability of *any* data loss acceptably low. One approach is just
to increase the redundancy of the (relatively very small) metadata to
guard against structural problems, and accept that if you lose some user
data this will be sufficiently rare that restoring just that data from
backups will be acceptable.

- bill

Ed Wilts

2006-09-29, 1:23 pm

kaioptera@gmail.com wrote:
> I was wondering if anyone knows why RAID configurations top out at a
> certain number of physical disks. For instance, every controller I've
> looked at only supports up to 16 physical drives in a RAID 10 array, or
> up to 128 in a RAID 50 array, and I don't see any reason why you can't
> have unlimited numbers of disks in any configuration. Is this a
> practical, theoretical, or business limit?


There are 2 issues and they're different.

1. RAID 5
If you lose a disk, you have to read every block of every remaining
member, compute an xor, and write the new block to the new drive. 128
drives is well beyond a practical limit since the rebuild time on a
128-member 500GB spindlesize raidset would probably be measured in
weeks. Additionally, every write operation requires a write to all of
the volumes so that's going to really suck if your business requirement
is writing to these volumes.

2. RAID 10 - i.e. RAID 1 + RAID 0

The practical limit here is usually the number of spindles that the
vendor has *tested*. Many vendors will not support any number of
drives they haven't tested. I remember the VMS limit for software
mirrorsets for a long time was around 100 sets since that configuration
had to be tested and they didn't have a test environment that big
(especially considering that each shadow set could have 3 members).

Vendors these days have enough issues testing configurations that
customers might actually use.

.../Ed

Bill Todd

2006-09-29, 7:28 pm

Ed Wilts wrote:
> kaioptera@gmail.com wrote:
>
> There are 2 issues and they're different.
>
> 1. RAID 5
> If you lose a disk, you have to read every block of every remaining
> member, compute an xor, and write the new block to the new drive. 128
> drives is well beyond a practical limit since the rebuild time on a
> 128-member 500GB spindlesize raidset would probably be measured in
> weeks.


Because each surviving drive can be read in parallel, the rebuild time
for a failed drive in a RAID-5 set is largely independent of the number
of drives in the set (well, memory bandwidth limits could throttle
things back a bit and the in-memory XORing takes a bit longer the more
blocks contribute to the XOR for each block output, but that should be
down in the noise and the latter could - if anyone thought it worth the
trouble to implement this way - proceed in parallel on multiple cores).
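A back-of-the-envelope model of this argument. The 60 MB/s per-disk rate and the controller bandwidth are assumed figures, not measurements, and XOR cost is ignored; the function name is hypothetical.

```python
def rebuild_hours(n_disks, disk_bytes, disk_bytes_per_s,
                  controller_bytes_per_s=float("inf")):
    """Estimated RAID-5 rebuild time when all survivors are read in parallel.

    n - 1 read streams plus 1 write stream share the controller, so each
    stream gets at most controller_bytes_per_s / n_disks.
    """
    per_stream = min(disk_bytes_per_s, controller_bytes_per_s / n_disks)
    return disk_bytes / per_stream / 3600.0

if __name__ == "__main__":
    for n in (5, 16, 128):
        unlimited = rebuild_hours(n, 500e9, 60e6)
        limited = rebuild_hours(n, 500e9, 60e6, controller_bytes_per_s=2e9)
        print(f"{n:3d} disks: {unlimited:.1f} h unthrottled, {limited:.1f} h at 2 GB/s")
```

With no shared bottleneck the estimate is the same for 5 or 128 disks, a few hours and not weeks. It only grows once the assumed controller bandwidth has to be split across more streams than it can feed at full disk speed.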

> Additionally, every write operation requires a write to all of
> the volumes so that's going to really suck if your business requirement
> is writing to these volumes.


'Fraid not: a RAID-5 write operation typically requires reading the old
data on one drive and the parity on the parity drive for the stripe that
are about to be over-written, performing the XORs, and writing the new
data and the new parity back to the two drives - i.e., only 4 disk
accesses regardless of the number of drives in the array. Large updates
that hit more than one data drive perform a read/write pair for each
additional drive hit, up to the point where it's less expensive to read
the drives that *weren't* hit and write the result back as a full stripe.
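The parity update and the access counts described here can be sketched as follows. This is an illustration of the general read-modify-write technique, not any particular controller's implementation, and all names are hypothetical.

```python
from functools import reduce

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def stripe_parity(data_blocks):
    """Parity of a stripe: XOR of all of its data blocks."""
    return reduce(xor_blocks, data_blocks)

def rmw_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity for a small write, from the two blocks read beforehand."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)

def write_accesses(n_disks: int, blocks_touched: int) -> int:
    """Disk accesses to update blocks_touched data blocks in one stripe."""
    n_data = n_disks - 1
    # Read-modify-write: read and write each touched block plus the parity.
    read_modify_write = 2 * (blocks_touched + 1)
    # Reconstruct-write: read the untouched data blocks, then write the
    # touched blocks and the freshly computed parity.
    reconstruct_write = (n_data - blocks_touched) + (blocks_touched + 1)
    return min(read_modify_write, reconstruct_write)

if __name__ == "__main__":
    for n in (5, 10, 128):
        print(n, "disks, single-block write:", write_accesses(n, 1), "accesses")
```

In this model a single-block write costs 4 accesses whether the set has 5 disks or 128, and a write covering the whole stripe costs one write per disk with no reads at all. On a 3-disk set the reconstruct path is already cheaper, at 3 accesses.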

- bill

robertwessel2@yahoo.com

2006-09-29, 7:28 pm


Bill Todd wrote:
>
> 'Fraid not: a RAID-5 write operation typically requires reading the old
> data on one drive and the parity on the parity drive for the stripe that
> are about to be over-written, performing the XORs, and writing the new
> data and the new parity back to the two drives - i.e., only 4 disk
> accesses regardless of the number of drives in the array. Large updates
> that hit more than one data drive perform a read/write pair for each
> additional drive hit, up to the point where it's less expensive to read
> the drives that *weren't* hit and write the result back as a full stripe.



I think the OP meant write performance with a drive dead, not in a
fully operational state.

Of course he's wrong there too. Any write to a stripe where neither
the data block nor parity block is on the failed drive is unaffected
(and performs exactly as before), and if the parity block is on the
dead drive, writes will actually be faster. Only if the data block is
on the dead drive is it necessary to read all the other disks to do the
update of the parity block needed to complete the write.
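The three degraded-mode cases can be written out as a small access-count model. It is a simplified sketch for a single-block write with one failed drive, and the function and parameter names are hypothetical.

```python
def degraded_write_accesses(n_disks, failed, data_disk, parity_disk):
    """Disk accesses for a single-block RAID-5 write with one failed drive.

    data_disk and parity_disk are the drives holding the target data block
    and the parity block of the affected stripe.
    """
    if data_disk == parity_disk:
        raise ValueError("data and parity of a stripe live on different drives")
    if failed not in (data_disk, parity_disk):
        # Dead drive not involved: the usual 2 reads + 2 writes.
        return 4
    if failed == parity_disk:
        # No parity left to maintain for this stripe: just write the data.
        return 1
    # Data block is on the dead drive: read the other n - 2 data blocks,
    # XOR them with the new data, and write the resulting parity.
    return (n_disks - 2) + 1

if __name__ == "__main__":
    print(degraded_write_accesses(8, failed=0, data_disk=1, parity_disk=2))
    print(degraded_write_accesses(8, failed=2, data_disk=1, parity_disk=2))
    print(degraded_write_accesses(8, failed=1, data_disk=1, parity_disk=2))
```

Only the last case scales with the number of drives in the set; the other two cost the same as, or less than, a write on a healthy array.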

powermt

2006-10-05, 7:14 am

In a normal RAID 5 set, where the data and parity are split across the
drives, we will see slow performance.
As for the point about writes being faster when the block is data and
not parity: no, that is not a normal feature of RAID 5. The array has to
do something called parity shedding to speed up the process, but that is
risky, since your RAID 5 boils down to a mere RAID 0 with no data
protection.

robertwessel2@yahoo.com wrote:
> Bill Todd wrote:
>
>
> I think the OP meant write performance with a drive dead, not in a
> fully operational state.
>
> Of course he's wrong there too. Any write to a stripe where neither
> the data block nor parity block is on the failed drive is unaffected
> (and performs exactly as before), and if the parity block is on the
> dead drive, writes will actually be faster. Only if the data block is
> on the dead drive is it necessary to read all the other disks to do the
> update of the parity block needed to complete the write.


robertwessel2@yahoo.com

2006-10-05, 7:14 pm

Please don't top-post. Corrected below.

> robertwessel2@yahoo.com wrote:
>
>powermt wrote:
> In a normal RAID 5 set, where the data and parity are split across the
> drives, we will see slow performance.
> As for the point about writes being faster when the block is data and
> not parity: no, that is not a normal feature of RAID 5. The array has to
> do something called parity shedding to speed up the process, but that is
> risky, since your RAID 5 boils down to a mere RAID 0 with no data
> protection.



We were discussing RAID arrays with a failed drive. For writes to
blocks where neither the data block nor the parity block for the stripe
is on the failed disk, the RAID-5 write performance is as usual, and,
of course, has the 2-read+2-write overhead it always does. The other
two cases I described apply to operations where the dead drive *is*
nominally involved.

Disabling parity of a RAID array is something else entirely.


Copyright 2003 - 2008 webservertalk.com