Data Storage - General musings and/or recommendations on number of global spares to keep?

Dan Stromberg

2005-08-24, 5:57 pm


I've been working on a Sun StorEdge 3511 with dual RAID controllers and
three expansion boxes.

We originally purchased the equipment expecting to get 16 terabytes of
usable space.

Now that it's "all set up", we're really seeing more like 14 or 15
terabytes, depending on how you do the calculation.

The Sun channel partner we're working with is advising that we go from our
current 4 global spares, down to either 1 or 2 global spares, using the
additional 3 or 2 disks for data.

The number of disks in the system totals 48, including data and parity and
global spares.

Please be sure to use a fixed-pitch font when viewing the tables found below.

What we have right now is:

global spares: 0,16,32,48

Raidset  Disks used                     Data:parity ratio
0        1,2,3,4,5,6,7,8,9,10           9:1
1        11,17,18,19,20,21,22,23,24,25  9:1
2        26,27,33,34,35,36,37,38,39,40  9:1
3        41,42,43,49,50,51,52,53,54,55  9:1
4        56,57,58,59                    3:1


And the vendor is suggesting that we move to something like:

global spares: 0

Raidset  Disks used                     Data:parity ratio
0        1,2,3,4,5,6,7,8,9,10           9:1
1        11,17,18,19,20,21,22,23,24,25  9:1
2        26,27,33,34,35,36,37,38,39,40  9:1
3        41,42,43,49,50,51,52,53,54,55  9:1
4        56,57,58,59,16,32,48           6:1

...or...:

global spares: 0,16

Raidset  Disks used                     Data:parity ratio
0        1,2,3,4,5,6,7,8,9,10           9:1
1        11,17,18,19,20,21,22,23,24,25  9:1
2        26,27,33,34,35,36,37,38,39,40  9:1
3        41,42,43,49,50,51,52,53,54,55  9:1
4        56,57,58,59,32,48              5:1
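
For readers who want to check the arithmetic behind the three layouts, here is a minimal sketch that counts data disks per layout. It assumes every raidset is RAID 5 with exactly one parity disk. The 400 GB per-disk size is a hypothetical placeholder, since the thread never states the real disk size, and the names LAYOUTS, data_disks, usable_tb and usable_tib are made up for illustration.

```python
# Hypothetical per-disk size in decimal GB; the thread does not give the real one.
DISK_GB = 400

# name -> (number of global spares, disks in each raidset), from the tables above.
LAYOUTS = {
    "current": (4, [10, 10, 10, 10, 4]),
    "vendor, 1 spare": (1, [10, 10, 10, 10, 7]),
    "vendor, 2 spares": (2, [10, 10, 10, 10, 6]),
}


def data_disks(raidsets):
    """Assume RAID 5: each raidset gives up one disk's worth of space to parity."""
    return sum(n - 1 for n in raidsets)


def usable_tb(raidsets, disk_gb=DISK_GB):
    """Usable space in decimal terabytes (1 TB = 1000 GB)."""
    return data_disks(raidsets) * disk_gb / 1000


def usable_tib(raidsets, disk_gb=DISK_GB):
    """Usable space in binary tebibytes (1 TiB = 2**40 bytes)."""
    return data_disks(raidsets) * disk_gb * 10**9 / 2**40


if __name__ == "__main__":
    for name, (spares, raidsets) in LAYOUTS.items():
        total = spares + sum(raidsets)
        print(f"{name}: {total} disks, {data_disks(raidsets)} data disks, "
              f"{usable_tb(raidsets):.1f} TB / {usable_tib(raidsets):.1f} TiB")
```

Under these assumptions the current layout has 39 data disks and the two vendor layouts have 42 and 41. With the assumed 400 GB disks, 39 data disks come to 15.6 decimal TB, or about 14.2 binary TiB. That is one possible reading of "14 or 15 terabytes, depending on how you do the calculation", but it is only a guess about the hardware.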


Does anyone have any comments on:

1) The sanity of these 10-disk RAID 5 sets?

2) The degree of loss of reliability incurred by moving 3 disks from
global spare to data?

3) The degree of loss of reliability incurred by moving 2 disks from
global spare to data?


To answer these questions, you probably need to know how the storage is to
be used. This single, large QFS filesystem will be used by a variety of
researchers and students from around the University of California, Irvine,
but it was purchased primarily by the Earth System Science part of the
Physical Sciences department, which will mainly be storing many files of
roughly 100 megabytes each, comprising time series from climatology
simulations.

They don't feel that the storage has to be blazing fast, and 100% uptime
isn't paramount; however, they very much do not want to lose their data.

The filesystem will not be backed up - we simply don't have anything large
enough to back it up -to-, so if some part of the storage solution
goes kerflooey, we're totally... er... out of luck, and they'll probably
be looking at me (the primary sysadmin on the storage configuration),
wondering why their data is gone.

Thanks!

Faeandar

2005-08-24, 8:47 pm

On Wed, 24 Aug 2005 21:21:02 GMT, Dan Stromberg
<strombrg@dcs.nac.uci.edu> wrote:

>They don't feel that the storage has to be blazing fast, and 100% uptime
>isn't paramount; however, they very much do not want to lose their data.
>
>The filesystem will not be backed up - we simply don't have anything large
>enough to back it up -to-, so if some part of the storage solution
>goes kerflooey, we're totally... er... out of luck, and they'll probably
>be looking at me (the primary sysadmin on the storage configuration),
>wondering why their data is gone.
>
>Thanks!



I actually thought you were a little paranoid on your layout until I
got to the last section. Now I think you're not paranoid enough.

If availability is paramount, and you're not backing it up somewhere,
then I think RAID 5 alone is a resume trigger.
Regrettably, any solution that would let me sleep at night would
require a lot more capacity than you currently have. But RAID 1+0
would be my recommendation, with 4 global spares.
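
To put a rough number on "a lot more capacity", the sketch below compares data disks under RAID 1+0 with 4 global spares against the current RAID 5 layout. It assumes plain two-way mirroring and the 48-disk total from the original post; the function names are made up for illustration.

```python
TOTAL_DISKS = 48  # from the original post


def raid10_data_disks(total_disks, spares):
    """Assume two-way mirrors: half of the non-spare disks hold unique data."""
    return (total_disks - spares) // 2


def raid5_data_disks(raidset_sizes):
    """Assume one parity disk's worth of space per RAID 5 set."""
    return sum(n - 1 for n in raidset_sizes)


if __name__ == "__main__":
    mirrored = raid10_data_disks(TOTAL_DISKS, spares=4)
    current = raid5_data_disks([10, 10, 10, 10, 4])
    print(f"RAID 1+0: {mirrored} data disks, current RAID 5: {current} data disks")
```

On those assumptions, RAID 1+0 would leave 22 data disks where the current RAID 5 layout has 39, so the same hardware would hold a little over half as much.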

Now, having said that, and putting aside the no-backup policy
(/shiver), I think you could easily get away with 2 global spares for
the number of drives you have.

For 168 drives I have between 3 and 6 global spares. So for 48 drives
I would personally be comfortable with 2 spares, *if* I had backups.
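
One way to sanity-check that figure is to scale the 3-to-6-per-168 ratio down to 48 drives. This is only a sketch: proportional scaling with rounding up is an assumption, not necessarily how the number above was arrived at, and scaled_spares is a made-up helper name.

```python
import math


def scaled_spares(ref_spares, ref_drives, drives):
    """Scale a spare count proportionally, rounding up to whole disks."""
    return math.ceil(ref_spares * drives / ref_drives)


if __name__ == "__main__":
    low = scaled_spares(3, 168, 48)
    high = scaled_spares(6, 168, 48)
    print(f"48 drives at the same ratio: {low} to {high} spares")
```

Scaled this way, 48 drives come out at 1 to 2 spares, which is consistent with 2 being at the comfortable end of that range.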

Honestly, the no-backup policy is freaky, especially since they
actually want the data to stick around.

Question: If you're running QFS, why not slap SAM-FS on it as well and
use the tape store as a pseudo-backup plan? It could be proclaimed
additional capacity for the users while acting as a safety net for
them too.

~F

GG

2005-08-24, 8:47 pm

> The filesystem will not be backed up - we simply don't have anything large
> enough to back it up -to-, so if some part of the storage solution
> goes kerflooey, we're totally... er... out of luck, and they'll probably
> be looking at me (the primary sysadmin on the storage configuration),
> wondering why their data is gone.


What about when a user deletes data and wants it back?


Dan Stromberg

2005-08-24, 8:47 pm

On Thu, 25 Aug 2005 01:16:48 +0000, GG wrote:

>
> What about when a user deletes data and wants it back?


The official policy (not selected by me, but rather by the Professor
who's paying for my time) is going to be "Your home directory is backed
up, but the research data you put under /data is not unless you make
special arrangements for that yourself".

victor.engle@gmail.com

2005-08-26, 5:48 pm

Dan,

I think I would prepare a good argument for backups and try to
persuade the professor in favor of them. One thing I might change in
your RAID set plan is the 10-disk RAID 5 sets. The reason is that the
rebuild time will be longer than for smaller RAID sets, which makes you
slightly more vulnerable to losing a second disk during the rebuild.
For your number of disks, though, I do believe 2 spares is adequate, but
on the minimum side.

Regards,
Vic Engle
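
The rebuild-window argument above can be illustrated with a very simple failure model. The sketch below assumes disks fail independently at a constant rate of 1/MTBF. The MTBF and rebuild times are hypothetical placeholders, not measurements from a StorEdge 3511, and p_second_failure is a made-up name. Real disks in one enclosure often fail in correlated ways, so this is more likely to understate the risk than overstate it.

```python
import math


def p_second_failure(disks_in_set, rebuild_hours, mtbf_hours):
    """Chance that at least one surviving disk in a degraded RAID 5 set
    fails before the rebuild finishes, assuming independent failures
    at a constant rate of 1 / mtbf_hours per disk."""
    surviving = disks_in_set - 1
    return 1.0 - math.exp(-surviving * rebuild_hours / mtbf_hours)


if __name__ == "__main__":
    MTBF_HOURS = 500_000  # hypothetical figure
    # (disks in set, rebuild hours): both rebuild times are hypothetical.
    for disks, hours in [(5, 12), (10, 24)]:
        p = p_second_failure(disks, hours, MTBF_HOURS)
        print(f"{disks}-disk set, {hours} h rebuild: {p:.5%}")
```

In this model a larger set is worse on two counts: more surviving disks are exposed during the rebuild, and, if the rebuild takes longer as suggested above, they are exposed for more hours.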
