Home > Archive > Data Storage > December 2005 > (S)ATA robustness
| I'm wondering how much difference there exists between SATA and, say
FCAL drives when it comes to robustness.
I'm talking drives that will be more than 20% busy with seeks, always
spinning, in news servers. I've seen a too-high rate on FCAL disks used
as JBOD - about half the drives developing bad blocks, the other half
giving total failures - that I'm not sure going to mirrored SATA drives
is such a fine idea.
Would they wear out in a year, or would the reliability be reasonable?
Mirroring would catch most failures, but will the drive vendor complain
or stop support because of the failure rate?
Thomas
| |
| _firstname_@lr_dot_los-gatos_dot_ca.us 2005-12-10, 2:46 am |
| In article <439a0228$0$17099$6c4159fb@news.tweaknews.nl>,
Zak <jute@zak.invalid> wrote:
>I'm wondering how much difference there exists between SATA and, say
>FCAL drives when it comes to robustness.
>
>I'm talking drives that will be more than 20% busy with seeks, always
>spinning, in news servers. I've seen a too-high rate on FCAL disks used
>as JBOD - about half the drives developing bad blocks, the other half
>giving total failures - that I'm not sure going to mirrored SATA drives
>is such a fine idea.
>
>Would they wear out in a year, or would the reliability be reasonable?
>Mirroring would catch most failures, but will the drive vendor complain
>or stop support because of the failure rate?
Read the archives of this newsgroup. Get a copy of the paper "More
than an interface - SCSI vs. ATA" by Anderson, Dykes and Riedel.
You are mixing up a heck of a lot of (admittedly related) issues here
[my opinions / guesses in square brackets]:
- Is a SATA disk inherently more or less reliable than a FCAL disk?
[not in principle, but in practice most are less reliable]
- Is a consumer-grade disk inherently less reliable than an
enterprise-grade disk? [yes]
- Are SATA disks always consumer grade disks? [mostly] (FCAL disks are
always enterprise grade disks.)
- Does the lifespan of a disk depend on the duty cycle? [yes for
consumer grade, no for enterprise grade]
- For what duty cycle are disks rated? [10% or less for consumer
grade, 100% for enterprise grade]
- Is 1/2 of the FCAL drives in a JBOD failing within some time period
normal? [not unless you wait a heck of a long time, or some common
factor is killing the disks, for example high heat or vibration]
- Will mirroring help reliability? [yes, and it will also help with
read speed, but you have to be ready to very quickly remove failed
drives, and should set the disk array up with hot spares and
automatic re-mirroring of failed drives to hot spares]
- Would mirrored SATA drives have higher reliability than non-mirrored
FCAL drives? [depends on too many factors for a simple answer]
- Will the drive vendor complain if the drives fail? [depends on what
you bought. On one extreme, if you buy an enterprise-class disk
array with a maintenance contract from a first class vendor like
HDS/IBM/EMC, they will replace all failed disks. On the other
extreme, if you buy a box of 20 consumer-grade disks from a cheap
mail-order distributor, run them into the ground by exceeding their
duty cycle and maybe exposing them to heat and vibration due to a
really cheap crappy JBOD, and then return lots of them under
warranty, you will get pushback. Remember that the SMART data on
the disk records heat and vibration, so the vendor can tell that you
abused the drives. There is a lot of room between these two
extremes.]
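
The point about mirroring, hot spares and fast re-mirroring can be made concrete with a rough calculation. The following is a minimal sketch, not a reliability model: it assumes the two drives in a mirror fail independently and at a constant rate, which does not hold when a common factor such as heat or vibration is killing the disks. The annual failure rates are the data sheet figures quoted later in this thread (0.9% for the NL35, 0.62% for the Cheetah); the 24-hour rebuild window and the function name are hypothetical.

```python
HOURS_PER_YEAR = 8760.0  # assumes drives are powered on all year


def mirror_loss_probability(afr, rebuild_hours):
    """Approximate yearly chance of losing both drives of a two-way mirror.

    afr           -- annual failure rate of one drive, as a fraction
    rebuild_hours -- assumed time from first failure until the mirror
                     is whole again (hypothetical input)
    """
    # Chance that the surviving drive also fails inside the rebuild window.
    p_second = afr * rebuild_hours / HOURS_PER_YEAR
    # Either of the two drives can be the first one to fail.
    return 2.0 * afr * p_second


# Mirrored pair at 0.9% AFR with an assumed 24-hour rebuild,
# compared with a single unmirrored drive at 0.62% AFR.
mirrored = mirror_loss_probability(0.009, 24.0)
single = 0.0062
print(f"mirrored pair: {mirrored:.2e} per year, single drive: {single:.2e} per year")
```

Under these assumptions the mirrored pair comes out around 4.4e-7 per year, far below the single drive. The result scales linearly with the rebuild window, which is why quick replacement and hot spares matter, and it says nothing about correlated failures.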
Observe that I did not give you advice on what the best solution for
you is. That depends on too many factors; most importantly, the depth
of your pocketbook versus your tolerance for pain and suffering, which
is to say whether availability or low cost is more important to you.
Good luck!
--
The address in the header is invalid for obvious reasons. Please
reconstruct the address from the information below (look for _).
Ralph Becker-Szendy _firstname_@lr_dot_los-gatos_dot_ca.us
| |
|
| _firstname_@lr_dot_los-gatos_dot_ca.us wrote:
> Observe that I did not give you advice on what the best solution for
> you is. That depends on too many factors; most importantly, the depth
> of your pocketbook versus your tolerance for pain and suffering, which
> is to say whether availability or low cost is more important to you.
Thanks - I have some new insights now. Something else that occurred to
me: look at manufacturer data.
Seagate publishes an IDC document: the MTBF of a desktop ATA drive is half
that of an enterprise FCAL drive, and they give MaXline and their own NL35
series as filling the gap.
Cheetah 10K:
10^-15 unrecoverable read errors per bit read
0.62% annual failure rate
NL35:
10^-14 unrecoverable read errors per bit read
MTBF specified for 'nearline workload': 1M hours, which is 0.9% AFR.
Barracuda:
10^-14 unrecoverable read errors per bit read
no MTBF or even seek time specified. Yuk, they still call it a data sheet.
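
The conversion between MTBF and annual failure rate used in the list above can be sketched as follows. This assumes a constant failure rate (exponential model) and a drive that is powered on all 8760 hours of the year, which fits the "always spinning" case; the function names are illustrative only.

```python
import math

HOURS_PER_YEAR = 8760.0  # assumes the drive never powers down


def afr_from_mtbf(mtbf_hours):
    """Annual failure rate (fraction) implied by an MTBF in hours."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def mtbf_from_afr(afr):
    """MTBF in hours implied by an annual failure rate (fraction)."""
    return -HOURS_PER_YEAR / math.log(1.0 - afr)


# NL35: 1M hours MTBF; Cheetah 10K: 0.62% AFR.
print(f"NL35 AFR: {afr_from_mtbf(1_000_000) * 100:.2f}%")
print(f"Cheetah MTBF: {mtbf_from_afr(0.0062):,.0f} hours")
```

With these assumptions, 1M hours works out to just under 0.9% per year, and 0.62% per year corresponds to roughly 1.4M hours.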
Now, my workload will read about 2 megabytes/sec per drive - for FCAL
I'd be aiming at 3 MB/s. SATA will give me a bad block once every 30
years on a single drive when I just use the published rate.
This is about the rate that I see. Thus, SATA without RAID to catch more
errors is not usable - and neither is FCAL.
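
The arithmetic behind an estimate like this can be sketched as below. It takes the data sheet figure at face value as one unrecoverable error per 10^14 bits (SATA) or per 10^15 bits (FCAL), assumes decimal megabytes and continuous reading at the stated rate, and treats every error as one bad block; all of these are simplifying assumptions, and the function name is illustrative.

```python
SECONDS_PER_DAY = 86400.0


def days_between_read_errors(mb_per_second, bits_per_error):
    """Expected days between unrecoverable read errors on one drive.

    mb_per_second  -- sustained read rate in decimal megabytes per second
    bits_per_error -- bits read per unrecoverable error, e.g. 1e14
    """
    bits_per_second = mb_per_second * 1e6 * 8
    return bits_per_error / bits_per_second / SECONDS_PER_DAY


print(f"SATA at 2 MB/s: {days_between_read_errors(2.0, 1e14):.0f} days")
print(f"FCAL at 3 MB/s: {days_between_read_errors(3.0, 1e15):.0f} days")
```

With these inputs the interval comes out at about 72 days for the SATA case and about 482 days (1.3 years) for the FCAL case, so the figure quoted in the post rests on different inputs than the ones assumed here. Since data sheets state the rate as an upper bound, these intervals are worst-case values.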
But it is drives breaking that worries me.
Thomas
| |
| Thor Lancelot Simon 2005-12-10, 5:48 pm |
| In article <439a849b$0$26397$6c4159fb@news.tweaknews.nl>,
Zak <jute@zak.invalid> wrote:
>
>Now, my workload will read about 2 megabytes/sec per drive - for FCAL
>I'd be aiming at 3 MB/s. SATA will give me a bad block once every 30
>years on a single drive when I just use the published rate.
I can't imagine where you get the idea that there is "[a] published rate"
of bad block errors applicable to all SATA disk drives.
You should, at least, look at the data, stated design goals, and warranty
conditions for the WD Raptors; they are explicitly claimed to be
"enterprise" drives intended to replace SCSI and FC drives at the high end.
--
Thor Lancelot Simon tls@rek.tjls.com
"The inconsistency is startling, though admittedly, if consistency is to be
abandoned or transcended, there is no problem." - Noam Chomsky
| |
|
| Thor Lancelot Simon wrote:
> I can't imagine where you get the idea that there is "[a] published rate"
> of bad block errors applicable to all SATA disk drives.
The data sheet. If the data sheet tells me there is less than 1
uncorrectable read error per 10^15 BITS read, that maps to a certain
number of bad blocks occurring. For (S)ATA the rate is 10 times as high,
at least in the data sheet.
Thomas
| |
| Thor Lancelot Simon 2005-12-11, 8:46 pm |
| In article <439beefa$0$27058$6c4159fb@news.tweaknews.nl>,
Zak <jute@zak.invalid> wrote:
>Thor Lancelot Simon wrote:
>
>
>The data sheet. If the data sheet tells me there is less than 1
>uncorrectable read error per 10^15 BITS read, that maps to a certain
>number of bad blocks occurring. For (S)ATA the rate is 10 times as high,
>at least in the data sheet.
"The" data sheet? The one data sheet, just one, that happens to have
the single bad block rate that just happens to be the same for every
model not just of SATA but of ATA drives ever manufactured on it?
Don't be silly. You can't extrapolate from a single data sheet that
gives a number for one kind -- or a few data sheets that give numbers
for a few kinds -- of ATA or SATA drives -- to some kind of general
number that's magically relevant for _all_ SATA drives, precisely
because there is no _causal_ connection (running either way) between
the host interface and the mechanical engineering of any disk drive.
Reliability numbers for _some_ SATA drives -- for example, the WD
Raptors I gave as examples in the part of the message you snipped
off -- are comparable to those for "enterprise SCSI" drives. Look
at them yourself and see. Of course, this is not the case for *all*
SATA drives, which only goes to point out how your analysis was too
simplistic to prove the result you claimed.
--
Thor Lancelot Simon tls@rek.tjls.com
"The inconsistency is startling, though admittedly, if consistency is to be
abandoned or transcended, there is no problem." - Noam Chomsky
| |
| Curious George 2005-12-15, 8:46 pm |
| On Sun, 11 Dec 2005 10:21:37 +0100, Zak <jute@zak.invalid> wrote:
>Thor Lancelot Simon wrote:
>
>
>The data sheet. If the data sheet tells me there is less than 1
>uncorrectable read error per 10^15 BITS read, that maps to a certain
>number of bad blocks occurring.
An uncorrectable read or write error is not the same as a bad block.
Errors happen all the time on good media - the trick is they need to be
caught & dealt with at a high rate of success. So they're talking errors
from good media, not failure expectations of media, AFAIK.
>For (S)ATA the rate is 10 times as high,
>at least in the data sheet.
That should tell you something.