Unix administration - uptime question


Author uptime question
Paul Moge

2004-01-23, 5:03 pm

hello,
i'm sure that the answer to this question may vary depending on the
admin but i'm attempting to get some sort of consensus for my boss.
being relatively new to unix but quickly becoming a convert, i am
interested to read what you think.

What would you do to configure unix machines that need to have 99.9+%
uptime to ensure that the uptime requirements are met? What do you feel
are the primary contributing factors in achieving a high level of system
uptime?

thanks for any and all responses.

Paul M.

Michael Vilain

2004-01-23, 5:03 pm

In article <dBIbb.2533$iT4.1833193@news1.news.adelphia.net>,
Paul Moge <paulm73@adelphia.net> wrote:
quote:

> hello,
> i'm sure that the answer to this question may vary depending on the
> admin but i'm attempting to get some sort of consensus for my boss.
> being relatively new to unix but quickly becoming a convert, i am
> interested to read what you think.
>
> What would you do to configure unix machines that need to have 99.9+%
> uptime to ensure that the uptime requirements are met? What do you feel
> are the primary contributing factors in achieving a high level of system
> uptime?



What you describe is called "three nines", which isn't all that drastic
in terms of uptime. It allows about 43 minutes of downtime per 30-day
month.
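
To show where the 43-minute figure comes from, here is a rough sketch of
the arithmetic. The function name and the 30-day month / 365-day year are
my own simplifying assumptions, not anything standard.

```python
# Downtime allowed by an availability target.
# Assumes a 30-day month and a 365-day year (simplifications).

MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_percent, period_days):
    """Return the minutes of downtime allowed over period_days."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * period_days * MINUTES_PER_DAY


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        per_month = downtime_budget_minutes(target, 30)
        per_year = downtime_budget_minutes(target, 365)
        print(f"{target}%: {per_month:.1f} min/month, "
              f"{per_year / 60:.1f} hours/year")
```

For 99.9% this works out to 43.2 minutes per 30-day month. Whether
scheduled maintenance counts against that budget is a policy question the
arithmetic can't answer.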

But first, find out what your boss means by "downtime". Does he mean
"the computer is down" or "the application is down" or "customers can't
use our system" or something else entirely? Depending on the scope of
his expectations, you might have to adjust your analysis. And is he
willing to subdivide "downtime" into "unscheduled" and "scheduled", where
you can plan to shut systems down to do SW and HW modifications and
upgrades?

You can configure UNIX systems with "Fault Tolerance" or "High
Availability" by having two of every component that might fail:

On the system level:

- mirrored system and data disks
- dual disk controllers and cabling
- dual power supplies

On the network level:

- dual NICs for access to 2 different subnets _and_ routers

On the infrastructure level:

- dual UPSes, with each power supply plugged into a different UPS
- secure, temperature-controlled, and monitored environment (i.e. a
computer room)

On the application level:

- automatic application failover to another system
- on-line backups that don't shut down your application

On the operations level:

- all changes to the system and applications are reviewed and planned
unless there's an unscheduled outage
- procedures for doing day-to-day operations are documented so that even
the manager could open the manual and perform them
- personnel are cross-trained in all aspects of the system operations so
that the 'oops' factor is reduced
- a clear escalation process is in place in case of an outage
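
As a back-of-the-envelope illustration of why doubling up components
helps, here is a small sketch. It assumes failures are independent and
failover is instantaneous, which real systems only approximate, and the
per-component availability figures are made up for the example.

```python
# Rough availability math for redundant components.
# Assumptions: independent failures, instantaneous failover.
# The component figures below are hypothetical, not measured values.


def redundant(availability, copies=2):
    """Availability of N copies where any one surviving copy is enough."""
    return 1.0 - (1.0 - availability) ** copies


def series(*availabilities):
    """Availability of a chain where every part must be up."""
    product = 1.0
    for a in availabilities:
        product *= a
    return product


if __name__ == "__main__":
    disk, power_supply, nic = 0.99, 0.995, 0.998  # hypothetical

    single = series(disk, power_supply, nic)
    doubled = series(redundant(disk), redundant(power_supply),
                     redundant(nic))

    print(f"single components: {single:.5f}")
    print(f"doubled components: {doubled:.5f}")
```

Note that this only covers the hardware items. The operations items
(change review, documentation, cross-training) don't reduce to a formula,
but human error is often what eats the downtime budget.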

Such an architecture is expensive. When bean-counters ask for this for
their financial systems and find out the price tag can easily start at
$100K and go to over $1M, they frequently say "Never Mind".

Is this guy really looking for a 7x24x365 operation with full datacenter
or just wanting to know if he needs to reboot the servers regularly to
avoid the Blue-Screen of Death?

--
DeeDee, don't press that button! DeeDee! NO! Dee...


