Let's get something going here...

    Let's get something going here...  
Dave Hinz




 
07-29-07 06:19 AM

How about a thread where the grizzled old-farts talk about the things
they wish they'd known when they started in the field.  I'll go first:

1. Users seldom describe the problem accurately.
2. If you don't have good backups, you don't have a supportable system.

So, what say we talk about what makes a system supportable.
Hardware/OS/infrastructure/etc ?


Dave










    Re: Let's get something going here...  
Robert Melson




 
07-29-07 06:19 AM

In article <5h2bq3F3ikav1U2@mid.individual.net>,
Dave Hinz <DaveHinz@gmail.com> writes:
> How about a thread where the grizzled old-farts talk about the things
> they wish they'd known when they started in the field.  I'll go first:
>
> 1. Users seldom describe the problem accurately.
> 2. If you don't have good backups, you don't have a supportable system.
>
> So, what say we talk about what makes a system supportable.
> Hardware/OS/infrastructure/etc ?
>
>
> Dave
>
>
Hmmmm.

Seems to me there's no such thing as an UNsupportable system,
given a reasonably good admin crew with overlapping
experiences.  Yeah, there are systems that are purposely
obscure, that are idiosyncratic, but most all *ix systems ARE
supportable under my criterion above.

So the question should, I think, be "what makes a good sysadmin"
(apart from a thick hide and an ability to stay awake and alert
for > 36 hours)?  Or, if you prefer, a good admin team.

Bob Melson

--
Robert G. Melson | Rio Grande MicroSolutions | El Paso, Texas
-----
"People unfit for freedom---who cannot do much with it---are
hungry for power." ---Eric Hoffer









    Re: Let's get something going here...  
Mark Rafn




 
07-30-07 12:18 AM

Dave Hinz  <DaveHinz@gmail.com> wrote:
>1. Users seldom describe the problem accurately.

True.  It's our job to tease actual requirements from them.

>2. If you don't have good backups, you don't have a supportable system.

I'd add: If you haven't done an actual restore from backup, you don't have
good backups.  I've only had a few complete-loss events in my career, and in
about half of them we discovered critical missing pieces.

>So, what say we talk about what makes a system supportable.
>Hardware/OS/infrastructure/etc ?

People and procedures make a system supportable.  The technical bits are
secondary.
--
Mark Rafn    dagon@dagon.net    <http://www.dagon.net/>








    Re: Let's get something going here...  
Robert Melson




 
07-30-07 12:18 AM

In article <s5Uqi.11872$rR.7928@newsread2.news.pas.earthlink.net>,
melsonr@aragorn.rgmhome.net (Robert Melson) writes:
> In article <5h2bq3F3ikav1U2@mid.individual.net>,
> 	Dave Hinz <DaveHinz@gmail.com> writes: 
> Hmmmm.
>
> Seems to me there's no such thing as an UNsupportable system,
> given a reasonably good admin crew with overlapping
> experiences.  Yeah, there are systems that are purposely
> obscure, that are idiosyncratic, but most all *ix systems ARE
> supportable under my criterion above.
>
> So the question should, I think, be "what makes a good sysadmin"
> (apart from a thick hide and an ability to stay awake and alert
> for > 36 hours)?  Or, if you prefer, a good admin team.
>
> Bob Melson
>
I hate like hell to follow up my follow-ups, but the thought
struck me that there ARE unsupportable systems - those whose
hardware has reached EOL and can't be replaced, systems from
defunct manufacturers (who remembers CSI or Prime, for
example?) that have ticked along for mumble years but are now
showing signs of their age and can't be upgraded.

But, apart from that specific sub-class of systems, there just
ain't no such animal as an unsupportable system, given the
right team and procedures.

Bob Melson

--
Robert G. Melson | Rio Grande MicroSolutions | El Paso, Texas
-----
"People unfit for freedom---who cannot do much with it---are
hungry for power." ---Eric Hoffer









    Re: Let's get something going here...  
Doug Freyburger




 
07-31-07 12:19 AM

Dave Hinz <DaveH...@gmail.com> wrote:
>
> How about a thread where the grizzled old-farts talk about the things
> they wish they'd known when they started in the field.  I'll go first:

The cartoon-like thing of pulling the dentures and gumming
"In my day, sonny ...".  Yup, always fun stuff.

> 1. Users seldom describe the problem accurately.

One of the most rewarding challenges of the job is that
users make very random sounding requests.  The job of an
engineer isn't to do what users ask but to satisfy user
requirements.  That means figuring out from the pattern of
what the user is asking for what it is they are trying to
achieve.  It starts out frustrating but by the time the real
solution is delivered the user tends to be educated and
pleased.

> 2. If you don't have good backups, you don't have a supportable system.

If you don't keep up with technology the system descends
into an unsupportable state.

> So, what say we talk about what makes a system supportable.
> Hardware/OS/infrastructure/etc ?

An old joke is that you can tell the pioneers (early adopters)
because they are the ones face down in the ditch with an arrow
sticking out of their corpse.  The flip side of that coin is you can
tell the patch-and-upgrade avoiders because when you feel a
bump they are the ones who just got killed when you lapped
them without seeing them coming.

Technology moves.  Fail to have an upgrade cycle and eventually
you're screwed.  This tends to happen with software faster
than with hardware, but it happens with either.

I like to say that when you develop version 1 of a program you
learn what you wanted it to do, so version 2 is incompatible.
When you write version 2 you learn how you wanted to do it,
so version 3 is compatible.  But when you write version 3 you
learn that the march of technology has rendered the original
idea obsolete.  Consider that Solaris is now in version 10,
Oracle is releasing version 11, and GUIs in UNIX run on X11.
On the one hand it's hilarious how many versions have
happened.  On the other hand the original ideas had to have
been extremely good to survive that much evolution.









    Re: Let's get something going here...  
Dave Hinz




 
07-31-07 06:22 AM

On Mon, 30 Jul 2007 15:11:53 -0700, Doug Freyburger <dfreybur@yahoo.com> wrote:
> Dave Hinz <DaveH...@gmail.com> wrote:
 
>
> One of the most rewarding challenges of the job is that
> users make very random sounding requests.  The job of an
> engineer isn't to do what users ask but to satisfy user
> requirements.  That means figuring out from the pattern of
> what the user is asking for what it is they are trying to
> achieve.  It starts out frustrating but by the time the real
> solution is delivered the user tends to be educated and
> pleased.

Right.  Often the question they ask isn't the question they should be
asking.  The trick is to figure out what they really need, because it's
often not what they think they need.  The old "Tell me what you're
trying to do, not how you think I should help you do it" problem.
 
> If you don't keep up with technology the system descends
> into an unsupportable state.

Good point but, even with modern machines, if I don't have console
access to a server, it's hard to support.  If the vendor has EOSL'd the
hardware or OS, it's hard to support even if I have install media and
spare parts in my basement.  And what was acceptable 10 years ago (not
having mirrored system drives, for instance) in a prod system just
doesn't cut it in some environments today.
 
> An old joke is that you can tell the pioneers (early adopters)
> because they are the ones face down in the ditch with an arrow
> sticking out of their corpse.  The flip side of that coin is you can
> tell the patch-and-upgrade avoiders because when you feel a
> bump they are the ones who just got killed when you lapped
> them without seeing them coming.

But, how do you get the application owners to give you a window to
patch?  If you have ideas on that I'd love to hear it.

> Oracle is releasing version 11, and GUIs in UNIX run on X11.
> On the one hand it's hilarious how many versions have
> happened.  On the other hand the original ideas had to have
> been extremely good to survive that much evolution.

A lot to be said for continuous improvement.  It's always interesting
when I get onto a Solaris 5.5.1 system for instance, and have to play
the "OK what don't I have to work with yet?" game.









    Re: Let's get something going here...  
Doug Freyburger




 
07-31-07 06:20 PM

Dave Hinz <DaveH...@gmail.com> wrote:
> Doug Freyburger <dfrey...@yahoo.com> wrote: 
> 
> 
>
> But, how do you get the application owners to give you a window to
> patch?  If you have ideas on that I'd love to hear it.

Build up a track record with the users so they take you
seriously.  Tell of the place that failed to upgrade and ended
up screwed.  Get plenty of sympathy first, then ease up
the management chain once there's some support at the
bottom.

It's like the case of random seeming questions.  The usual
reason against a window doesn't actually involve their
current work cycle.  It's because since they haven't kept up
they have fallen behind.  "Don't take the system down for an
upgrade".  "Really?  What are you really trying to do?"
"Work through some bugs in the app so there's less down
time."  "Really?  Take a look at this table that shows what
bugs are handled.  Any of those ones you're having problems
with?"  "Hmmm, look at those ones."  "Yup, two weekends
from now a good time or is three weekends from now better?
Let's take a look at your calendar and your vacation
schedule.  What?  Rolling off vacation?  So two weekends
from now would save you lost vacation days in addition to
smoothing your work?  Cool stuff.  Let's do that".

And still one of my clients has Solaris 2.6, Oracle 7.x,
no web access to download patches, you name it.  When
it hits the fan they are seriously screwed.  But at least they
know it and understand the billable hours it will eventually
take to pull them out of the pit they gradually climbed into ...









    Re: Let's get something going here...  
Mark Bartelt




 
08-30-07 06:20 PM

[ Dave Hinz ]

||  If you don't have good backups, you don't have a supportable system.

[ Mark Rafn ]

||  I'd add: If you haven't done an actual restore from backup, you
||  don't have good backups.

Indeed.  I know of one faculty member (at another university)
who lost many months worth of work, because although backups
of his workstation had been getting done on a regular basis
(by a "grad student doubling as sysadmin", a far-too-common
situation at many universities), nobody had ever checked to
confirm that the backups being produced were useful.  Then
his workstation's hard drive bit the dust.  And then it was
discovered that all the backups were totally useless.

I've occasionally been asked "how can I confirm whether our
sysadmins are doing their jobs correctly?"; one thing I've
often suggested is that the person who posed the question
should "accidentally delete" (rename, or move to another
directory) some important file, and ask their sysadmin to
restore it from the most recent backup.
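
For anyone who wants to go a step beyond the single-file test, here is a minimal sketch of how a test restore could be checked mechanically.  It assumes the original tree is still available and the backup has been restored into a scratch directory; the function names tree_digests and compare_restore are made up for illustration and come from no backup tool.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as f:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def compare_restore(original_root, restored_root):
    """Return (missing, extra, changed) relative paths for a test restore."""
    original = tree_digests(original_root)
    restored = tree_digests(restored_root)
    # Files the backup failed to bring back.
    missing = sorted(set(original) - set(restored))
    # Files present only in the restored copy.
    extra = sorted(set(restored) - set(original))
    # Files present in both trees whose contents differ.
    changed = sorted(
        p for p in original.keys() & restored.keys()
        if original[p] != restored[p]
    )
    return missing, extra, changed
```

Three empty lists mean the restored tree matches the original byte for byte.  Files that changed legitimately after the backup ran will show up as "changed", so this is best run against a quiet directory or a snapshot.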

Personally, anything that I consider really important is
kept in at least three different places (at least one of
which is a significant distance from where I work).  That
way, nothing short of a nuclear strike (well, possibly a
really big earthquake) will wipe out all the copies.  Of
course, this approach isn't much help for people who have
multi-terabyte datasets ...

---------------

Mark Bartelt
Lead system administrator
Center for Advanced Computing Research
California Institute of Technology
Pasadena, California  91125

626 395 2522
626 584 5917 fax
626 628 3994 e-fax

mark@cacr.caltech.edu

http://www.cacr.caltech.edu/~mark








    Re: Let's get something going here...  
Moe Trin




 
09-01-07 12:17 AM

On Thu, 30 Aug 2007, in the Usenet newsgroup comp.unix.admin, in article
<slrnfddnkr.90c.mark@atlantis.cacr.caltech.edu>, Mark Bartelt wrote:

>[ Dave Hinz ]
>
>||  If you don't have good backups, you don't have a supportable system.
>
>[ Mark Rafn ]
>
>||  I'd add: If you haven't done an actual restore from backup, you
>||  don't have good backups.
>
>Indeed.  I know of one faculty member (at another university)
>who lost many months worth of work, because although backups
>of his workstation had been getting done on a regular basis
>(by a "grad student doubling as sysadmin", a far-too-common
>situation at many universities), nobody had ever checked to
>confirm that the backups being produced were useful.

I think that everyone who is responsible for backups, and has been in
the business for more than a year has heard dozens if not hundreds of
horror stories like this - including the bit about the tapes being
stored in a magnetically unfriendly place, the obsolete hardware, and
the tape drive that shredded the tapes due to an EOT sensor failure.

>Personally, anything that I consider really important is
>kept in at least three different places (at least one of
>which is a significant distance from where I work).  That
>way, nothing short of a nuclear strike (well, possibly a
>really big earthquake) will wipe out all the copies.

At the official level, a company should have arrangements that keep the
backups far enough away to come through a survivable catastrophe (you
probably don't have to worry about restoring anything if a thousand ton
meteor hits the 'Hollywood' sign), yet near enough to be accessible in a
timely manner, and secure from vermin, crackers, and thieves.

At the _personal_ level, do you have friends/relatives/what-ever in a
distant location?  At home, my nightly backups go to a backup server
in the house, where they are encrypted and compressed - that's the
'local' backup. At Oh-Dark-Thirty, my sister who lives on the other
side of the continent (only 2200 miles/3500 KM away) and I do an
'rsync' of the backup servers to each other.

>Of course, this approach isn't much help for people who have
>multi-terabyte datasets ...

That might depend on the dataset, and how dynamic it is. The old idea
of doing an 'incremental' backup along with an occasional 'clean-slate'
full backup might work.
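
As a rough sketch of the 'incremental' half of that idea, assuming modification time is a good enough signal and that the time of the last backup is recorded somewhere: the helper below (files_changed_since is a hypothetical name, not part of any backup tool) lists only the files touched since then.  It does not notice deleted files, which is one reason the occasional full backup is still needed.

```python
import os
from pathlib import Path


def files_changed_since(root, last_backup_time):
    """Return files under root modified after last_backup_time (epoch seconds)."""
    root = Path(root)
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # lstat() so a dangling symlink does not raise an error.
            if path.lstat().st_mtime > last_backup_time:
                changed.append(path.relative_to(root).as_posix())
    return sorted(changed)
```

Passing 0 as last_backup_time selects every file, which amounts to the 'clean-slate' full backup.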

Old guy





All times are GMT.