Data Storage - Database disk layout in large SAN environment


Author Database disk layout in large SAN environment
Stunster

2006-03-21, 2:55 am

I am a storage admin for a large SAN environment. I am having an
ongoing debate with the Solaris admins about the best way to lay out a
new Oracle database. We have an HDS 9980V and a high-end Sun box with
multiple HBAs connected to multiple host ports on the 9980V. Veritas VM
provides DMP to all LUNs. The debate is whether we stripe the
database data files across *many* LUNs for *performance*... or use one
LUN for simplicity.

I dispute that "many LUNs" = performance. I believe that in this day
and age, with high-end SANs that stripe, cache and mirror within the
subsystem, there is no longer a need to stripe in the OS. I am trying to
convince the sysadmins that when we start replicating to remote
datacentres, multiple LUNs in consistency groups, most likely with
different characteristics, will be very difficult to manage.

I want to have the following layout:
1 LUN - Oracle code
1 LUN - for each database data
1 LUN - for each database indexes
1 LUN - tmp/sort
1 LUN - Logs

That way I can synchronously replicate (mirror) the logs to my DR site,
asynchronously replicate the indexes and data, and do nothing with the
rest.
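
As a rough illustration of the plan above, here is a minimal Python
sketch that records one replication mode per LUN role. The role names
and the Replication enum are made up for this example; only the modes
themselves (sync for logs, async for data and indexes, none for the
rest) come from the post.

```python
from enum import Enum


class Replication(Enum):
    SYNC = "synchronous"
    ASYNC = "asynchronous"
    NONE = "none"


# Hypothetical role names; the modes follow the plan in the post above.
LAYOUT = {
    "oracle_code": Replication.NONE,
    "data": Replication.ASYNC,
    "indexes": Replication.ASYNC,
    "tmp_sort": Replication.NONE,
    "logs": Replication.SYNC,
}


def roles_by_mode(layout):
    """Group LUN roles by the replication mode assigned to them."""
    groups = {mode: [] for mode in Replication}
    for role, mode in layout.items():
        groups[mode].append(role)
    return groups


if __name__ == "__main__":
    for mode, roles in roles_by_mode(LAYOUT).items():
        print(f"{mode.value:>12}: {', '.join(roles)}")
```

With one LUN per role, each replication policy applies to a whole LUN,
which is the manageability argument being made here.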

Globe Treader

2006-03-21, 2:55 am

Your arguments seem valid to me.

Striping data in the OS means that the host CPU performs the data
chunking, any parity calculation (if a parity RAID level is used), and
other RAID management functions. That same CPU also has to cope with
application processing.

Once the data reaches the storage, it is processed again there. This
double processing comes with overhead and context switching on the
host CPU and can, at least theoretically, slow you down.

Such double RAID, however, can be useful in some situations, such as
the one below.

Suppose you wish to replicate data for DR purposes but do not wish
to interconnect your production and DR SANs. In this case, you can
create identical configurations on both storage arrays and present a
LUN from each to the host. If these two LUNs are mirrored at the host
level, you get full redundancy if one storage location fails, and you
avoid storage-to-storage replication hardware/software costs.

One example of bad design from a host/storage-level mix-up is this
one. I have come across many implementations where the administrator
created multiple LUNs on the same physical disks and then assigned them
to the host. The host had an additional volume management layer that
took those LUNs and RAIDed them further. The drawbacks of such a system
are evident by design: too much processing and no redundancy advantage
at the storage level.

In short, do not go for such double RAID unless you have carefully
considered performance at all levels. In my view, it is not going to
prove any faster.

I would like to hear what others have to say...

Kiran Ghag
http://www.kiranghag.com

Faeandar

2006-03-23, 8:54 pm

On 20 Mar 2006 18:54:03 -0800, "Stunster" <dgholt@gmail.com> wrote:

>The debate is whether we stripe the
>database data files across *many* LUNs for *performance*... or use one
>LUN for simplicity.



It all depends on where the LUNs are. There is no question that, if
set up correctly, using multiple LUNs across ACPs will improve
performance over a single LUN.

HDS does not stripe across multiple LUNs; it concatenates. So if you
present 5 LUNs to a host, it will fill up the first one before it
moves on to the second. This is capacity friendly but not performance
friendly.

Your sysadmins are saying they want to stripe across those 5 LUNs with
their VM, so that a write is potentially striped across all 5 LUNs.
This is almost always a performance improvement if the LUNs are on
different RAID groups. Even if they're on the same ACP pair, you will
still see a noticeable improvement over the single-LUN version.
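
To make the concatenate-versus-stripe difference concrete, here is a
minimal Python sketch of the two address mappings. The LUN size (1000
blocks) and stripe unit (16 blocks) are arbitrary assumptions for
illustration, not HDS or VxVM defaults.

```python
def concat_lun(block, lun_blocks):
    """Concatenation: fill LUN 0 completely, then LUN 1, and so on."""
    return block // lun_blocks


def striped_lun(block, stripe_blocks, num_luns):
    """Striping: rotate across the LUNs one stripe unit at a time."""
    return (block // stripe_blocks) % num_luns


def luns_touched(mapper, start, count):
    """Sorted list of LUN indexes hit by a run of consecutive blocks."""
    return sorted({mapper(b) for b in range(start, start + count)})


if __name__ == "__main__":
    # An 80-block sequential run against 5 LUNs (assumed sizes).
    print(luns_touched(lambda b: concat_lun(b, 1000), 0, 80))
    print(luns_touched(lambda b: striped_lun(b, 16, 5), 0, 80))
```

In this sketch the concatenated layout keeps the whole run on the first
LUN, while the striped layout spreads it over all five.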

~F

Stunster

2006-03-23, 8:54 pm

Yes, I agree: when you create a LUSE volume, it concatenates across the
LDEVs. However, each LDEV is striped across the physical disks. I don't
believe that on our current 9980V we are able to concatenate LDEVs from
different RAID groups (or control units).

Striping, at the volume manager level, LUNs that are already striped
across physical disks in the subsystem leads to double striping, which,
if not done correctly (which is the case most of the time), can cause
worse performance. Not to mention that the caching algorithms will not
be able to detect sequential I/O and work to the best of their ability.
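
One way double striping goes wrong is misalignment between the host
stripe unit and the array's internal chunk. The Python sketch below
counts how many array-side chunks each host stripe unit overlaps on a
single LUN. The 64 KiB sizes and the 32 KiB starting offset are
assumptions chosen for illustration, not values taken from the 9980V.

```python
def array_chunks_spanned(offset_kib, length_kib, array_chunk_kib):
    """Number of array-side stripe chunks that one host I/O overlaps."""
    first = offset_kib // array_chunk_kib
    last = (offset_kib + length_kib - 1) // array_chunk_kib
    return last - first + 1


def host_unit_fanout(host_unit_kib, array_chunk_kib, n_units, start_kib=0):
    """Chunks overlapped by each of the first n host stripe units,
    laid end to end on one LUN starting at start_kib."""
    return [
        array_chunks_spanned(start_kib + i * host_unit_kib,
                             host_unit_kib, array_chunk_kib)
        for i in range(n_units)
    ]


if __name__ == "__main__":
    print(host_unit_fanout(64, 64, 4))                # aligned
    print(host_unit_fanout(64, 64, 4, start_kib=32))  # shifted by 32 KiB
```

In the aligned case each host unit maps to one array chunk; with the
shifted start every host unit straddles two, so the back end does twice
the chunk accesses for the same host I/O.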

I guess there are a few points here:
1 - IF done correctly, striping LUNs from different RAID groups in the
LVM may perform better than a single concatenated LUN built in the
disk subsystem. (That is a big IF; it can potentially work out a lot
worse and become a management nightmare.)
2 - Striping does happen in the subsystem when presenting LDEVs. The
RAID 5 'write penalty' often isn't an issue with cache.
3 - The simplest solution is often the best. Instead of having
volume management done in the disk subsystem, then in the OS, then in
the RDBMS, all volume management should happen in the disk subsystem.
I believe we should present a single logical LUN for each data
structure (type) and use no OS volume management.

carmelomcc

2006-03-23, 8:54 pm

Also remember that striping the data across RAID 5 groups will kill
your read performance. If the RDBMS is like most DBs, you will do 90%
reads and 10% writes. Optimizing for writes will kill the reads.

mf

2006-03-23, 8:54 pm

The read/modify/write penalty happens on writes, not reads. Write
performance with RAID 5 is the problem, not read performance.
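
A back-of-the-envelope Python sketch of that penalty, assuming small
random writes that miss cache: each RAID 5 write costs four disk I/Os
(read old data, read old parity, write new data, write new parity),
against two for a mirrored layout. The 1000 IOPS workload and the
90/10 mix are illustrative figures only.

```python
RAID5_SMALL_WRITE_PENALTY = 4  # read data, read parity, write data, write parity
MIRROR_WRITE_PENALTY = 2       # one write to each side of the mirror


def backend_ios(read_iops, write_iops, write_penalty):
    """Disk-level I/Os per second generated by a host workload,
    ignoring cache hits and full-stripe writes."""
    return read_iops + write_iops * write_penalty


if __name__ == "__main__":
    reads, writes = 900, 100  # assumed 90/10 mix of 1000 host IOPS
    print(backend_ios(reads, writes, RAID5_SMALL_WRITE_PENALTY))
    print(backend_ios(reads, writes, MIRROR_WRITE_PENALTY))
```

Reads cost one back-end I/O in either layout; only the write term
changes, which is why the penalty matters less as the read share grows.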

I agree with the previous opinions on multiple volume management
functions. You are far better off keeping things simple where storage
(block-level) virtualization is concerned. Change management is a
heckuva lot easier when you can put your finger on all the variables -
as opposed to crossing your fingers and hoping there wasn't something
important that you missed. Ask yourself this question: what situation
would you like to inherit if you were asked to step in and maintain or
repair something about this system? I know what I'd prefer.

One other point. VxVM was developed to address shortcomings in storage
subsystems that no longer exist. Essentially, VxVM addressed JBOD and
provided a great way for sysadmins to make JBOD useful. In some ways
VxVM has outlived its usefulness (bring on the howls and catcalls, I
know). But the stuff requires a great deal of skill, and you sort of
have to put on THE MASK to make it do its magic stuff. And that is the
double-edged strength and weakness of VxVM: you can do just about
anything with it, but you really have to know what you are doing. The
average overworked admin who doesn't use it every day can't remember
how it all works when they have 50 other pieces of minutiae in the
stack about all the other weird stuff that's gone on in the last week.

Let the subsystem do the work for you. It's really good at it, and no
committee of desperate admins even comes close.

Ed Wilts

2006-03-23, 8:54 pm

mf wrote:
> One other point. VxVM was developed to address shortcomings in storage
> subsystems that no longer exist. Essentially VxVM addressed JBOD and
> provided a great way for Sys admins to make JBOD useful. In some ways
> VxVM has outlived its usefulness (Bring on the howls and catcalls, I
> know).


Parts of VxVM are no longer required, but parts are still fairly
important. For example, if you have mirrored storage in multiple data
centers, VxVM can prefer your local arrays for reads. We use this,
and it made a significant improvement since our ISLs were no longer
saturated. Secondly, VxVM can be used to transparently move your
storage from one array to another. We used this functionality to
migrate from Symmetrix to EVA, and again when we wanted to localize our
application to specific EVAs: simply do a software mirror to move the
storage.
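
The local-read preference can be pictured with a small Python sketch.
This is a conceptual model only, not VxVM's implementation or command
syntax; the Plex class, the site labels and choose_read_plex are all
hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Plex:
    """One side of a host-level mirror (hypothetical model)."""
    name: str
    site: str
    healthy: bool = True


def choose_read_plex(plexes, local_site):
    """Prefer a healthy mirror copy at the local site; otherwise fall
    back to any healthy copy, which means reading across the ISLs."""
    healthy = [p for p in plexes if p.healthy]
    if not healthy:
        raise RuntimeError("no healthy mirror copy available")
    for plex in healthy:
        if plex.site == local_site:
            return plex
    return healthy[0]


if __name__ == "__main__":
    mirror = [Plex("plex-remote", "dc2"), Plex("plex-local", "dc1")]
    print(choose_read_plex(mirror, "dc1").name)
```

Writes still go to both copies in a mirror; only reads can be steered,
which is where the ISL relief described above comes from.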

> Let the subsystem do the work for you. It's really good at it and no
> committee of desperate admins even comes close.


If you have one subsystem, then I agree. If you have multiple arrays,
then life gets more interesting.

.../Ed

Faeandar

2006-03-23, 8:54 pm

On Wed, 22 Mar 2006 22:05:52 GMT, Faeandar <mr_castalot@yahoo.com>
wrote:

>There is no question that, if
>set up correctly, using multiple LUNs across ACPs will improve
>performance over a single LUN.



So, some people brought up some very good points against VM
striping. I was not advocating that it should be done, just noting that
this was the viewpoint of your admins.

While it is correct that an LDEV in HDS is striped across multiple
drives, it is limited to 7 drives at most, at least the last time I
looked (about 3 weeks ago). While this may meet your capacity
requirements, it may very well not meet your performance requirements.
True, the HDS is fast. True, you can put a lot of cache in the box.
But if its use is generic and the databases are just one of many
applications on it, you can run into contention for all these cool
features. I've seen cache hit the 60% watermark and all I/O stop for
a period of 10-30 seconds while it flushes everything out to disk.
This can be very bad for a performance database.
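
A rough Python sketch of why a shared cache can hit that watermark:
whenever writes arrive faster than the array can destage them to disk,
the cache fills at the difference of the two rates. Every figure below
(32 GiB cache, 30% starting fill, 500 MiB/s in, 300 MiB/s out) is an
assumption for illustration, not a measurement from a 9980V.

```python
def seconds_to_watermark(cache_gib, watermark, start_fill,
                         inflow_mib_s, destage_mib_s):
    """Time until pending writes reach the watermark, or None if the
    array destages at least as fast as writes arrive."""
    net_mib_s = inflow_mib_s - destage_mib_s
    if net_mib_s <= 0:
        return None
    headroom_mib = cache_gib * 1024 * (watermark - start_fill)
    return headroom_mib / net_mib_s


if __name__ == "__main__":
    t = seconds_to_watermark(cache_gib=32, watermark=0.60, start_fill=0.30,
                             inflow_mib_s=500, destage_mib_s=300)
    print(f"watermark reached after about {t:.0f} s")
```

More cache only stretches the time; the destage rate, which depends on
the spindles behind the LDEVs, decides whether the watermark is reached
at all.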

While cache is good, it only postpones disk-based performance issues.
And any DBA will tell you the more spindles the better. Unless you can
lock the database file into cache, that still holds true even for
something as uber as the HDS 9980V or even TagmaStore.

~F

Copyright 2003 - 2008 webservertalk.com