Data Storage - SCSI is limited to 2 terabytes


Author SCSI is limited to 2 terabytes
Maurice Volaski

2004-05-30, 11:11 am

If you stick enough hard drives together, you can easily surpass 2
terabytes of combined space and if you try to connect them via SCSI,
you will apparently not see all your space.

That's because host bus adapters from Adaptec and LSI Logic can't see
anything larger than 2 terabytes.

I am using a Promise VTrak 15100 which uses SATA drives and a SCSI
controller. The Promise has no partitioning software and so my 3498 GB
RAID appears as 2048 GB to my Adaptec card.

What gives? Is everyone just using fibre channel?
Nik Simpson

2004-05-30, 11:11 am

Maurice Volaski wrote:
> If you stick enough hard drives together, you can easily surpass 2
> terabytes of combined space and if you try to connect them via SCSI,
> you will apparently not see all your space.
>
> That's because host bus adapters from Adaptec and LSI Logic can't see
> anything larger than 2 terabytes.


No, it's because the SCSI spec defines a 32-bit address for each block, and
each block is 512 bytes. The end result is a 2TB limit for a single disk, or
stripe, handled by the controller.
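
As a quick check of that arithmetic, here is a minimal Python sketch. It assumes a 32-bit block address and 512-byte blocks, as stated above; the variable names are illustrative only.

```python
# Capacity addressable with a 32-bit block address and 512-byte blocks.
LBA_BITS = 32       # width of the block address field (as claimed above)
BLOCK_SIZE = 512    # bytes per block (as claimed above)

max_blocks = 2 ** LBA_BITS
max_bytes = max_blocks * BLOCK_SIZE

print(max_bytes)              # 2199023255552
print(max_bytes // 2 ** 40)   # 2, i.e. 2 TiB
```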

>
> I am using a Promise VTrak 15100 which uses SATA drives and a SCSI
> controller. The Promise has no partitioning software and so my 3498 GB
> RAID appears as 2048 GB to my Adaptec card.
>

Break the stripe into a pair of 1.7TB RAIDs, assign each one to the host as
a separate physical disk, then stripe them together at the host level to
create a 3.4TB logical volume.
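
A rough sketch of what host-level striping does with those two LUNs: each logical block is mapped to one LUN and a block within it, so no single LUN needs a block address beyond 32 bits. The function name and the stripe unit size are made up for illustration; a real volume manager has its own layout.

```python
def map_striped_block(logical_block, num_luns=2, stripe_blocks=128):
    """Map a logical block on a striped (RAID 0 style) volume to (lun, block).

    stripe_blocks is the stripe unit in blocks (128 x 512 bytes = 64 KiB,
    an arbitrary choice for this example).
    """
    stripe_index, offset = divmod(logical_block, stripe_blocks)
    row, lun = divmod(stripe_index, num_luns)
    return lun, row * stripe_blocks + offset


print(map_striped_block(0))            # (0, 0)
print(map_striped_block(128))          # (1, 0)
print(map_striped_block(256))          # (0, 128)
# A logical block past the 32-bit limit still lands below 2**32 on its LUN.
print(map_striped_block(2 ** 32 + 5))  # (0, 2147483653)
```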

> What gives? Is everyone just using fibre channel?


Fibre channel is the same: a 32-bit address space with 512-byte blocks.


--
Nik Simpson



Anton Rang

2004-05-30, 11:11 am

"Nik Simpson" <n_simpson@bellsouth.net> writes:
> Maurice Volaski wrote:
>
> No, it's because the SCSI spec defines a 32bit address for each block and
> each block is 512 bytes, end result is a 2TB limit for a single disk, or
> stripe handled by the controller.


There's a newer SCSI command format which supports longer block addresses.

I don't know which cards (actually drivers) implement it so far.
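
The capacity side of the newer format can be sketched as follows. The assumption here, based on the SCSI block command set, is that READ CAPACITY(10) returns a 32-bit last block address, that a device too large for that field reports 0xFFFFFFFF, and that the host is then expected to issue READ CAPACITY(16), which returns a 64-bit address. The function names are hypothetical and the reply bytes are synthetic, not captured from real hardware.

```python
import struct


def parse_read_capacity_10(data):
    """Return (last_lba, block_size) from an 8-byte READ CAPACITY(10) reply."""
    return struct.unpack(">II", data[:8])


def parse_read_capacity_16(data):
    """Return (last_lba, block_size) from the first 12 bytes of a
    READ CAPACITY(16) reply."""
    return struct.unpack(">QI", data[:12])


def capacity_bytes(reply10, reply16=None):
    """Compute device size, falling back to the 16-byte reply when needed."""
    last_lba, block_size = parse_read_capacity_10(reply10)
    if last_lba == 0xFFFFFFFF:
        # The 32-bit field is saturated: the real size is only in the long reply.
        if reply16 is None:
            raise ValueError("device exceeds 32-bit addressing; "
                             "READ CAPACITY(16) reply needed")
        last_lba, block_size = parse_read_capacity_16(reply16)
    return (last_lba + 1) * block_size


# Synthetic replies for a 3498 GB array with 512-byte blocks.
blocks = 3498 * 2 ** 30 // 512
reply10 = struct.pack(">II", 0xFFFFFFFF, 512)
reply16 = struct.pack(">QI", blocks - 1, 512) + bytes(20)
print(capacity_bytes(reply10, reply16) // 2 ** 30)   # 3498
```

A host that only understands the short reply has no way to learn the real size, which is consistent with the array showing up as 2048 GB.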

-- Anton
Nik Simpson

2004-05-30, 11:11 am

Anton Rang wrote:
> "Nik Simpson" <n_simpson@bellsouth.net> writes:
>
> There's a newer SCSI command format which supports longer block
> addresses.
>
> I don't know which cards (actually drivers) implement it so far.
>

It's probably an OS-level issue as well, not just the device driver for the
adapter, i.e. the OS has got to have some way to map a logical block address
onto something that requires a physical address wider than 32 bits. Anyway,
regardless, it's a long way from universal support. The easiest way around the
problem is to use OS-level volume managers to stripe across multiple 2TB
LUNs.


--
Nik Simpson


Thor Lancelot Simon

2004-05-30, 11:11 am

In article <K0Qlc.70$UG2.5@bignews2.bellsouth.net>,
Nik Simpson <n_simpson@bellsouth.net> wrote:
>Maurice Volaski wrote:
>
>No, it's because the SCSI spec defines a 32bit address for each block and
>each block is 512 bytes, end result is a 2TB limit for a single disk, or
>stripe handled by the controller.


That's false. Both larger block addresses and larger block sizes are
supported -- both by the specification *and* by many devices.

You should contact your HBA and OS vendors for appropriate fixes; here,
now, in 2004, code that can't cope with disks that big is just plain
broken.

--
Thor Lancelot Simon tls@rek.tjls.com
But as he knew no bad language, he had called him all the names of common
objects that he could think of, and had screamed: "You lamp! You towel! You
plate!" and so on. --Sigmund Freud
Nik Simpson

2004-05-30, 11:11 am

Thor Lancelot Simon wrote:
> In article <K0Qlc.70$UG2.5@bignews2.bellsouth.net>,
> Nik Simpson <n_simpson@bellsouth.net> wrote:
>
> That's false. Both larger block addresses and larger block sizes are
> supported -- both by the specification *and* by many devices.


OK, which devices and which OSes? It shouldn't be that hard to come up with a
list if they are so "common".

>
> You should contact your HBA and OS vendors for appropriate fixes;
> here, now, in 2004, code that can't cope with disks that big is just
> plain broken.


It may not be a hard limit, but it's a pretty common one. How many OSes do
you know that don't have a 2TB limit for a single drive/LUN? Heck, until
relatively recently Solaris had a 1TB limit. While it may be possible to
exceed the 2TB limit, it's certainly not common practice, and it's not a big
deal for the OS or HBA vendors since customers wanting single disks >2TB are
a very small minority.


--
Nik Simpson


flux

2004-05-30, 11:11 am

In article <K0Qlc.70$UG2.5@bignews2.bellsouth.net>,
"Nik Simpson" <n_simpson@bellsouth.net> wrote:

> Maurice Volaski wrote:
>
> No, it's because the SCSI spec defines a 32bit address for each block and
> each block is 512 bytes, end result is a 2TB limit for a single disk, or
> stripe handled by the controller.


I confirmed that this is untrue. The spec has been updated and now the
limit is 8 zettabytes!
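
The 8 zettabyte figure matches a 64-bit block address with 512-byte blocks. A minimal check in Python, assuming that is where the number comes from:

```python
# Capacity addressable with a 64-bit block address and 512-byte blocks.
BLOCK_SIZE = 512                  # bytes per block (assumed)
max_bytes_64 = 2 ** 64 * BLOCK_SIZE

print(max_bytes_64 // 2 ** 70)    # 8, i.e. 8 ZiB
```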

I confirmed that Adaptec does not support this spec on any of its
hardware.

However, LSI Logic has since contacted me; the information I got from them
initially was wrong. Their products do support the updated spec.

> Break the stripe into a pair of 1.7TB RAIDs, assign each one to the host as
> a seperate physical disk, then stripe them together at the host level to
> create a 3.4TB logical volume.


This is a moot point given that LSI is saying their stuff will work. The
VTrak is the RAID controller, and I have it implementing the RAID 5,
which is and should be its role in this. There is no way to create two
RAID 5 sets with the same amount of space as a single one and no way to
create logical volumes with their embedded software.

> It's probably an OS level issue as well, not just the device driver on the


Apparently not for Linux kernel 2.6.x.
Anton Rang

2004-05-30, 11:11 am

flux <support@fluxsoft.com> writes:
> I confirmed that Adaptec does not support this spec on any of its
> hardware.


Technically, the *hardware* supports it. The Windows drivers from
Adaptec (and the BIOS) do not. It would be a simple software update,
though I don't know if Microsoft supports large devices yet.

> However, LSI Logic has contacted me and I got misinformation initially.
> Their products do support the updated spec.


I think their RAID hardware does as well.

-- Anton
Adrian 'Dagurashibanipal' von Bidder

2004-05-30, 11:11 am

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Clinging to sanity, Nik Simpson mumbled in his beard:

[2TB block dev size limit]

> OK, which deviuces, and which OSes, shouldn't be that hard to come up with a
> list if they are so "common"


I've seen that Linux 2.6 added a kernel configuration option for >2TB block
device support. As I've not seen such a device up close, I can't say how well
this works, or whether it works across all hardware drivers.

greets
- -- vbi

- --
You opted-in to receive these exciting offers by having an email
address. Our spam cannot be considered spam because of this
disclaimer. This is a one-time mailing. To be removed from future
one-time mailings, don't receive email.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: get my key from http://fortytwo.ch/gpg/92082481

iKcEARECAGcFAkCbNQRgGmh0dHA6Ly9mb3J0eXR3
by5jaC9sZWdhbC9ncGcvZW1h
aWwuMjAwMjA4MjI/ dmVyc2lvbj0xLjUmbWQ1c3VtPTVkZmY4NjhkMTE4
NDMyNzYw
NzFiMjVlYjcwMDZkYTNlAAoJEIukMYvlp/fW4iMAnRmtCsXvVqEi49/q7yI9Z4M7
pgvvAJwMaSx+ABH9ihNtslHaXJI+LCTv6w==
=Grhp
-----END PGP SIGNATURE-----
Christoph Hellwig

2004-05-30, 11:11 am

> > list if they are so "common"
>
> I've seen that Linux 2.6 added a >2TB block device support kernel
> configuration option. As I've not seen such a device up close, I can't say
> how well this works, especially, if it works across all hardware drivers.


SGI has already backported those changes to its shipping 2.4 kernels
because SGI customers expect volumes that large (and have had them for
some time on IRIX). Usually those are XVM logical volumes rather than
SCSI LUNs, though.

For a SCSI HBA driver to support >2TB LUNs it has to support 16-byte SCSI
commands, which only newer HBAs do. Apart from that, every driver should
be ready for volumes that large if the underlying hardware is. A few bugs
in handling large devices have been found in block-level drivers (e.g.
recently in the Linux 'md' softraid driver), but for SCSI HBA drivers
there's very little chance of getting it wrong, because they're not
exposed to the volume size at all (except for a little geometry hack for
MS-DOS partition tables, but those don't work with TB-sized LUNs anyway).
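
To make the 10-byte versus 16-byte distinction concrete, here is a small Python sketch that builds a READ command descriptor block. It assumes the standard layouts: READ(10) is opcode 0x28 with a 4-byte block address and a 2-byte transfer length, and READ(16) is opcode 0x88 with an 8-byte block address and a 4-byte transfer length. The helper name is hypothetical, and this only packs bytes; it does not talk to any device.

```python
import struct

READ_10 = 0x28
READ_16 = 0x88


def build_read_cdb(lba, num_blocks):
    """Return a READ(10) CDB when the request fits, else a READ(16) CDB."""
    if lba <= 0xFFFFFFFF and num_blocks <= 0xFFFF:
        # opcode, flags, 32-bit LBA, group number, 16-bit length, control
        return struct.pack(">BBIBHB", READ_10, 0, lba, 0, num_blocks, 0)
    # opcode, flags, 64-bit LBA, 32-bit length, group number, control
    return struct.pack(">BBQIBB", READ_16, 0, lba, num_blocks, 0, 0)


print(len(build_read_cdb(1000, 8)))      # 10
print(len(build_read_cdb(2 ** 32, 8)))   # 16
```

Any block at or beyond 2**32 cannot be expressed in the short form at all, which is why an HBA or driver limited to 10-byte commands stops at 2TB with 512-byte blocks.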

Nik Simpson

2004-05-30, 11:11 am

Adrian 'Dagurashibanipal' von Bidder wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Clinging to sanity, Nik Simpson mumbled in his beard:
>

How did you know I had a beard :-)


--
Nik Simpson


Adrian 'Dagurashibanipal' von Bidder

2004-05-30, 11:11 am

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Clinging to sanity, Nik Simpson mumbled in his beard:
> How did you know I had a beard :-)


Well, and if you didn't have one, I'd claim a metaphorical beard for you, as
I can't well say you mumbled into your clean-shaved chin ... :-)

- -- vbi

- --
Today is Pungenday, the 55th day of Discord in the YOLD 3170

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: get my key from http://fortytwo.ch/gpg/92082481

iKcEARECAGcFAkCcq+pgGmh0dHA6Ly9mb3J0eXR3
by5jaC9sZWdhbC9ncGcvZW1h
aWwuMjAwMjA4MjI/ dmVyc2lvbj0xLjUmbWQ1c3VtPTVkZmY4NjhkMTE4
NDMyNzYw
NzFiMjVlYjcwMDZkYTNlAAoJEIukMYvlp/fWXewAoKlD8pZlECmDNJzzW9218bx2
X1bFAKDPP6iUtlgNwGUezJn0sjGZBWGvxA==
=s/C8
-----END PGP SIGNATURE-----
David Magda

2004-05-30, 11:11 am

"Nik Simpson" <n_simpson@bellsouth.net> writes:

>
> OK, which deviuces, and which OSes, shouldn't be that hard to come
> up with a list if they are so "common"


Well, UFS (from BSD, originally published in 1984) handles volumes up
to 1TB. UFS2 (as found in FreeBSD 5.x and NetBSD(?)) handles larger
sizes.

If a 20-year-old file system can handle 1TB, what do you think a file
system released 2-3 years ago could handle?

> maybe possible, to exceed the 2TB limit its certainly not common
> practice and its not a big deal for the OS or HBA vendors since
> customers wanting single disks >2TB are a very small minority.


Perhaps. However you can pick up 3.5TB for US$ 11,000:

http://store.apple.com/1-800-MY-APP...mily=XserveRAID

Or 1TB for US$ 1,200:

http://www.lacie.com/products/product.htm?id=10118


How big are contemporary CAD files, or video files, or even
high-quality photographs from cameras like a Nikon/Canon? Never mind
some people's pr0n collection. :>

--
David Magda <dmagda at ee.ryerson.ca>, http://www.magda.ca/
Because the innovator has for enemies all those who have done well under
the old conditions, and lukewarm defenders in those who may do well
under the new. -- Niccolo Machiavelli, _The Prince_, Chapter VI
Nik Simpson

2004-05-30, 11:11 am

David Magda wrote:
> "Nik Simpson" <n_simpson@bellsouth.net> writes:
>
>
> Well, UFS (from BSD, originally published in 1984) handles volumes up
> to 1TB. UFS2 (as found in FreeBSD 5.x and NetBSD(?)) hangles larger
> sizes.
>
> If a 20 year file system can handle 1TB, what do you think an file
> system released 2-3 years ago could handle?
>


Handling a filesystem >1TB and handling a device >1TB are not the same
thing. People have been able to create multi-TB logical volumes (by
striping across several LUNs) for some time, hence the need for filesystems
that can be spread across those logical volumes. NTFS, for example, has
supported multi-TB filesystems for years; that doesn't mean they are
particularly common in production environments.


>
> Perhaps. However you can pick up 3.5TB for US$ 11,000:
>
> http://store.apple.com/1-800-MY-APP...mily=XserveRAID
>
> Or 1TB for US$ 1,200:
>
> http://www.lacie.com/products/product.htm?id=10118
>


Of course you can have large amounts of storage for relatively little money;
that doesn't mean people are going out and creating multi-TB LUNs.

>
> How big are contemporary CAD files, or video files, or even
> high-quality photographs from cameras like a Nikon/Canon? Never mind
> some people's pr0n collection. :>


Again, in the vast majority of cases, not big enough to need multi-TB LUNs.

BTW, I do stand corrected on the SCSI limit issue, but I still don't think
that multi-TB LUNs are particularly common.


--
Nik Simpson



Malcolm Weir

2004-05-30, 11:11 am

On 09 May 2004 21:36:58 -0400, David Magda
<dmagda+trace040423@ee.ryerson.ca> wrote:

[ Snip ]

>How big are contemporary CAD files, or video files, or even
>high-quality photographs from cameras like a Nikon/Canon? Never mind
>some people's pr0n collection. :>


What do you mean "even" photographs??? <g>

A busy night for me might consume about 15GB of raw image files (Kodak
Professional). Add about double or triple that for processed
versions. And then start over...

(A raw DCS file from my SLR/n is ~14MB).

However, while a TB file system would be nice, the things move offline
quickly, and rarely get updated, so I can actually work well with a
100GB filesystem. The large files get moved to DVD, and smaller JPGs
remain on disk for catalog purposes.

Malc.
Paul Repacholi

2004-05-30, 11:11 am

David Magda <dmagda+trace040423@ee.ryerson.ca> writes:

> How big are contemporary CAD files, or video files, or even
> high-quality photographs from cameras like a Nikon/Canon? Never mind
> some people's pr0n collection. :>


Well, the TASS project collects up to 10 CDs in a night per site, with
more to come.

A good digital back or drum scanner runs files up to a GB or so.

I don't think there is a limit on p0rn, but the Delft Uni collection
was said to be 700GB last I heard. Comp.risks should have the size in
the story of its destruction by fire.

--
Paul Repacholi 1 Crescent Rd.,
+61 (08) 9257-1001 Kalamunda.
West Australia 6076
comp.os.vms,- The Older, Grumpier Slashdot
Raw, Cooked or Well-done, it's all half baked.
EPIC, The Architecture of the future, always has been, always will be.
dave dickerson

2004-05-30, 11:11 am

Veritas is shipping a file system for UNIX/Linux which will address up
to 8 Exabytes. They need a volume manager to piece together the SCSI
devices into such a large address space.
Nik Simpson

2004-05-30, 11:11 am

dave dickerson wrote:
> Veritas is shipping a file system for UNIX/Linux which will address
> up to 8 Exabytes. They need a volume manager to piece together the
> SCSI devices into such a large address space.


And the Windows NTFS, as well as several other OS filesystems have been able
to do this for the best part of a decade or more, but max FS size != max LUN
size.


--
Nik Simpson


dave dickerson

2004-05-30, 11:11 am

Nik Simpson wrote:
> dave dickerson wrote:
>
>
>
> And the Windows NTFS, as well as several other OS filesystems have been able
> to do this for the best part of a decade or more, but max FS size != max LUN
> size.
>
>


Yeah, I see that - NTFS max size is 16 Exabytes ... I guess that means
if LDM or other volume management can build a logical disk that size
then bobs-your-uncle. Anyone know the max size of a Dynamic Disk?

What are the practical limits for NTFS? Certainly chkdsk time and
backups / restores come into play. Also the data structures of the meta
data stop performing well as the number of files becomes very large.

What's the largest you've seen? I've seen around 800 GBytes ( using EMC
Metavolumes ), but I suspect there are much larger in-the-wild.
Nik Simpson

2004-05-30, 11:11 am

dave dickerson wrote:
> Nik Simpson wrote:
>
> Yeah, I see that - NTFS max size is 16 Exabytes ... I guess that means
> if LDM or other volume management can build a logical disk that size
> then bobs-your-uncle. Anyone know the max size of a Dynamic Disk?


I believe it's not fully implemented to support 16 EB, but is well north of
2TB.

>
> What are the practical limits for NTFS? Certainly chkdsk time and
> backups / restores come into play. Also the data structures of the
> meta data stop performing well as the number of files becomes very
> large.



That's a problem for all filesystems when getting into the 2TB+ range;
chkdsk/fsck and directory metadata parsing are going to be an issue. On the
whole, NTFS is pretty good at reasonable recovery times and uses a fairly
sophisticated directory structure.

>
> What's the largest you've seen? I've seen around 800 GBytes ( using
> EMC Metavolumes ), but I suspect there are much larger in-the-wild.


I've seen customers with 2TB and greater, but not many, and often using sparse
volume management techniques underneath so that they don't actually need 2TB
up front and can add physical capacity as required without having to expand
the FS.


--
Nik Simpson





Net Worker

2004-05-30, 11:11 am

We have a Windows 2000 server that has 12 TB storage attached to it via a
SAN with 9 TB of data on it. The biggest volume is 3771 GB and the smallest
one is 1257 GB. No fancy Volume management being used, just standard windows
dynamic disks. We use this server as an alternate for tape archives. All the
arrays are ATA based and from NexSan. The SAN switch is from Qlogic. Works
great so far. Anything we should be worried about apart from the long chkdsk
times or RAID rebuild times?
TIA

"Nik Simpson" <n_simpson@bellsouth.net> wrote in message
news:l2Ssc.21335$Sc.9493@bignews1.bellsouth.net...
> dave dickerson wrote:
>
> I beleive it's not fully implemented to support 16 EB, but is well north of
> 2TB.
>
> That's a problem for all filesystems when getting into the 2TB+ range,
> chkdsk/fsck and directory metadata parsing are going to be an issue. On the
> whole NTFS is pretty good at reasonable recovery times and uses a fairly
> sophisticated directory structure.
>
> I've seen customers with 2TB & greater, but not many, and often using sparse
> voulme managment techniques underneath so that they don't actually need 2TB
> up front and can add physical capacity as required without having to expand
> the FS.
>
> --
> Nik Simpson


Nik Simpson

2004-05-30, 11:11 am

Net Worker wrote:
> We have a Windows 2000 server that has 12 TB storage attached to it
> via a SAN with 9 TB of data on it. The biggest volume is 3771 GB and
> the smallest one is 1257 GB. No fancy Volume management being used,
> just standard windows dynamic disks.


Just FYI, the dynamic disks in W2K are a "volume manager", just one
delivered with the OS rather than bought as a 3rd-party add-on. In fact, the
code in W2K is a licensed (and cut-down) version of the Veritas Volume
Manager for Windows.


--
Nik Simpson


removegcg1remove@psu.edu

2004-05-30, 11:11 am

Assuming these volumes are 'built' out of several LUNs on the SAN
storage... I've read that W2K places metadata on the physical disk somewhere so
it knows what order to put the LUNs in to rebuild the dynamic disk on a new
server. Can anyone tell me if this data is placed on ALL LUNs so no matter
which LUN the new server 'sees' first, it can properly rebuild the dynamic
volume from all those pieces? Our shop is likely going to do a study
regarding this and there is concern about how well we could recover a single
volume from several mirrored LUNs at the recovery site with a new server.
This is not clustered by the way.

thanks...
Gary

In article <vQ7tc.4246$3F.2196@newssvr32.news.prodigy.com>, "Net Worker"
<spam@nospam.com> wrote:
>We have a Windows 2000 server that has 12 TB storage attached to it via a
>SAN with 9 TB of data on it. The biggest volume is 3771 GB and the smallest
>one is 1257 GB. No fancy Volume management being used, just standard windows
>dynamic disks. We use this server as an alternate for tape archives. All the
>arrays are ATA based and from NexSan. The SAN switch is from Qlogic. Works
>great so far. Anything we should be worried about apart from the long chkdsk
>times or RAID rebuild times.
>TIA
>

Nik Simpson

2004-05-30, 11:11 am

removegcg1remove@psu.edu wrote:
> Assuming these volumes are 'built' out of several LUNs on the SAN
> storage...I've read that W2K places metadata on the physical disk
> somewhere so it knows what order the put the LUNs in to rebuild the
> dynamic disk on a new server. Can anyone tell me if this data is
> placed on ALL LUNs so no matter which LUN the new server 'sees'
> first, it can properly rebuild the dynamic volume from all those
> pieces? Our shop is likely going to do a study regarding this and
> there is concern about how well we could recover a single volume from
> several mirrored LUNs at the recovery site with a new server. This is
> not clustered by the way.


Each physical disk that has been declared as "dynamic" has some space
allocated to hold information about how the disks are arranged, so this
shouldn't be a problem at the remote site; you just have to "import" the
devices.


--
Nik Simpson


Malcolm Weir

2004-05-30, 11:11 am

On Thu, 27 May 2004 07:36:32 -0400, "Nik Simpson"
<n_simpson@bellsouth.net> wrote:

>removegcg1remove@psu.edu wrote:
>
>Each physical disk that has been declared as "dynamic" has some space
>allocated to hold information about the disks are arranged, so this
>shouldn't be a problem at the remote site, you just have to "import" the
>devices.


And the order in which the LUNs are discovered is irrelevant, since
Windows labels the disks as (e.g.) "this is disk 2 of 5 in set xyz".
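
The idea of ordering by on-disk label rather than by discovery order can be sketched like this. The label fields ("set", "index", "count") and device names are a made-up format for illustration, not the actual layout Windows uses for dynamic disks.

```python
def assemble_set(labels, set_name):
    """Return the devices of one set in label order, however they were found.

    labels is a list of dicts such as
    {"device": "lun_c", "set": "xyz", "index": 2, "count": 5}.
    """
    in_set = [label for label in labels if label["set"] == set_name]
    counts = {label["count"] for label in in_set}
    if len(counts) != 1:
        raise ValueError("inconsistent or missing labels for set %r" % set_name)
    count = counts.pop()
    members = {label["index"]: label["device"] for label in in_set}
    missing = [i for i in range(1, count + 1) if i not in members]
    if missing:
        raise ValueError("set %r is missing disk(s) %s" % (set_name, missing))
    return [members[i] for i in range(1, count + 1)]


# Disks discovered in an arbitrary order still come out in label order.
found = [
    {"device": "lun_c", "set": "xyz", "index": 3, "count": 3},
    {"device": "lun_a", "set": "xyz", "index": 1, "count": 3},
    {"device": "lun_b", "set": "xyz", "index": 2, "count": 3},
]
print(assemble_set(found, "xyz"))   # ['lun_a', 'lun_b', 'lun_c']
```

Because the position travels with each disk, a missing member is detected as an error instead of the set being silently assembled in the wrong order.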

I was horrified to discover (and hurt by the discovery) that there are
RAID controllers that don't do this; their configurations are based on
physical addresses, so (e.g.) a mirrored set would be defined as "bus
1 ID 4 and bus 2 ID 5". Moving a disk breaks the mirror. That is not in
itself a problem, but the thing did the same with RAID sets, meaning
that if you (accidentally) swap the positions of two disks, your data
is instantly corrupted.

(Sure, the simple solution to this problem is not to swap the
positions, but EEEK!)

Malc.

Copyright 2003 - 2008 webservertalk.com