tape backup block size
| Damian Menscher 2006-04-27, 7:57 am |
| I'm curious how to select a reasonable block size for tape backups.
I have discovered that the default of 10K is incredibly slow (1M/s)
and that increasing it to 1024K goes much faster (8M/s). My
question is why they don't make the default larger? I can imagine
two possibilities:
- compatibility with older tape drives or operating systems
- each file takes a minimum of 1 block
But as I understand it, dump(1) saves each filesystem as a single
tape "file", so the larger blocksize shouldn't hurt. I guess I can
test this by creating a filesystem with a million 1-byte files, and
see how many copies of it fit on a tape.
If it matters, I'm using Linux with a 2.6 kernel (RHEL4) and a
TAIT-2 tape drive on a remote machine (connecting via ssh).
Damian Menscher
--
-=#| <menscher@uiuc.edu> www.uiuc.edu/~menscher/ Ofc 217)253-2757 |#=-
-=#| The above opinions are not necessarily those of my employers. |#=-
| |
| Michael Paoli 2006-04-27, 7:57 am |
| Damian Menscher wrote:
> I'm curious how to select a reasonable block size for tape backups.
> I have discovered that the default of 10K is incredibly slow (1M/s)
> and that increasing it to 1024K goes much faster (8M/s). My
> question is why they don't make the default larger? I can imagine
> two possibilities:
> - compatibility with older tape drives or operating systems
> - each file takes a minimum of 1 block
> But as I understand it, dump(1) saves each filesystem as a single
> tape "file", so the larger blocksize shouldn't hurt. I guess I can
> test this by creating a filesystem with a million 1-byte files, and
> see how many copies of it fit on a tape.
> If it matters, I'm using Linux with a 2.6 kernel (RHEL4) and a
> TAIT-2 tape drive on a remote machine (connecting via ssh).
Historically, tape/archive block sizes were rather small, e.g. as
small as 512 bytes, and sometimes the maximum was only 10 KiB (twenty
512-byte "blocks"). In some cases, this might matter for backwards
compatibility.
With modern hardware, larger tape block sizes are generally faster
and more efficient. Using the largest block size the hardware (and
driver) supports usually works out fastest, provided one can stream
the data to the tape drive fast enough to avoid underruns. Larger
block sizes on most modern tape drives also generally don't waste
more space, even when the data to be written is quite small: most
modern tape drives include hardware compression, and backup utilities
typically pad any remaining space to a block boundary with nulls,
which compress exceedingly well. And in most cases where multiple
blocks are being written, less space is required on tape, since fewer
start/end-of-block markers (typically tape record marks) are needed.
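
To illustrate the point about null padding, here is a minimal Python
sketch. It uses zlib purely as a stand-in for a drive's hardware
compression (real drives use their own algorithms, so the exact sizes
will differ), and pad_to_block is a hypothetical helper, not part of
any backup tool.

```python
import zlib

BLOCK_SIZE = 1024 * 1024  # 1024K block, the size discussed in this thread


def pad_to_block(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Pad data with NUL bytes up to the next multiple of block_size."""
    remainder = len(data) % block_size
    if remainder == 0:
        return data
    return data + b"\0" * (block_size - remainder)


payload = b"x"                      # a 1-byte "file"
padded = pad_to_block(payload)      # what would be handed to the drive
compressed = zlib.compress(padded)  # stand-in for hardware compression

print("padded size:    ", len(padded))
print("compressed size:", len(compressed))
```

The padded block is a full 1024K, but the compressed form is a tiny
fraction of that, which is the sense in which the padding costs
little tape when compression is enabled.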
| |
| Doug Freyburger 2006-04-27, 7:57 am |
| Damian Menscher wrote:
>
> I'm curious how to select a reasonable block size for tape backups.
> I have discovered that the default of 10K is incredibly slow (1M/s)
> and that increasing it to 1024K goes much faster (8M/s). My
> question is why they don't make the default larger? I can imagine
> two possibilities:
>
> - compatibility with older tape drives or operating systems
> - each file takes a minimum of 1 block
Possibility 3 that you didn't consider -
Streaming tape technologies no longer have fixed block formats
so the blocking is virtual anyways.
The blocks in question were seekable. Streaming tape drives
have not supported that for quite some time. Are there once again
fixed block format technologies available like there were in the
days of reel to reel?
The timing difference comes from buffering, and from delays when
blocks are not delivered fast enough and the drive loses streaming.
Figure out the biggest buffer that fits, and use it.
Years ago I timed how long it took to write end to end on a
tape, and I doubled the blocksize each time. Early in the test
each write took half as long as the previous. Later in the test
all larger blocksizes took the same time. Classic roll-off
curve. I picked a nice big blocksize well along the curve and
never looked back. Since I did that with an old Exabyte 8mm
helical scan drive the blocksize I picked no longer matters.
What does matter: as long as a bigger blocksize helps, go for it,
and feel free to benchmark your own performance curve.
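
A doubling benchmark of this kind could be sketched in Python as
below. This is only an illustration under stated assumptions: the
function names are hypothetical, the target path is whatever device
or file you pass in, and it assumes each write() call becomes one
tape block (as with a drive in variable-block mode). Random data is
used so that hardware compression does not flatter the numbers.

```python
import os
import time


def write_throughput(path, block_size, total_bytes):
    """Write about total_bytes to path in block_size chunks; return bytes/s."""
    block = os.urandom(block_size)  # incompressible payload
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        written = 0
        while written < total_bytes:
            written += os.write(fd, block)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return written / max(elapsed, 1e-9)


def sweep(path, start_kib=16, stop_kib=1024, total_bytes=64 * 1024 * 1024):
    """Double the block size from start_kib to stop_kib; map KiB -> bytes/s."""
    results = {}
    size_kib = start_kib
    while size_kib <= stop_kib:
        results[size_kib] = write_throughput(path, size_kib * 1024, total_bytes)
        size_kib *= 2
    return results


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:  # the target is overwritten, so it must be given explicitly
        for kib, rate in sweep(sys.argv[1]).items():
            print(f"{kib:5d}K  {rate / 1024:10.0f} K/s")
```

Run it with a scratch tape device or file as the only argument, and
pick a block size from the flat part of the resulting curve. The
target is overwritten, so use a scratch tape.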
| |
| Damian Menscher 2006-04-27, 7:57 am |
| Doug Freyburger <dfreybur@yahoo.com> wrote:
> Damian Menscher wrote:
> Streaming tape technologies no longer have fixed block formats
> so the blocking is virtual anyways.
> The blocks in question were seekable. Streaming tape drives
> have not supported that for quite some time. Are there once again
> fixed block format technologies available like there were in the
> days of reel to reel?
My previous drive used a fixed blocksize (Certance STT3401A), but
that was crap produced by Seagate/Certance. My current drive (a
Sony TAIT-2) uses variable block sizes. I see what you mean
about being seekable, though, as the -Q option to dump(1) doesn't
work with this drive. (Presumably it would have with my older
drive?)
> The timing difference comes from buffering, and from delays when
> blocks are not delivered fast enough and the drive loses streaming.
> Figure out the biggest buffer that fits, and use it.
> Years ago I timed how long it took to write end to end on a
> tape, and I doubled the blocksize each time. Early in the test
> each write took half as long as the previous. Later in the test
> all larger blocksizes took the same time. Classic roll-off
> curve. I picked a nice big blocksize well along the curve and
> never looked back. Since I did that with an old Exabyte 8mm
> helical scan drive the blocksize I picked no longer matters.
> What does matter: as long as a bigger blocksize helps, go for it,
> and feel free to benchmark your own performance curve.
For the curious:
blksize         speed (K/s)
  (K)     local   remote  /dev/null
   10      7284     1172      17076
   32     12426     5141
   64     15054     5771
  128      same     6344
  256               6622
  512               6765
 1024     14389     6827      17764
So local dumps max out at 15M/s, which is the limitation of the
tape drive. Dumps to /dev/null max out at 17M/s, so that's the
limit of the hard drives (crappy 3ware raid5...). Remote dumps
speed up significantly as I increase the blocksize. Given that
restores with a 1024K blocksize work, and Michael convinced me
that it's not wasting tape-space, I'm just running with that.
I think the remote backup must be hitting a network limitation
(100 Mbit networking), though I'd expect it to get 10M/s, not max out
at 7M/s. If anyone has ideas, I'd love to hear them.
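
For reference, the arithmetic behind that expectation can be sketched
in Python. This assumes the "K" in the table means KiB and ignores
TCP/IP and ssh overhead entirely, so the ceiling it prints is a raw
upper bound, not an achievable rate.

```python
LINK_MBIT = 100          # nominal link speed in Mbit/s
REMOTE_KIB_S = 6827      # remote dump speed at 1024K blocks, from the table


def link_ceiling_kib_s(mbit):
    """Raw link capacity in KiB/s, before any protocol overhead."""
    return mbit * 1_000_000 / 8 / 1024


ceiling = link_ceiling_kib_s(LINK_MBIT)
utilisation = REMOTE_KIB_S / ceiling

print(f"raw ceiling: {ceiling:.0f} K/s")
print(f"observed:    {REMOTE_KIB_S} K/s ({utilisation:.0%} of raw ceiling)")
```

The observed remote rate is a little over half of the raw link
capacity, so there is a real gap left to account for.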
Damian Menscher
--
-=#| <menscher@uiuc.edu> www.uiuc.edu/~menscher/ Ofc 650)253-2757 |#=-
-=#| The above opinions are not necessarily those of my employers. |#=-