Unix administration - rescue damaged tar file


Author rescue damaged tar file
Michael Powe

2005-12-25, 5:50 pm


I have a compressed archive that I created using GNU tar on a gentoo
system. The tar file is about 900 MB compressed and about 2.5 GB
uncompressed.

This archive has bad data about 1/3 to 1/2 way through it. I'm trying
to find a way to get at the data below the damaged part. At the point
of failure, tar reports:

tar: skipping to next header
tar: archive contains obsolescent base-64 headers
tar: error exit delayed from previous errors

I used the gzip recovery kit to unpack the file. This worked and gave
me the 2.5 GB uncompressed file. However, tar will only unpack up to
the damaged part, and even with --ignore-failed-read, blows up.

cpio will not work either. I tried this on the uncompressed file:

cpio -F tarfilename -i -v

The error returned is:

cpio: standard input is closed: Value too large for defined data type.

I have googled high and low and cannot find a relevant explanation of
the error in this context nor any resolution.

The same message is returned for

cpio -ird -H tar < tarfilename

I also tried the recovery kit's option to split the recovered file at
the point of corruption and save the "good" parts. However, it
doesn't split the file, it generates a message that says "bad data at
byte abc" immediately followed by "good data at byte abc" and then
keeps going, returning a single file.

Finally, I copied the file to a windows machine and opened it in
winzip, but winzip only opens it up to the point of corruption.

If anyone can tell me how I can recover this data, I will be eternally
grateful. I have looked at the "Advanced Tar Repair" tool, but of
course, I would like to avoid shelling out that kind of money for a
one-off. This is the only time I have had this problem in 8 years of
using tar, and I fully expect I could go that long again.

Thanks.

mp

--
'cat' is not recognized as an internal or external command,
operable program or batch file.
phil-news-nospam@ipal.net

2005-12-25, 8:48 pm

[followup set to comp.unix.admin only]

In comp.unix.admin Michael Powe <michael+gnus@trollope.org> wrote:

| I have a compressed archive that I created using GNU tar on a gentoo
| system. The tar file is about 900 MB compressed and about 2.5 GB
| uncompressed.
|
| This archive has bad data about 1/3 to 1/2 way through it. I'm trying
| to find a way to get at the data below the damaged part. At the point
| of failure, tar reports:

[etc]

Possibly the damage is in a form that affects the relative offset of the
data, for example the addition or deletion of an amount of data that is
not a multiple of 512 bytes. What a tar recovery program would have to
do is look for what appears to be a valid tar header using every level of
byte offset to find something. Assuming only one point of corruption,
this might be doable from the end of the file working in reverse to the
front. What is the exact number of bytes of the uncompressed copy?
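
A minimal Python sketch of that search, for illustration only: it slides a 512-byte window over the data one byte at a time and accepts a position when the checksum stored in bytes 148-155 matches the sum of the block with that field counted as spaces. The function names and the demo archive are assumptions made up for this sketch, not part of any recovery tool, and a pure-Python byte-by-byte scan would be slow on a 2.5 GB file.

```python
import io
import tarfile

BLOCK = 512  # tar headers and data are laid out in 512-byte blocks


def header_checksum_ok(block: bytes) -> bool:
    """Return True if this 512-byte block carries a valid tar header checksum."""
    if len(block) != BLOCK:
        return False
    field = block[148:156].strip(b" \0")  # checksum field, octal ASCII
    if not field:
        return False  # zero-filled blocks are padding, not headers
    try:
        stored = int(field, 8)
    except ValueError:
        return False
    # The checksum is the sum of all header bytes, with the checksum
    # field itself counted as eight spaces.
    computed = sum(block[:148]) + 8 * ord(" ") + sum(block[156:])
    return stored == computed


def scan_for_headers(data: bytes):
    """Yield every byte offset, aligned or not, where a plausible header starts."""
    for offset in range(len(data) - BLOCK + 1):
        if header_checksum_ok(data[offset:offset + BLOCK]):
            yield offset


def demo_archive() -> bytes:
    """Build a small two-member archive in memory (hypothetical test data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        for name, size in (("a.txt", 700), ("b.txt", 100)):
            info = tarfile.TarInfo(name)
            info.size = size
            tf.addfile(info, io.BytesIO(b"x" * size))
    return buf.getvalue()


if __name__ == "__main__":
    good = demo_archive()
    # Simulate damage that inserts 3 stray bytes inside the first member.
    bad = good[:1000] + b"\xff\xff\xff" + good[1000:]
    print(list(scan_for_headers(bad)))
```

In the damaged copy the second header no longer sits on a 512-byte boundary, which is why a scan restricted to aligned offsets would miss it.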

--
-----------------------------------------------------------------------------
| Phil Howard KA9WGN | http://linuxhomepage.com/ http://ham.org/ |
| (first name) at ipal.net | http://phil.ipal.org/ http://ka9wgn.ham.org/ |
-----------------------------------------------------------------------------
Michael Powe

2005-12-26, 3:14 am

>>>>> "phil" == phil-news-nospam <phil-news-nospam@ipal.net> writes:

phil> [followup set to comp.unix.admin only]
phil> In comp.unix.admin Michael Powe <michael+gnus@trollope.org> wrote:

phil> | I have a compressed archive that I created using GNU tar
phil> on a gentoo | system. The tar file is about 900 MB
phil> compressed and about 2.5 GB | uncompressed.
phil> |
phil> | This archive has bad data about 1/3 to 1/2 way through it.
phil> I'm trying | to find a way to get at the data below the
phil> damaged part. At the point | of failure, tar reports:

phil> [etc]

phil> Possibly the damage is in a form that effects the relative
phil> offset of the data, for example the addition or deletion of
phil> an amount of data that is not a multiple of 512 bytes. What
phil> a tar recovery program would have to do is look for what
phil> appears to be a valid tar header using every level of byte
phil> offset to find something. Assuming only one point of
phil> corruption, this might be doable from the end of the file
phil> working in reverse to the front. What is the exact number
phil> of bytes of the uncompressed copy?

1010083339 2005-12-24 15:44 powem12242005.tar.gz
2469939203 2005-12-25 20:41 powem12242005.tar.recovered

The "recovered" file is generated by gzrecover from the gzip Recovery Toolkit
(http://www.urbanophile.com/arenn/coding/gzrt/gzrt.html). Standard
gzip will not unpack the file and exits with a CRC error.

BTW, if I had some pointers on analyzing these tar files, I have some
knowledge of Java and Perl and I'd be willing to spend some time
trying to "reverse engineer" the file, so to speak. I have no
experience handling binary files in that manner, though.

Thanks.

mp


--
Michael Powe michael@trollope.org Waterbury CT
ENOSIG: signature file is empty
Chuck F.

2005-12-26, 3:14 am

Michael Powe wrote:
>
> I have a compressed archive that I created using GNU tar on a
> gentoo system. The tar file is about 900 MB compressed and
> about 2.5 GB uncompressed.
>
> This archive has bad data about 1/3 to 1/2 way through it. I'm
> trying to find a way to get at the data below the damaged part.
> At the point of failure, tar reports:


I think you can forget about it. Modern compressors operate by
specifying a position and length of previous data to copy as new
data, using a window of from 4k to 64k or more into the old data
(measured from the point of expansion). Once the data is fouled
that reference is gone, and there is no way to recover.
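
To see the effect described here, the following Python sketch (an illustration with made-up sample data, using the standard zlib module rather than gzip itself) flips one byte in the middle of a deflate stream and then inflates as far as it can. Everything decoded before the damaged byte comes out intact; what follows is either rejected or no longer trustworthy.

```python
import random
import zlib


def make_sample(n_words: int = 40000, seed: int = 0) -> bytes:
    """Generate repeatable, text-like sample data (hypothetical test input)."""
    rng = random.Random(seed)
    words = [b"tar", b"gzip", b"header", b"block", b"offset", b"archive"]
    return b" ".join(rng.choice(words) for _ in range(n_words))


def inflate_as_far_as_possible(stream: bytes, chunk: int = 64):
    """Feed the stream in small chunks; return (output, finished_cleanly)."""
    inflater = zlib.decompressobj()
    out = bytearray()
    for start in range(0, len(stream), chunk):
        try:
            out += inflater.decompress(stream[start:start + chunk])
        except zlib.error:
            return bytes(out), False  # the decoder gave up in this chunk
    return bytes(out), inflater.eof


def flip_middle_byte(stream: bytes) -> bytes:
    """Return a copy of the stream with its middle byte inverted."""
    damaged = bytearray(stream)
    damaged[len(damaged) // 2] ^= 0xFF
    return bytes(damaged)


if __name__ == "__main__":
    original = make_sample()
    packed = zlib.compress(original)
    recovered, clean = inflate_as_far_as_possible(flip_middle_byte(packed))
    print("finished cleanly:", clean)
    print("bytes out:", len(recovered), "of", len(original))
```

Feeding the stream in small chunks is a deliberate choice: output produced by earlier chunks is kept even when a later chunk makes the decoder fail.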

The same applies to LZW compression, although the mechanism is
different. There, there is a remote chance that the compressor has
detected a poor compression ratio and decided to reinitialize its
tables, in which case some recovery would be possible. However
that mechanism hasn't been used seriously for 20 years or so, due
to patent problems.

That is one advantage of the zip format, i.e. each individual
compressed file stands alone, so an error will only lose one file.
With the tar format all files are basically concatenated into
one, and the result may then be compressed.
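
The contrast can be sketched in Python with the standard zipfile module. The archive contents and helper names below are invented for the illustration, and the damage is assumed to fall inside one member's compressed data, leaving the central directory at the end of the zip file intact.

```python
import io
import random
import struct
import zipfile
import zlib


def demo_zip(names=("a.txt", "b.txt", "c.txt")) -> bytes:
    """Build a small deflate-compressed zip in memory (hypothetical test data)."""
    rng = random.Random(0)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            body = " ".join(str(rng.randrange(10**6)) for _ in range(2000))
            zf.writestr(name, body)
    return buf.getvalue()


def damage_member(zip_bytes: bytes, name: str) -> bytes:
    """Invert one byte in the middle of the named member's compressed data."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        info = zf.getinfo(name)
    off = info.header_offset
    # The local file header is 30 fixed bytes, then the name and extra field.
    name_len, extra_len = struct.unpack("<HH", zip_bytes[off + 26:off + 30])
    target = off + 30 + name_len + extra_len + info.compress_size // 2
    damaged = bytearray(zip_bytes)
    damaged[target] ^= 0xFF
    return bytes(damaged)


def readable_members(zip_bytes: bytes) -> dict:
    """Map each member name to whether it still extracts and passes its CRC."""
    status = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            try:
                zf.read(info.filename)
                status[info.filename] = True
            except (zipfile.BadZipFile, zlib.error, EOFError):
                status[info.filename] = False
    return status


if __name__ == "__main__":
    print(readable_members(damage_member(demo_zip(), "b.txt")))
```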

To avoid this sort of foul up in the future, you would be well
advised to insist on ECC memory in all your systems.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Dan Espen

2005-12-26, 3:14 am


Michael Powe wrote:
>
> I have a compressed archive that I created using GNU tar on a
> gentoo system. The tar file is about 900 MB compressed and
> about 2.5 GB uncompressed.
> This archive has bad data about 1/3 to 1/2 way through it. I'm
> trying to find a way to get at the data below the damaged part.
> At the point of failure, tar reports:


Google gives me hits on 'tar file recovery'.
The first one I looked at was commercial but seemed to have
a demo package.

I'd also try Shilling's 'star':

http://freshmeat.net/projects/star
Michael Powe

2005-12-26, 5:56 pm


I have looked at the various 'google' options but without joy. The
"Advanced Tar Repair" tool failed.

Thanks.

mp
Jules

2005-12-26, 5:56 pm

On Mon, 26 Dec 2005 11:40:26 -0500, Michael Powe wrote:

>
> I have looked at the various 'google' options but without joy. The
> "Advanced Tar Recover" tool failed.


Best option is to make absolutely certain that you can't get the damaged
section of the data back - if the reason for the corruption is due to the
source media (tape or disk say) then there are sometimes options there to
recover bad data...

And in the future, don't compress archives of anything critical, just
because of potential problems like this

cheers

Jules

Jean-David Beyer

2005-12-26, 5:56 pm

Jules wrote (in part):

> And in the future, don't compress archives of anything critical, just
> because of potential problems like this
>

How do tape drives that do _hardware compression_ get around this problem?

I imagine they compress one block at a time. On my tape drives, they have a
2 Megabyte buffer in the drive, and it normally writes 65536-byte blocks to
the tape. They employ a 4 layer Reed Solomon error detection and correction
scheme, and use what they call Adaptive Lossless Data Compression (whatever
that is). I am also not sure what "4 layer Reed Solomon error detection and
correction" is. I know they do lateral and longitudinal and diagonal parity
checking of each block. Reed Solomon technique is well known, but I do not
know what the 4-layer is all about.

http://www.siam.org/siamnews/mtc/mtc193.htm

I always have hardware compression on (it is the default for the drive), and
have never lost anything. But that is 2 data points (2 drives).

--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 16:00:00 up 30 days, 2:32, 5 users, load average: 4.30, 4.23, 3.98
Michael Powe

2005-12-26, 5:56 pm

>>>>> "Jules" == Jules <julesrichardsonuk@remove.this.yahoo.co.uk> writes:

Jules> On Mon, 26 Dec 2005 11:40:26 -0500, Michael Powe wrote:

Jules> Best option is to make absolutely certain that you can't
Jules> get the damaged section of the data back - if the reason
Jules> for the corruption is due to the source media (tape or disk
Jules> say) then there are sometimes options there to recover bad
Jules> data...

I have two copies of the archive, on two machines. (Both slackware
linux 10.2.) It exhibits the same behavior and file sizes on both
machines.

Jules> And in the future, don't compress archives of anything
Jules> critical, just because of potential problems like this

Yes, I'd gone so long without any problems that I was cocky. I
created the archive, scp'ed it to the other machine and blew away the
original. Because there was no problem with creating the archive, I
did not even think about a simple 'tar ztf' before proceeding. My bad
there, for sure.
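
A check along those lines, sketched in Python as an illustration (the function names are made up here, and running 'gzip -t' followed by 'tar ztf' from the shell would serve the same purpose): read the whole gzip stream so that its CRC is verified, then walk the tar headers, and only then consider deleting the original.

```python
import gzip
import io
import random
import tarfile
import zlib


def verify_tar_gz(fileobj) -> list:
    """Return the member names if the whole .tar.gz reads cleanly; raise otherwise."""
    # Pass 1: decompress everything so the gzip CRC-32 trailer gets checked.
    fileobj.seek(0)
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
        while gz.read(1 << 20):
            pass
    # Pass 2: walk every tar header, much as 'tar ztf' would.
    fileobj.seek(0)
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tf:
        return tf.getnames()


def is_intact(fileobj) -> bool:
    """True only if both passes complete without an error."""
    try:
        verify_tar_gz(fileobj)
        return True
    except (OSError, EOFError, zlib.error, tarfile.TarError):
        return False


def demo_tar_gz() -> bytes:
    """Build a small .tar.gz in memory (hypothetical test data)."""
    rng = random.Random(1)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for name in ("etc/hosts", "etc/fstab"):
            body = " ".join(str(rng.randrange(10**6)) for _ in range(500)).encode()
            info = tarfile.TarInfo(name)
            info.size = len(body)
            tf.addfile(info, io.BytesIO(body))
    return buf.getvalue()


if __name__ == "__main__":
    good = demo_tar_gz()
    damaged = bytearray(good)
    damaged[len(damaged) // 2] ^= 0xFF  # simulate one corrupted byte
    print("good copy intact:", is_intact(io.BytesIO(good)))
    print("damaged copy intact:", is_intact(io.BytesIO(bytes(damaged))))
```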

Very frustrating. I've spent most of the day recreating the most
important configuration files that are "lost" in that archive. I
found an old, old backup so I didn't have to go back to scratch, but
still it is tedious. Good thing my wife is out all day doing her
"post Christmas" shopping. No "chores." ;-)

Thanks.

mp

--
Michael Powe michael@trollope.org Naugatuck CT USA
"When a person behaves in keeping with his conscience, when he
tries to speak as a citizen even under conditions where
citizenship is degraded, it may not lead to anything, yet it might.
But what surely will not lead to anything is when a person calculates
whether it will lead to something or not." -- Vaclav Havel, 1989
Chuck F.

2005-12-27, 2:50 am

Jean-David Beyer wrote:
> Jules wrote (in part):
>
> How do tape drives that do _hardware compression_ get around
> this problem?
>
> I imagine they compress one block at a time. On my tape drives,
> they have a 2 Megabyte buffer in the drive, and it normally
> writes 65536-byte blocks to the tape. They employ a 4 layer Reed
> Solomon error detection and correction scheme, and use what they
> call Adaptive Lossless Data Compression (whatever that is). I am
> also not sure what "4 layer Reed Solomon error detection and
> correction" is. I know they do lateral and longitudinal and
> diagonal parity checking of each block. Reed Solomon technique
> is well known, but I do not know what the 4-layer is all about.
>
> http://www.siam.org/siamnews/mtc/mtc193.htm
>
> I always have hardware compression on (it is the default for the
> drive), and have never lost anything. But that is 2 data points
> (2 drives).


You might look into using ARJ for this sort of archival
compression. It has provisions for generating the extra
information needed to recover from faulty media. I don't know the
algorithms used, nor have I really tested the result. It is free
for personal use, but may not be available for Linux.

<http://www.arjsoftware.com>

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Jean-David Beyer

2005-12-27, 6:06 pm

Chuck F. wrote:
> Jean-David Beyer wrote:
>
>
>
> You might look into using ARJ for this sort of archival compression. It
> has provisions for generating the extra information needed to recover
> from faulty media. I don't know the algorithms used, nor have I really
> tested the result. It is free for personal use, but may not be
> available for Linux.
>
> <http://www.arjsoftware.com>
>

I looked there, but they do not describe the algorithm used, so I cannot
tell if it is superior to the hardware algorithm used in my tape drive itself.

--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 08:55:00 up 30 days, 19:27, 5 users, load average: 4.12, 4.18, 4.13
phil-news-nospam@ipal.net

2005-12-27, 6:06 pm

On 25 Dec 2005 22:19:22 -0500 Michael Powe <michael+gnus@trollope.org> wrote:
|>>>>> "phil" == phil-news-nospam <phil-news-nospam@ipal.net> writes:
|
| phil> [followup set to comp.unix.admin only]
| phil> In comp.unix.admin Michael Powe <michael+gnus@trollope.org> wrote:
|
| phil> | I have a compressed archive that I created using GNU tar
| phil> on a gentoo | system. The tar file is about 900 MB
| phil> compressed and about 2.5 GB | uncompressed.
| phil> |
| phil> | This archive has bad data about 1/3 to 1/2 way through it.
| phil> I'm trying | to find a way to get at the data below the
| phil> damaged part. At the point | of failure, tar reports:
|
| phil> [etc]
|
| phil> Possibly the damage is in a form that effects the relative
| phil> offset of the data, for example the addition or deletion of
| phil> an amount of data that is not a multiple of 512 bytes. What
| phil> a tar recovery program would have to do is look for what
| phil> appears to be a valid tar header using every level of byte
| phil> offset to find something. Assuming only one point of
| phil> corruption, this might be doable from the end of the file
| phil> working in reverse to the front. What is the exact number
| phil> of bytes of the uncompressed copy?
|
| 1010083339 2005-12-24 15:44 powem12242005.tar.gz
| 2469939203 2005-12-25 20:41 powem12242005.tar.recovered
|
| The "recovered" file is generated by gzrecover from GNU Recovery Tool Kit
| (http://www.urbanophile.com/arenn/coding/gzrt/gzrt.html). Standard
| gzip will not unpack the file and exits with a CRC error.
|
| BTW, if I had some pointers on analyzing these tar files, I have some
| knowledge of Java and PERL and I'd be willing to spend some time
| trying to "reverse engineer" the file, so to speak. I have no
| experience handling binary files in that manner, though.

If the damage was done to the compressed file, that could result in the
uncompressed data being totally corrupt. The recovery tool would be a
best effort attempt.

--
-----------------------------------------------------------------------------
| Phil Howard KA9WGN | http://linuxhomepage.com/ http://ham.org/ |
| (first name) at ipal.net | http://phil.ipal.org/ http://ka9wgn.ham.org/ |
-----------------------------------------------------------------------------
Doug Freyburger

2005-12-27, 6:06 pm

Michael Powe wrote:
>
> phil> | This archive has bad data about 1/3 to 1/2 way through it.
> phil> I'm trying | to find a way to get at the data below the
> phil> damaged part. At the point | of failure, tar reports:
>
> 1010083339 2005-12-24 15:44 powem12242005.tar.gz
> 2469939203 2005-12-25 20:41 powem12242005.tar.recovered


If the damage is in the tar phase not the compression phase, then
playing around with:

dd if=powem12242005.tar.recovered skip=N | tar tvf - | head

might help. It's a huge "if" and it would take a lot of fishing to
locate any header after the corrupt section, especially not knowing
in advance whether the damage is in the compression, in which case
there isn't any good point after the corruption.

phil-news-nospam@ipal.net

2005-12-27, 6:06 pm

On 27 Dec 2005 07:57:20 -0800 Doug Freyburger <dfreybur@yahoo.com> wrote:
| Michael Powe wrote:
|>
|> phil> | This archive has bad data about 1/3 to 1/2 way through it.
|> phil> I'm trying | to find a way to get at the data below the
|> phil> damaged part. At the point | of failure, tar reports:
|>
|> 1010083339 2005-12-24 15:44 powem12242005.tar.gz
|> 2469939203 2005-12-25 20:41 powem12242005.tar.recovered
|
| If the damage is in the tar phase not the compression phase, then
| playing around with:
|
| dd if=powem12242005.tar.recovered skip=N | tar tvf - | head
|
| might help. It's a huge "if" and it would take a lot of fishing to
| locate any header after the corrupt section, especially not knowing
| in advance whether the damage is in the compression, in which case
| there isn't any good point after the corruption.

The size of his recovered file, 2469939203 bytes, is 4824100 * 512 + 3.
I think the first issue is to find whereabouts those extra 3 bytes are.
Then increment in 512-byte steps, in alignment with the end of the file,
to see where a valid tar header might be found, if the gz recovery was
able to re-establish decompression state correctly.
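
The arithmetic is easy to check, and the end-aligned walk can be sketched in a few lines of Python. This is an illustration only; the helper names are made up for the example, and the idea that headers after the damage sit at offsets congruent to 3 modulo 512 assumes a single point of corruption.

```python
BLOCK = 512  # tar block size in bytes


def misalignment(size: int) -> tuple:
    """Return (whole_blocks, stray_bytes) for a file of the given size."""
    return divmod(size, BLOCK)


def end_aligned_offsets(size: int):
    """Yield candidate header offsets, last block first, stepping back
    512 bytes at a time so that each one stays aligned with the end of file."""
    offset = size - BLOCK
    while offset >= 0:
        yield offset
        offset -= BLOCK


if __name__ == "__main__":
    print(misalignment(2469939203))
    # First few candidates, counted back from the end of the recovered file.
    walker = end_aligned_offsets(2469939203)
    print([next(walker) for _ in range(3)])
```

Walking back from the end, the candidates should stop looking like headers once the walk crosses the damaged region, which brackets where the stray bytes were introduced.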

--
-----------------------------------------------------------------------------
| Phil Howard KA9WGN | http://linuxhomepage.com/ http://ham.org/ |
| (first name) at ipal.net | http://phil.ipal.org/ http://ka9wgn.ham.org/ |
-----------------------------------------------------------------------------
John DuBois

2005-12-27, 8:54 pm

In article <pan.2005.12.26.20.32.46.645320@remove.this.yahoo.co.uk>,
Jules <julesrichardsonuk@remove.this.yahoo.co.uk> wrote:
>On Mon, 26 Dec 2005 11:40:26 -0500, Michael Powe wrote:
>
>
>Best option is to make absolutely certain that you can't get the damaged
>section of the data back - if the reason for the corruption is due to the
>source media (tape or disk say) then there are sometimes options there to
>recover bad data...
>
>And in the future, don't compress archives of anything critical, just
>because of potential problems like this


Note that bzip & bzip2 do block compression (that's what the 'b' is for),
and bzip2recover lets you split a bzip2 file into those blocks so that you can
recover data from any that are not corrupted.

John
--
John DuBois spcecdt@armory.com KC6QKZ/AE http://www.armory.com/~spcecdt/
Michael Powe

2005-12-29, 8:52 pm

>>>>> "phil-news-nospam" == phil-news-nospam <phil-news-nospam@ipal.net> writes:

phil-news-nospam> On 27 Dec 2005 07:57:20 -0800 Doug Freyburger <dfreybur@yahoo.com> wrote:
phil-news-nospam> | Michael Powe wrote:
phil-news-nospam> |>
phil-news-nospam> |> phil> | This archive has bad data about 1/3 to 1/2 way through it.
phil-news-nospam> |> phil> I'm trying | to find a way to get at the data below the
phil-news-nospam> |> phil> damaged part. At the point | of failure, tar reports:
phil-news-nospam> |>
phil-news-nospam> |> 1010083339 2005-12-24 15:44 powem12242005.tar.gz
phil-news-nospam> |> 2469939203 2005-12-25 20:41 powem12242005.tar.recovered
phil-news-nospam> |
phil-news-nospam> | If the damage is in the tar phase not the compression
phil-news-nospam> | phase, then playing around with:
phil-news-nospam> |
phil-news-nospam> | dd if=powem12242005.tar.recovered skip=N | tar tvf - | head
phil-news-nospam> |
phil-news-nospam> | might help. It's a huge "if" and it would take a lot of
phil-news-nospam> | fishing to locate any header after the corrupt section,
phil-news-nospam> | especially not knowing in advance whether the damage is in
phil-news-nospam> | the compression, in which case there isn't any good point
phil-news-nospam> | after the corruption.

phil-news-nospam> The size of his recovered file, 2469939203 bytes, is
phil-news-nospam> 4824100 * 512 + 3. I think the first issue is to find
phil-news-nospam> whereabouts those extra 3 bytes are. Then increment in
phil-news-nospam> 512-byte steps, in alignment with the end of the file, to
phil-news-nospam> see where a valid tar header might be found, if the gz
phil-news-nospam> recovery was able to re-establish decompression state
phil-news-nospam> correctly.

How can I tell what a tar header looks like? Where can I find out?
When I 'head -c 512' or 'tail -c 512' the file, I see a mixture of text from text
files and binary data, but I don't see anything identifiable.

(Linux 2.4.31~ellen) [powem] [ /home]
520 $ --> !head
head -c 512 powem12242005.tar.recovered
powem/#_ascp_bcgaw@cgaw.org_b_ahome_acgaw_apublic__html_alayout. html#00006000001750000014400000000000103
43042116026673 0ustar powemusers00000000000000

Thanks.

mp

--
Michael Powe michael@trollope.org Naugatuck CT USA

"The secret to strong security: less reliance on secrets."
-- Whitfield Diffie
Doug Freyburger

2005-12-30, 5:55 pm

Michael Powe wrote:
>
> How can I tell what a tar header looks like? Where can I find out?


man tar. Look in the see also section. man 4 tar
man file. Look in the see also section. man magic, more /etc/magic.

This is how UNIX documentation works. It takes a while to get
comfortable with how the man pages work.

> When I 'head -c 512' or 'tail -c 512' the file, I see a mixture of text from text
> files and binary data, but I don't see anything identifiable.


Yup, straight out of the section 4 doc on it. It needs binary for the
format, and text because what good is a filename that isn't text?

Carl Lowenstein

2005-12-31, 7:49 am

In article <87fyobl8u0.fsf@ellen.trollope.org>,
>
>How can I tell what a tar header looks like? Where can I find out?
>When I 'head -c 512' or 'tail -c 512' the file, I see a mixture of text
>from text
>files and binary data, but I don't see anything identifiable.
>
>(Linux 2.4.31~ellen) [powem] [ /home]
> 520 $ --> !head
>head -c 512 powem12242005.tar.recovered
>powem/#_ascp_bcgaw@cgaw.org_b_ahome_acgaw_apublic__html_alayout. html#00006000001750000014400000000000103
>43042116026673 0ustar powemusers00000000000000


In the olden days, man 5 tar would get you an explanation of the tar header.
This seems to be gone from anything touched by GNU.

Maybe you can find some other flavor of Unix system to look at. Or read
the source for GNU tar to get some clues.

carl

--
carl lowenstein marine physical lab u.c. san diego
clowenst@ucsd.edu
Bill Vermillion

2005-12-31, 5:54 pm

In article <dp50r3$g5a$1@news1.ucsd.edu>,
Carl Lowenstein <cdl@deeptow.ucsd.edu> wrote:
>In article <87fyobl8u0.fsf@ellen.trollope.org>,
>In the olden days, man 5 tar would get you an explanation of the tar header.
>This seems to be gone from anything touched by GNU.


>Maybe you can find some other flavor of Unix system to look at. Or read
>the source for GNU tar to get some clues.


man 5 tar on a FreeBSD system will list the tar structures for
four different tar implementations going back to the original tar,
ustar [Unix standard tar], gnutar, and one for sparse headers.
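
For readers without a tar(5) page at hand, here is a rough Python sketch of the ustar header layout (field widths as in the POSIX ustar format; the helper names are made up for the example, and the base-256 numeric fields that GNU tar uses for very large values are not handled). Read this way, the digits in the 'head -c 512' output quoted earlier in the thread are the octal mode, uid and gid fields: 0000600, 0001750 (uid 1000) and 0000144 (gid 100), followed later by the 'ustar' magic and the owner names powem and users.

```python
import struct
import tarfile

# ustar header: 16 fields followed by 12 bytes of padding, 512 bytes in all.
USTAR = struct.Struct("=100s8s8s8s12s12s8sc100s6s2s32s32s8s8s155s12x")
FIELDS = ("name", "mode", "uid", "gid", "size", "mtime", "chksum", "typeflag",
          "linkname", "magic", "version", "uname", "gname",
          "devmajor", "devminor", "prefix")
NUMERIC = ("mode", "uid", "gid", "size", "mtime", "chksum")


def parse_header(block: bytes) -> dict:
    """Split a 512-byte header block into named fields."""
    header = {}
    for key, raw in zip(FIELDS, USTAR.unpack(block)):
        if key in NUMERIC:
            # Numbers are stored as octal ASCII digits, NUL or space terminated.
            header[key] = int(raw.strip(b" \0") or b"0", 8)
        else:
            # Text fields are NUL terminated and NUL padded.
            header[key] = raw.split(b"\0", 1)[0].decode("latin-1")
    return header


def demo_header() -> bytes:
    """Build one header block with Python's tarfile (hypothetical sample values)."""
    info = tarfile.TarInfo("powem/layout.html")
    info.mode, info.uid, info.gid, info.size = 0o600, 1000, 100, 42
    info.uname, info.gname = "powem", "users"
    return info.tobuf(format=tarfile.USTAR_FORMAT)


if __name__ == "__main__":
    for key, value in parse_header(demo_header()).items():
        print(f"{key:9} {value!r}")
```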

Bill
--
Bill Vermillion - bv @ wjv . com
ynotssor

2006-01-01, 6:01 pm

"Michael Powe" <michael+gnus@trollope.org> wrote in message
news:u64pdnebu.fsf@trollope.org

> I have a compressed archive that I created using GNU tar on a gentoo
> system. The tar file is about 900 MB compressed and about 2.5 GB
> uncompressed.


You will of course have verified that your filesystem is capable of sizes >
2 GB?
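
One possible reading of that hint, offered as an assumption rather than a diagnosis: "Value too large for defined data type" is the usual text for the EOVERFLOW error, which a program built without large-file support reports when a file does not fit a signed 32-bit offset. A quick check in Python shows that the recovered file is past that limit while the compressed one is not. The function name is made up for the example.

```python
# Largest file offset a signed 32-bit off_t can represent (2 GiB - 1).
OFF_T_32_MAX = 2**31 - 1


def needs_large_file_support(size_in_bytes: int) -> bool:
    """True if a file of this size cannot be addressed with a 32-bit off_t."""
    return size_in_bytes > OFF_T_32_MAX


if __name__ == "__main__":
    for name, size in (("powem12242005.tar.gz", 1010083339),
                       ("powem12242005.tar.recovered", 2469939203)):
        print(name, size, needs_large_file_support(size))
```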

> This archive has bad data about 1/3 to 1/2 way through it. I'm trying
> to find a way to get at the data below the damaged part. At the point
> of failure, tar reports:
>
> tar: skipping to next header
> tar: archive contains obsolescent base-64 headers
> tar: error exit delayed from previous errors

....
> If anyone can tell me how I can recover this data, I will be eternally
> grateful.


You might try modifying the code (e.g., s:/dev/rfd0:/your/tar/filename: ) in
http://paxutils.progiciels-bpi.ca/s...lvaging&index=3
then compile and run it to see if that works around the corruption.

David Serrano (Hue-Bond)

2006-02-02, 5:54 pm

Michael Powe, dom20051225@20:53:25(CET):
>


Maybe I'm a bit late but anyway...


> This archive has bad data about 1/3 to 1/2 way through it. I'm trying
> to find a way to get at the data below the damaged part.


> I used the gzip recovery kit to unpack the file. This worked and gave
> me the 2.5 GB uncompressed file. However, tar will only unpack up to
> the damaged part, and even with --ignore-failed-read, blows up.


I once had the same problem and found a Perl script on the web that made
my day. Let's reproduce a session:

$ tar cf bin.tar /bin/
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
$ ll bin.tar
-rw------- 1 hue hue 2836480 20060202:233334+0100 bin.tar
$ dd if=/dev/urandom of=bin.tar bs=1k seek=1k count=300 conv=notrunc
300+0 records in
300+0 records out
307200 bytes (307 kB) copied, 0,101178 seconds, 3,0 MB/s
$ tar tf bin.tar
bin/
bin/bash
bin/rbash
bin/sh
bin/cat
bin/chgrp
bin/chmod
bin/chown
bin/cp
bin/date
bin/dd
bin/df
bin/dir
tar: Skipping to next header
tar: Archive contains obsolescent base-64 headers
tar: Error exit delayed from previous errors
$ perl ~hue/lang/perl/find_tar_headers.pl bin.tar
bin.tar:0:bin/:0
bin.tar:512:bin/bash:2467624
bin.tar:685056:bin/rbash:0
bin.tar:685568:bin/sh:0
bin.tar:686080:bin/cat:40550
bin.tar:703488:bin/chgrp:100664
bin.tar:737280:bin/chmod:73560
bin.tar:768512:bin/chown:105250
bin.tar:804864:bin/cp:153524
bin.tar:860672:bin/date:126274
bin.tar:905728:bin/dd:113410
bin.tar:945152:bin/df:103344
bin.tar:980480:bin/dir:226110
bin.tar:1368064:bin/vdir:226110
bin.tar:1445888:bin/sleep:33104
bin.tar:1460736:bin/stty:110140
[...]
$ _

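Comparing the two listings, the first header found beyond the last member
tar could list (bin/dir) is that of vdir, at offset 1368064. (The last
column is the size field as stored in the header, in octal.) So that dd
can use a 1 KiB block size and run fast, we convert the offset to KiB: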

$ bc
[...]
scale=4
1368064/1024
1336.0000
$ dd if=bin.tar of=bin-tail.tar bs=1k skip=1336
1434+0 records in
1434+0 records out
1468416 bytes (1,5 MB) copied, 0,017176 seconds, 85,5 MB/s
$ file bin-tail.tar
bin-tail.tar: POSIX tar archive
$ tar xf bin.tar
tar: Skipping to next header
tar: Archive contains obsolescent base-64 headers
tar: Error exit delayed from previous errors
$ tar xf bin-tail.tar
$ rm -f bin/dir ## probably corrupted

And now we have everything recoverable:

$ ls bin|wc -l
83
$ ls /bin/|wc -l
95
$ _
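
The offset arithmetic and the tail extraction from this session can also be sketched in Python, here against an in-memory archive rather than a file on disk. The helper names and the tiny demo archive are assumptions for the illustration. Note that dd's skip= counts blocks of size bs, so the header offset must be a whole multiple of bs; 1368064 is exactly 1336 * 1024.

```python
import io
import tarfile


def dd_skip(offset: int, bs: int = 1024) -> int:
    """Return the skip= count for dd, insisting on a whole number of blocks."""
    count, rest = divmod(offset, bs)
    if rest:
        raise ValueError(f"offset {offset} is not a multiple of bs={bs}")
    return count


def list_tail(data: bytes, offset: int) -> list:
    """Open the bytes from offset onward as a tar archive; return member names."""
    with tarfile.open(fileobj=io.BytesIO(data[offset:]), mode="r:") as tf:
        return tf.getnames()


def demo_archive() -> bytes:
    """Build a three-member archive in memory (hypothetical test data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        for name, size in (("a.txt", 700), ("b.txt", 100), ("c.txt", 50)):
            info = tarfile.TarInfo(name)
            info.size = size
            tf.addfile(info, io.BytesIO(b"x" * size))
    return buf.getvalue()


if __name__ == "__main__":
    print("skip =", dd_skip(1368064))
    # In the demo archive the third header starts at byte 2560.
    print(list_tail(demo_archive(), 2560))
```

As in the session, everything before the chosen offset is simply discarded, so this only helps once an intact header has been located past the damage.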

This is find_tar_headers.pl:

-----------------------------
#!/usr/bin/perl -w
use strict;

# 99.9% of all credits for this script go
# to Tore Skjellnes <torsk@elkraft.ntnu.no>
# who is the originator.

my $tarfile;
my $c;
my $hit;
my $header;

# Magic bytes found at offset 257 of a tar header. The first pattern is
# "ustar" followed by two spaces and a NUL. If you don't get any results,
# comment out the line below, uncomment the line after it ("ustar", NUL,
# "00") and retry.
my @src = (ord('u'),ord('s'),ord('t'),ord('a'),ord('r'),ord(" "),ord(" "),0);
#my @src = (ord('u'),ord('s'),ord('t'),ord('a'),ord('r'),0,ord('0'),ord('0'));

die "No tar file given on command line" if $#ARGV != 0;

$tarfile = $ARGV[0];

open(IN, '<', $tarfile) or die "Could not open `$tarfile': $!";
binmode(IN);

$hit = 0;
$| = 1;
seek(IN,257,0) or die "Could not seek forward 257 characters in `$tarfile': $!";
while (read(IN,$c,1) == 1)
{
    # Match the magic one byte at a time. On a mismatch, start over,
    # letting the current byte count as a possible new first byte.
    unless (ord($c) == $src[$hit]) {
        $hit = (ord($c) == $src[0]) ? 1 : 0;
        next;
    }
    $hit = $hit + 1;
    next unless $hit > $#src;
    $hit = 0;

    # we have a probable header at (pos - 265)!
    my $pos = tell(IN) - 265;
    unless (seek(IN,$pos,0)) {
        warn "Could not seek to position $pos in `$tarfile': $!";
        next;
    }

    unless (read(IN,$header,512) == 512) {
        warn "Could not read 512 byte header at position $pos in `$tarfile': $!";
        seek(IN,$pos+265,0);
        next;
    }

    my ($name, $mode, $uid, $gid, $size, $mtime, $chksum, $typeflag,
        $linkname, $magic, $version, $uname, $gname,
        $devmajor, $devminor, $prefix)
        = unpack("Z100a8a8a8Z12a12a8a1a100a6a2a32a32a8a8Z155", $header);
    # The size field holds octal digits; they are printed unconverted.
    $size = int $size;
    printf("%s:%s:%s:%s\n",$tarfile,$pos,$name,$size);
}

close(IN) or warn "Error closing `$tarfile': $!";
-----------------------------

Good luck.


--
David Serrano