Unix Programming - Find a file's location on the disk/filesystem

af300wsm@gmail.com

2007-01-22, 1:16 pm

Hi,

I have a situation in which I need to find a file's location, or
address if you will, on the file system. I'm looking for a way of
actually finding which blocks the file occupies, rather than simply
determining on which file system it resides (or something similar).

I've used the system calls stat and fstat before, and I'm expecting
I'll be building on them. However, I see from the manual pages that
stat and fstat fill a stat structure and that none of the data members
of the stat structure actually store the block numbers, or at least
starting block number, of the file.

For example, a file wiping utility must determine a file's exact
location on the filesystem to actually wipe the file contents. How is
this done?

Andy

Pascal Bourguignon

2007-01-22, 1:16 pm

af300wsm@gmail.com writes:
> I have a situation in which I need to find a file's location or address
> if you will, on the file system. I'm looking for a way of actually
> finding what blocks the file occupies rather than simply determining on
> which file system it resides (or something similar).
>
> I've used the system calls stat and fstat before, and I'm expecting
> I'll be building on them. However, I see from the manual pages that
> stat and fstat fill a stat structure and that none of the data members
> of the stat structure actually store the block numbers, or at least
> starting block number, of the file.
>
> For example, a file wiping utility must determine a file's exact
> location on the filesystem to actually wipe the file contents. How is
> this done?


Well, the only thing stat can give you toward locating a file is the
device number and inode number that identify it. There is no standard
("unix") API to find the offset of the file on the device. By the way,
some files are virtual, e.g. those in the /proc filesystem on Linux.
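
To make the point about stat concrete, here is a minimal Python sketch (os.stat wraps the stat system call on Unix). The helper name identify_file and the dictionary keys are made up for illustration. It shows that stat yields a device number, an inode number, and a count of allocated units, but no block addresses.

```python
import os
import tempfile


def identify_file(path):
    """Return the location-related fields that stat() reports for a file."""
    st = os.stat(path)
    return {
        # (major, minor) of the device holding the file
        "device": (os.major(st.st_dev), os.minor(st.st_dev)),
        # inode number of the file on that device
        "inode": st.st_ino,
        # logical length of the file in bytes
        "size_bytes": st.st_size,
        # number of allocated units (commonly 512 bytes each); this is a
        # count only and says nothing about WHICH blocks are used
        "blocks_allocated": st.st_blocks,
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * 4096)
        tmp.flush()
        print(identify_file(tmp.name))
```

Nothing in the returned values can be turned into an on-disk offset without knowledge of the filesystem format.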

Anyway, assuming a regular file on a normal hard disk partition, you
should be able to map the device number to the partition where the
file is stored (how is kernel-dependent), and then open that partition
device (e.g. something like /dev/hda5 on Linux; some fun can be had
with devices such as /dev/md0 and others too...). In general, to open
these devices you need root access rights (or at least to be in
the disk group).

Once you've opened the partition device, you can read and write blocks
(a very dangerous thing to do on a mounted file system, and still a
dangerous thing to do on an unmounted one). The problem is that you
need to know what file system format is used, and to implement the
same file system routines (or extract them from the sources of your
kernel, or of fsck, if it's an open-source kernel).


Actually, the simplest way to do that would be to add a syscall to
the kernel that wipes the blocks of a file, implemented in the kernel.
A second simple way would be to add a syscall to the kernel that reports
the block numbers and ranges allocated to an inode with respect to a
device.

"Blocks and ranges", because some file systems like reiserfs may put data
from different files in the same block.

"With respect to a device", because you may have layers of devices: a
file stored on a soft-RAID device /dev/md0 can use blocks 1000, 1001
and 1002 of /dev/md0, but these blocks may map to blocks 50, 51, 52 of
/dev/hda3 and 2070, 2071, and 2072 of /dev/hdc7.
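
As a toy illustration of why a block number only means something "with respect to a device", the Python sketch below reproduces the numbers above. It assumes a purely hypothetical mirror in which each member device holds a copy at a fixed block offset. The function name and the offset table are invented for this example, and real md layouts are more involved.

```python
def map_mirror_block(logical_block, member_offsets):
    """Map a block of the layered device to the matching block on each member.

    member_offsets is a hypothetical table: member device -> fixed offset.
    """
    return {dev: logical_block + off for dev, off in member_offsets.items()}


# Offsets chosen only so that block 1000 of /dev/md0 lands on block 50 of
# /dev/hda3 and block 2070 of /dev/hdc7, as in the example above.
OFFSETS = {"/dev/hda3": 50 - 1000, "/dev/hdc7": 2070 - 1000}

if __name__ == "__main__":
    for block in (1000, 1001, 1002):
        print(block, map_mirror_block(block, OFFSETS))
```

The same logical block therefore has a different address on every layer, so any reporting interface has to say which device its block numbers refer to.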



--
__Pascal Bourguignon__ http://www.informatimago.com/

HEALTH WARNING: Care should be taken when lifting this product,
since its mass, and thus its weight, is dependent on its velocity
relative to the user.

Paul Pluzhnikov

2007-01-23, 7:31 am

af300wsm@gmail.com writes:

> I have a situation in which I need to find a file's location or address
> if you will, on the file system. I'm looking for a way of actually
> finding what blocks the file occupies rather than simply determining on
> which file system it resides (or something similar).


This info generally is not available outside the kernel, and even if
it were, it could change at a moment's notice -- some filesystems
move files around (when idle) to de-fragment themselves.

An additional complication is RAID, where the same data is spread
across several physical disks.

> I've used the system calls stat and fstat before, and I'm expecting
> I'll be building on them.


Your expectation is all wrong -- you can't get that info in user
space; you'll have to develop kernel modules (if your kernel
supports dynamically-loaded code).

I see that you are posting from Linux. Binary-only kernel modules
are difficult to do, because the kernel interface keeps changing,
so you'll likely have to distribute source and have the end user
compile it.

> For example, a file wiping utility must determine a file's exact
> location on the filesystem to actually wipe the file contents. How is
> this done?


You can overwrite file contents and *hope* that the new data is
written to the same physical disk blocks. If you really want to be
sure the data is gone, you wipe the whole disk partition (several
times), or you smash the drive with a sledge hammer.

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.

Pascal Bourguignon

2007-01-23, 7:31 am

Paul Pluzhnikov <ppluzhnikov-nsp@charter.net> writes:
> You can overwrite file contents and *hope* that the new data is
> written to the same physical disk blocks. If you really want to be
> sure the data is gone, you wipe the whole disk partition (several
> times), or you smash the drive with a sledge hammer.


I'd rather melt it.

It's possible to read a broken platter with lasers and recover the
data.

--
__Pascal Bourguignon__ http://www.informatimago.com/

This universe shipped by weight, not volume. Some expansion may have
occurred during shipment.

af300wsm@gmail.com

2007-01-23, 1:23 pm


Paul Pluzhnikov wrote:

> Your expectation is all wrong -- you can't get that info in user
> space; you'll have to develop kernel modules (if your kernel
> supports dynamically-loaded code).
>
> I see that you are posting from Linux. Binary-only kernel modules
> are difficult to do, because the kernel interface keeps changing,
> so you'll likely have to distribute source and have the end user
> compile it.


Fortunately, this is not for Linux. My work experience over the last
couple of years has left me less than impressed with Linux, but that's
another story.

>
>
> You can overwrite file contents and *hope* that the new data is
> written to the same physical disk blocks. If you really want to be
> sure the data is gone, you wipe the whole disk partition (several
> times), or you smash the drive with a sledge hammer.


I'm wondering if the following would show me what I want to know with
respect to the system wiping files:

1) create a file with a very specific byte pattern
2) sync contents to disk
3) wipe, according to specifications
4) after wipe, unmount the volume
5) scan the volume using the block device for the pattern written in the
file

Is this workable? Is this practical? One of the assumptions here is
that the file is written in blocks equal in size to a block in the
file system. Is this a bad assumption?

Basically, I'm thinking of writing a byte pattern in a 512-byte block.
Then, read the partition/filesystem in 512-byte blocks looking for this
pattern. Perhaps a larger block would be a better idea. Again,
is this idea workable and practical?
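
Step 5 could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function name find_pattern is made up, the path would be the unmounted partition's block device (which normally needs root to open), and the pattern is assumed to be stored contiguously. The scan keeps a small tail between reads so that a pattern straddling two reads is still found, which avoids depending on the 512-byte alignment assumption.

```python
import os
import tempfile


def find_pattern(path, pattern, chunk_size=1 << 20):
    """Return every byte offset in `path` at which `pattern` starts."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    offsets = []
    overlap = len(pattern) - 1  # bytes carried over between reads
    tail = b""
    base = 0  # offset within the file of the first byte of `tail + chunk`
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            window = tail + chunk
            start = 0
            while True:
                hit = window.find(pattern, start)
                if hit < 0:
                    break
                offsets.append(base + hit)
                start = hit + 1
            # Keep the last `overlap` bytes; they are too short to contain a
            # whole pattern, so no match is ever reported twice.
            tail = window[-overlap:] if overlap else b""
            base += len(window) - len(tail)
    return offsets


if __name__ == "__main__":
    marker = b"\xde\xad\xbe\xef" * 4
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 1000 + marker + b"\0" * 500)
    print(find_pattern(tmp.name, marker, chunk_size=512))
    os.unlink(tmp.name)
```

The demo runs against an ordinary temporary file; pointing it at a real device is left to the reader.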

Andy

Paul Pluzhnikov

2007-01-23, 8:03 pm

af300wsm@gmail.com writes:

> I'm wondering if the following would show me what I want to know with
> respect to the system wiping files:


I don't believe it will ...

> 1) create a file with a very specific byte pattern
> 2) sync contents to disk
> 3) wipe, according to specifications


According to what specifications?
I think you mean "wipe" the file by writing new data to it.

> 4) after wipe, unmount the volume
> 5) scan the volume using the block device for the pattern written in the
> file
>
> Is this workable?


Not sure what your question means.

Can you do that? Sure.
Does not finding the pattern anywhere on disk prove anything? No.

> One of the assumptions here is that the file is written in blocks
> equal in size to a block in the file system. Is this a bad
> assumption?


It is a bad assumption for some filesystems (such as the already
mentioned ReiserFS).

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.

David Schwartz

2007-01-24, 1:16 pm


On Jan 22, 8:29 am, af300...@gmail.com wrote:

> I have a situation in which I need to find a file's location or address
> if you will, on the file system. I'm looking for a way of actually
> finding what blocks the file occupies rather than simply determining on
> which file system it resides (or something similar).


This is not doable in general. For one thing, there's no guarantee the
file actually resides on the physical disk. For another thing, there's
no guarantee its data starts on a block boundary. Further, there's no
guarantee it will stay put. Lastly, there's no guarantee changing its
contents won't change its size.

> For example, a file wiping utility must determine a file's exact
> location on the filesystem to actually wipe the file contents. How is
> this done?


There are basically two ways to do this:

1) Overwrite the data with random data several times. Hope that the
filesystem actually overwrites the physical data.

2) Use special knowledge of the filesystem. For example, the filesystem
may have a 'secure delete' flag that you can set prior to unlinking the
file.
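
Approach 1 could look something like the Python sketch below. It is an illustration only: the function name and the default of three passes are arbitrary choices, and, as noted above, nothing here guarantees that the filesystem writes the new data over the same physical blocks.

```python
import os
import tempfile


def overwrite_in_place(path, passes=3, chunk_size=1 << 16):
    """Overwrite a file's contents with random bytes, `passes` times."""
    size = os.path.getsize(path)
    # "r+b" opens for update without truncating, so the length is preserved.
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to the device


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"SECRET" * 1000)
    overwrite_in_place(tmp.name)
    with open(tmp.name, "rb") as f:
        print(b"SECRET" in f.read())
    os.unlink(tmp.name)
```

The file would be unlinked only after the last pass. Whether the old blocks were really overwritten still depends on the filesystem and the storage underneath it.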

DS

af300wsm@gmail.com

2007-01-24, 1:16 pm



On Jan 23, 5:11 pm, Paul Pluzhnikov <ppluzhnikov-...@charter.net>
wrote:
> af300...@gmail.com writes:
>
> I don't believe it will ...


That's what I was wondering.

>
> I think you mean "wipe" the file by writing new data to it.
>
>
> Not sure what your question means.

I meant: is this approach meaningful? From the below, apparently not.

>
> Can you do that? Sure.
> Does not finding the pattern anywhere on disk prove anything? No.
>
>
> It is a bad assumption for some filesystems (such as the already
> mentioned ReiserFS).


OK, what about the file system for Lynx? I'm not working with ReiserFS
systems.

Thanks for all your input.

Andy

af300wsm@gmail.com

2007-01-24, 1:16 pm



On Jan 24, 7:08 am, "David Schwartz" <dav...@webmaster.com> wrote:
> On Jan 22, 8:29 am, af300...@gmail.com wrote:
>
> file actually resides on the physical disk.


Where else would it be? I'm guessing you mean that the file
contents are cached for a period of time before being written to disk,
or something similar. Correct?

> For another thing, there's
> no guarantee its data starts on a block boundary.


I was afraid of this one happening. I figured this could be a problem.

> Further, there's no
> guarantee it will stay put.


Is this due to housekeeping activities by the filesystem driver?

> Lastly, there's no guarantee changing its
> contents won't change its size.
>
>
> 1) Overwrite the data with random data several times. Hope that the
> filesystem actually overwrites the physical data.
>
> 2) Use special knowledge of the filesystem. For example, the filesystem
> may have a 'secure delete' flag that you can set prior to unlinking the
> file.
>


Is there some online location for finding specifications on various
filesystems? For example, where could I find documentation on how
ReiserFS, or FLFS (I believe Fast Lynx File System), or FFS (the Fast
File System used by FreeBSD) are defined?

Thanks,
Andy
