Duplicate file searching
Web Server forum

    Duplicate file searching  
m0rk




 
09-17-05 10:56 PM

Can anyone recommend software for finding true duplicate files
over the network? We're running everything on MS Windows 2000 (W2K), and the
users are copying sets of files all over the place. Many of them are
duplicates, but tracking them down by hand would be an impossible task.








    Re: Duplicate file searching  
lenneis@wu-wien.ac.at




 
10-02-05 10:47 PM


m0rk :

> Can anyone recommend software for finding true duplicate files
> over the network? We're running everything on MS Windows 2000 (W2K), and the
> users are copying sets of files all over the place. Many of them are
> duplicates, but tracking them down by hand would be an impossible task.

Generate a list of the files to be checked and compute a checksum for
each one, such as SHA-1 or MD5. Sort the list on the checksum and
duplicates will end up in adjacent positions. A shortish Perl script
could do the same job.

--

Joerg Lenneis

email: lenneis@wu-wien.ac.at
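
For anyone without md5sum or Perl on the Windows boxes, here is a minimal sketch of the checksum-and-sort approach described above, written in Python. It is only an illustration under some assumptions: Python is installed, the shares are reachable as ordinary paths, and the names file_md5 and find_duplicates are made up for this example. Matching MD5 sums make two files very likely identical, but a byte-by-byte comparison would be needed to be certain.

```python
import hashlib
import itertools
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root that share the same checksum."""
    sums = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sums.append((file_md5(path), path))
            except OSError:
                # Unreadable file (locked, no permission): skip it.
                continue

    # Sorting on the checksum puts duplicates in adjacent positions.
    sums.sort()

    duplicates = []
    for _checksum, entries in itertools.groupby(sums, key=lambda e: e[0]):
        paths = [path for _, path in entries]
        if len(paths) > 1:
            duplicates.append(paths)
    return duplicates
```

Calling find_duplicates on a directory returns one list per set of identical files, so printing each list with a blank line in between gives a report that can be reviewed by hand before deleting anything.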








    Re: Duplicate file searching  
RPR




 
10-08-05 10:49 PM

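#!/usr/bin/perl
# finddups.pl
# Lists groups of files whose MD5 sums are identical.
# Usage: find something -type f -print0 | xargs -0 md5sum | perl finddups.pl
use strict;
use warnings;

$| = 1;    # unbuffered output
my %h;     # checksum => list of file names

while (<>) {
    chomp;
    # print STDERR substr($_, 0, 70), qq(     \r);    # optional progress display
    # md5sum prints "<checksum>  <name>", or "<checksum> *<name>" in binary mode
    my ($sum, $name) = /^(\S+) [ *](.*)$/ or next;
    push @{ $h{$sum} }, $name;
}

# Print each group of duplicates, with blank lines between groups
foreach my $sum (keys %h) {
    print join(qq(\n), '', @{ $h{$sum} }, '') if @{ $h{$sum} } > 1;
}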









All times are GMT.