PC Review



Avoiding duplicate files using ln

 
 
brigman
 
      27th Mar 2008
A bit of a wish list, I know, but does anyone know of a tool that can
analyse an NTFS filesystem, keep a single instance of each duplicate
file in a separate folder, and link the other instances to it using ln?

For example, consider the directory listings:

D:\notebook01\c_drive_backup\file01.doc
D:\notebook01\c_drive_backup\file02.doc
D:\notebook01\c_drive_backup\file13.doc
D:\notebook01\c_drive_backup\file16.doc

and:

D:\notebook02\c_drive_backup\file03.doc
D:\notebook02\c_drive_backup\file04.doc
D:\notebook02\c_drive_backup\file13.doc
D:\notebook02\c_drive_backup\file19.doc

If the file common to both lists (file13.doc) were found to have the
same size and MD5 hash in both places, we could create a folder named
after that hash, copy the file into it, and link the originals to it
with ln for single-instance storage. The other issue is mopping up
'orphans' that are no longer referenced elsewhere in the filesystem.
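
To make the idea concrete, here is a rough sketch of that scheme in Python. It is only an illustration under some assumptions: the function names (find_duplicates, link_duplicates, remove_orphans) are made up for this post, Python's os.link stands in for the ln tool, the store folder is assumed to sit on the same volume as the backups (hard links cannot cross volumes), and an 'orphan' is taken to mean a file in the store whose link count has dropped to 1. Instead of copying the file into the hash folder, the sketch hard-links the first copy there, which has the same effect without the extra write.

```python
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(roots):
    """Group files under the given roots by (size, md5); keep groups of 2+."""
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, names in os.walk(root):
            for name in names:
                path = os.path.join(dirpath, name)
                by_size[os.path.getsize(path)].append(path)
    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, skip hashing
        for path in paths:
            groups[(size, md5_of(path))].append(path)
    return {key: sorted(p) for key, p in groups.items() if len(p) > 1}


def link_duplicates(groups, store):
    """Put one instance in store/<md5>/ and hard-link every original to it."""
    for (_size, digest), paths in groups.items():
        target_dir = os.path.join(store, digest)
        os.makedirs(target_dir, exist_ok=True)
        master = os.path.join(target_dir, os.path.basename(paths[0]))
        if not os.path.exists(master):
            os.link(paths[0], master)  # first copy becomes the stored instance
        for path in paths:
            if os.path.samefile(path, master):
                continue  # already a link to the stored instance
            tmp = path + ".dedup-tmp"  # hypothetical temporary name
            os.link(master, tmp)       # create the new link first
            os.replace(tmp, path)      # then swap it over the duplicate


def remove_orphans(store):
    """Delete store files that no other name links to (link count of 1)."""
    removed = []
    for dirpath, _dirs, names in os.walk(store, topdown=False):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_nlink == 1:
                os.remove(path)
                removed.append(path)
        if dirpath != store and not os.listdir(dirpath):
            os.rmdir(dirpath)  # drop the now-empty hash folder
    return removed
```

Usage would be something like link_duplicates(find_duplicates([r"D:\notebook01", r"D:\notebook02"]), r"D:\store"), where D:\store is an invented folder name, followed by an occasional remove_orphans(r"D:\store"). Bear in mind that hard-linked files share their contents, so editing one instance changes all of them, and that matching size plus MD5 is treated as 'identical' here without a byte-by-byte comparison.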

Obviously there exist filesystems such as Sun Microsystems' ZFS which
can take care of this stuff for you at the block level, and indeed you
can run NTFS over the top of ZFS, but the point here is to do it
cheaply.

Any advice appreciated.

Denys Williams
 