bananix
Here's a problem I have
I have two big data files, around 2 GB each, that differ in size, but
large parts of them contain identical data blocks scrambled into
different offsets of the file. The program should be able to analyze
the differences and also identify identical data blocks in different
parts of the files, so it could generate a really small patch file
that can be used to make one file identical to the other. The patch
could be an executable, or a file that must be applied with a patch
program. A Java version would be nice so it would work on many
platforms.
Perhaps this could be implemented by computing checksums of fixed-size
blocks of both files and then comparing them to find identical data
blocks.
Does anyone have insight into this, know of projects that might be
relevant, or know where I could post about this to get attention, so
that perhaps someone could make a program like that?