Compare two files

Morten Snedker · Dec 16, 2005

Hi folks,

Every 14 days a production file is read into a SQL-server. Each line
must be handled individually and managed by code.

The last file contains approx. 610,000 lines and gains +40,000 every
14 days.

Each line in the file is unique. Unfortunately, it is not possible for
the lines to have the same order from file to file, so I can't have
any marker.

Today I read the first ten characters of each line, since these make
up an ID telling me if this record already exist in my
SQL-server-table. But as you can imagine looking up +600,000 ID's
takes a tremendous amount of time, even though indexed.

So, what I was wondering: is it possible for me to compare the two
files in some fancy way? Any line in NewFile not in OldFile, write
these to a third file..?

Or perhaps something entirely different?

Regards /Snedker

tomb · Dec 16, 2005

Morten said:
Hi folks,

Every 14 days a production file is read into a SQL-server. Each line
must be handled individually and managed by code.

The last file contains approx. 610,000 lines and gains +40,000 every
14 days.

Each line in the file is unique. Unfortunately, it is not possible for
the lines to have the same order from file to file, so I can't have
any marker.

Today I read the first ten characters of each line, since these make
up an ID telling me if this record already exist in my
SQL-server-table. But as you can imagine looking up +600,000 ID's
takes a tremendous amount of time, even though indexed.

So, what I was wondering: is it possible for me to compare the two
files in some fancy way? Any line in NewFile not in OldFile, write
these to a third file..?

Or perhaps something entirely different?

Regards /Snedker

I recommend importing the text file into a temp table, then run a sql
compare statement between the two tables, do the work that needs to be
done, then empty the temp table. Let SQL do the work instead of line by
line code.

Tom

Lucky · Dec 16, 2005

hi morten,
for finding new row, ultimately you have to traverse through all the
rows. this can not be avoided. the solution "tomb" provided is good
enough to save you from writing the comparision logic on the front
side. but ti will take time to upload all the rows into the database.
if you can compromise on time than there is no issue. and even i would
prefer that way. now the problem of uploading lines into the database
is little bit tuff call as uploading these many lines is bit costly
affair. anyways do let me know if you have any other issue.

Cor Ligthert [MVP] · Dec 16, 2005

Morten,

There is no instruction Every New Line in Old file.

And if it was there, it would probably not go faster than that you write it
yourself, you can use all the same Net instructions as are used after the
cover.

You have in my opinion two options and both are using the DefaultView of
your tables, from which you have set the Sort property.

In my opinion is the most simple just to go through the newest table and use
the dataview.find. If you don't find it (-1), than it is new.

http://msdn.microsoft.com/library/d...tml/frlrfsystemdatadataviewclassfindtopic.asp
(be aware that there are two methods, one for with one key and another with
more keys)

An other option is going sequentially through both tables in the classic
matching way.

However I would first try the find, mostly these operations with less than
millions of rows takes much less than a second.

An alternative if you are able to that, is of course setting a
timestampcolumn in your SQL database, however I have assumed that this is
not possible..

I hope this helps,

Cor

Morten Snedker · Dec 16, 2005

On 15 Dec 2005 21:50:36 -0800, "Lucky" <[email protected]>
wrote:

First of all thanks to all of you for your time and input!

When you say "comparison logic", do you mean comparing the current
import with the previous import? Through joins?

If so, I fear the performance with a maximum length of a line at 1200
characters...

Or do you mean something else with "comparison logic" ?

Regards /Snedker

Lucky · Dec 17, 2005

well, i mean comparision value by value. than u'll come to know which
one is new

Excel Dragging countif formula with changing criteria	3	Oct 4, 2017
Find and Replace in Binary File	9	Feb 24, 2008
Compare two workbooks and hi-light matches	0	Apr 19, 2013
Comparing Two DataTables	2	May 24, 2006
Compare two columns	2	Apr 25, 2007
Importing Text Files Into Access	1	May 2, 2018
To read only newly added lines in a file	0	Aug 23, 2012
extract and compare	1	Dec 2, 2008

Compare two files

Morten Snedker

tomb

Lucky

Cor Ligthert [MVP]

Morten Snedker

Lucky

Ask a Question

Similar Threads