Logging Incremental Changes

M

Mac

I have several tables of thousands of records each. The data is
imported daily from excel sheets or flat files that I receive from the
data originator. Because the data set is so large, I can only keep the
full set of the current data. But it's become increasingly important
to find a way to log any changes that are made to the data from day to
day.

The best advice I've received so far is to try the DISTINCTROW
predicate in SQL. That would be great if we weren't talking about so
many records. Not only would I exceed the limits of Access in short
order, I would also be storing a lot of redundant data.

What I really would like to end up with is a log table for each main
table that is almost an exact copy. The difference would be that the
fields in the log table would only be populated if they are different
from the previous day's data.

For example, given an initial record for a particular employee in the
main table like the following:

tbl_emp
date_imported emp_id emp_loca emp_health emp_rating
09OCT2010 FB6K2 Georgia Green Green

After several weeks of imports that replace the original, the record
in the main table will look like the following:

tbl_emp
date_imported emp_id emp_loca emp_health emp_rating
21OCT2010 FB6K2 Florida Yellow Red

I want to end up with a change log that looks like the following:

tbl_emp_chg_log
date_imported emp_id emp_loca emp_health emp_rating
09OCT2010 FB6K2 Florida
11OCT2010 FB6K2 Yellow
15OCT2010 FB6K2 Red
21OCT2010 FB6K2 Yellow

If I were editing the data myself, I would just create a change
transaction log as there are a variety of strategies on the subject.
But I need a way to compare field for field changes during the import,
and then log only the fields that differ from the existing data,
thousand of records each day.

I have a couple of ideas to approach it, but I'm pretty sure there's a
more efficient way than FOR - EACHing through each field of each
record. It would not only take a long time, but I've already run into
problems writing code that tries to step through every field of every
record in such a large data set.

Even if someone that recognizes this as a known problem set and could
recommend search terms that would help lead me to the right path, I
would be so thankful. My searches and brainstorming are coming up
empty.
 
F

Frank H

Based upon this info and an assumption, here is an approach:
Assumption: in any one given Excel sheet, the date column is the same for
every record, and any given emp_id is unique.

Create a query to isolate the records that have changed, then append them to
the "main table".

Create a query that has the excel table on the left and the main table on
the right.
Between the tables, put a left outer join on all fields except the date.
(But do include the Excel date in the displayed fields).

This will cause the query to match up all records that are identical (except
for the date). And, it will still list all of the records that don't match
(i.e., have been changed).

Isolate the changed records by entering a criteria "null" (Access will
change it to is null when you hit enter) on any field on the RIGHT side. You
want to select only the records where the right side (the main table) does
not have a matching record.

Then create an append query that inserts these records into your main table.

I'm thinking that will do it.
 
T

Tom Wickerath

Hi Frank,

If Mac uses your suggestion, he may want to compare with only the last entry
in the main table. Otherwise, if you had a situation like this, where the
statuses were identical on two different days (11 OCT and 25 OCT), it seems
to me that you would miss the latest revision:

tbl_emp_chg_log
date_imported emp_id emp_loca emp_health emp_rating
09OCT2010 FB6K2 Florida
11OCT2010 FB6K2 Yellow
15OCT2010 FB6K2 Red
21OCT2010 FB6K2 Yellow
25OCT2010 FB6K2 Yellow


Tom Wickerath
Microsoft Access MVP
http://www.accessmvp.com/TWickerath/
__________________________________________
 
M

Macguhlew

Based upon this info and an assumption, here is an approach:
Assumption: in any one given Excel sheet, the date column is the same for
every record, and any given emp_id is unique.

Create a query to isolate the records that have changed, then append themto
the "main table".

Thanks for the advice, but that's not quite what I'm trying to do. As
far as the main table goes, it's a straight forward process that goes
something like this:
Import (TransferExcel OR TransferText) New Records into Temp Table
Run Cleanup Routines on Temp Table
Delete * From Main Table
Move New Records From Temp to Main

What I need to do is insert some routine in between steps 2 and 3 to
analyze and log only the changes between yesterday's data and today's.
The best method I can figure so far would go something like this:

Open Recordset1 (Select * From Main)
With That Recordset1
For Each Record
Open Recordset2 (Select * From temp_Main Where
temp_Main.KeyField = Main.KeyField)
For Each Field in Recordset1
If Recordset1.Field <> Recordset2.Field Then
Add Recordset1.Field.Value to SQLInsertStatement
Else
Add EmptyString to SQLInsertStatement
EndIf
Next Field
Close Recordset2
Run SQLInsertStatement on Main_Log Table
Next Record
End With
Close Recordset1

Is that a better explanation? I'm not trying to identify which records
have changed and add them to main. That takes longer than just dumping
the current main and inserting the new. I'm trying to log any changes
to a record, field-for-field, only on fields that have changed. But at
the end of the day, the Main table is a carbon copy of the source data
I received.

I've also figured out a way to accomplish with SQL, but that is even
more clunky and would require rewriting the SQL for every table I need
to do it for. Not to mention it would be huge and hard to maintain/
troubleshoot.

The method I described above will work, but I'm pretty sure that
there's a simpler SQL method or VBA algorithm that works more
efficiently than what I described above. I just don't know what that
is.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top