Comparing columns in two otherwise identical Tables

P

PatK

I have created the following SQL query to attempt to find the differences
between two identically structured tables. What I do each day is take a
"snapshot" of today's Table. Tomorrow, I will then compare the now
"yesterday" file to today's table.

As a test, I created two identical tables, from real data, and just changed
a couple of the fields. This query is finding those just fine. However, it
is returning MANY other rows. I have come to the determination that that due
to the fact that many of the rows have NULL values in some fields, these is
causing them to be considered unmatched, ie, table1.field1(null) <>
table2.field1(null).

What is the best way to ensure that NULL fields are not kicking out of this
query? Thanks for any pointers!

PatK

SELECT [T003-Snapshot Table].*
FROM [T003-Snapshot Table] LEFT JOIN [Q010-Program-Tracker]
ON [T003-Snapshot Table].[UniqueKey]=[Q010-Program-Tracker].[UniqueKey]
AND [T003-Snapshot Table].[L2]=[Q010-Program-Tracker].[L2]
AND [T003-Snapshot Table].[L3]=[Q010-Program-Tracker].[L3]
AND [T003-Snapshot Table].[L4]=[Q010-Program-Tracker].[L4]
AND [T003-Snapshot
Table].[PortfolioID]=[Q010-Program-Tracker].[PortfolioID]
AND [T003-Snapshot
Table].[Application]=[Q010-Program-Tracker].[Application]
AND [T003-Snapshot Table].[PlanDate]=[Q010-Program-Tracker].[PlanDate]
AND [T003-Snapshot Table].[CompDate]=[Q010-Program-Tracker].[CompDate]
AND [T003-Snapshot Table].[Mandated?]=[Q010-Program-Tracker].[Mandated?]
AND [T003-Snapshot Table].[Completed?]=[Q010-Program-Tracker].[Completed?]
AND [T003-Snapshot
Table].[EligibleCount]=[Q010-Program-Tracker].[EligibleCount]
AND [T003-Snapshot
Table].[ScorecardCount]=[Q010-Program-Tracker].[ScorecardCount]
AND [T003-Snapshot
Table].[CompleteCount]=[Q010-Program-Tracker].[CompleteCount]

WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;
 
J

Joseph Atie

im not expert but if i wanted to exclude null values from your query id give
this a go:

<> table2.field1 and is not null.

table1.field1 <> table2.field1 and table.field1
 
J

John Spencer

You could try using a non-equi join that would look like

SELECT [T003-Snapshot Table].*
FROM [T003-Snapshot Table] LEFT JOIN [Q010-Program-Tracker]
ON [T003-Snapshot Table].[UniqueKey]=[Q010-Program-Tracker].[UniqueKey]
AND ([T003-Snapshot Table].[L2]=[Q010-Program-Tracker].[L2]
Or ([T003-Snapshot Table].[L2] Is NULL AND
[Q010-Program-Tracker].[L2])) Is Null
WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;

Or try forcing a a known value (of the proper type) if the field is null.
That means you would have to be sure to force a value that did not exist in
the tables. Perhaps a zero-length string, a date in 1899, and a negative
number smaller than any one in the table(s).

SELECT .*
FROM [T003-Snapshot Table] AS S LEFT JOIN [Q010-Program-Tracker] AS P
ON .[UniqueKey=[P].[UniqueKey]
AND IIF(.[L2] Is Null,"",.[L2])= IIF([P].[L2] Is Null,"",P.L2)

WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;



John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
I have created the following SQL query to attempt to find the differences
between two identically structured tables. What I do each day is take a
"snapshot" of today's Table. Tomorrow, I will then compare the now
"yesterday" file to today's table.

As a test, I created two identical tables, from real data, and just changed
a couple of the fields. This query is finding those just fine. However, it
is returning MANY other rows. I have come to the determination that that due
to the fact that many of the rows have NULL values in some fields, these is
causing them to be considered unmatched, ie, table1.field1(null) <>
table2.field1(null).

What is the best way to ensure that NULL fields are not kicking out of this
query? Thanks for any pointers!

PatK

SELECT [T003-Snapshot Table].*
FROM [T003-Snapshot Table] LEFT JOIN [Q010-Program-Tracker]
ON [T003-Snapshot Table].[UniqueKey]=[Q010-Program-Tracker].[UniqueKey]
AND [T003-Snapshot Table].[L2]=[Q010-Program-Tracker].[L2]
AND [T003-Snapshot Table].[L3]=[Q010-Program-Tracker].[L3]
AND [T003-Snapshot Table].[L4]=[Q010-Program-Tracker].[L4]
AND [T003-Snapshot
Table].[PortfolioID]=[Q010-Program-Tracker].[PortfolioID]
AND [T003-Snapshot
Table].[Application]=[Q010-Program-Tracker].[Application]
AND [T003-Snapshot Table].[PlanDate]=[Q010-Program-Tracker].[PlanDate]
AND [T003-Snapshot Table].[CompDate]=[Q010-Program-Tracker].[CompDate]
AND [T003-Snapshot Table].[Mandated?]=[Q010-Program-Tracker].[Mandated?]
AND [T003-Snapshot Table].[Completed?]=[Q010-Program-Tracker].[Completed?]
AND [T003-Snapshot
Table].[EligibleCount]=[Q010-Program-Tracker].[EligibleCount]
AND [T003-Snapshot
Table].[ScorecardCount]=[Q010-Program-Tracker].[ScorecardCount]
AND [T003-Snapshot
Table].[CompleteCount]=[Q010-Program-Tracker].[CompleteCount]

WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;
 
P

PatK

Well...I made a run at it, John. Definitely considered "plugging" values
into some of the fields, to eliminate the nulls, but, for example, on Date
fields, what do you plug (and folks make decisions based upon this file).
So, while I was tempted, that approach would have a different set of problems.

Here is where I ended up, querywise. I get strange results, and definitely
not what I am expecting. Ie, my test data where I changed 3 rows in the
snapshot table, are not showing up. Not sure why. DOes it look correctly
structured?

SELECT "Field Changed" AS Reason, [T003-Snapshot Table].*
FROM [T003-Snapshot Table] LEFT JOIN [Q010-Program-Tracker] ON
([T003-Snapshot Table].[UniqueKey]=[Q010-Program-Tracker].[UniqueKey])

AND ([T003-Snapshot Table].[L3]=[Q010-Program-Tracker].[L3])
Or ([T003-Snapshot Table].[L3] is Null AND [Q010-Program-Tracker].[L3]
is NULL)

AND ([T003-Snapshot Table].[L4]=[Q010-Program-Tracker].[L4])
OR ([T003-Snapshot Table].[L4] is NULL AND [Q010-Program-Tracker].[L4]
is NULL)

AND ([T003-Snapshot
Table].[PortfolioID]=[Q010-Program-Tracker].[PortfolioID])
OR ([T003-Snapshot Table].[PortfolioID] is NULL AND
[Q010-Program-Tracker].[PortfolioID] is NULL)

AND ([T003-Snapshot
Table].[Application]=[Q010-Program-Tracker].[Application])
Or ([T003-Snapshot Table].[Application] is NULL AND
[Q010-Program-Tracker].[Application] is NULL)

AND ([T003-Snapshot Table].[PlanDate]=[Q010-Program-Tracker].[PlanDate])
OR ([T003-Snapshot Table].[PlanDate] IS NULL AND
[Q010-Program-Tracker].[PlanDate] is NULL)

AND ([T003-Snapshot Table].[CompDate]=[Q010-Program-Tracker].[CompDate])
OR ([T003-Snapshot Table].[CompDate] IS NULL AND
[Q010-Program-Tracker].[CompDate] IS NULL)

AND ([T003-Snapshot Table].[Mandated?]=[Q010-Program-Tracker].[Mandated?])
OR ([T003-Snapshot Table].[Mandated?] is NULL AND
[Q010-Program-Tracker].[Mandated?] IS NULL)

AND ([T003-Snapshot Table].[Completed?]=[Q010-Program-Tracker].[Completed?])
OR ([T003-Snapshot Table].[Completed?] is NULL AND
[Q010-Program-Tracker].[Completed?] IS NULL)

AND ([T003-Snapshot
Table].[EligibleCount]=[Q010-Program-Tracker].[EligibleCount])
OR ([T003-Snapshot Table].[EligibleCount] IS NULL AND
[Q010-Program-Tracker].[EligibleCount] IS NULL)

AND ([T003-Snapshot
Table].[ScorecardCount]=[Q010-Program-Tracker].[ScorecardCount])
OR ([T003-Snapshot Table].[ScorecardCount] IS NULL AND
[Q010-Program-Tracker].[ScorecardCount] IS NULL)

AND ([T003-Snapshot
Table].[CompleteCount]=[Q010-Program-Tracker].[CompleteCount])
OR ([T003-Snapshot Table].[CompleteCount] IS NULL AND
[Q010-Program-Tracker].[CompleteCount] IS NULL)
WHERE [Q010-Program-Tracker].UniqueKey Is Null;


John Spencer said:
You could try using a non-equi join that would look like

SELECT [T003-Snapshot Table].*
FROM [T003-Snapshot Table] LEFT JOIN [Q010-Program-Tracker]
ON [T003-Snapshot Table].[UniqueKey]=[Q010-Program-Tracker].[UniqueKey]
AND ([T003-Snapshot Table].[L2]=[Q010-Program-Tracker].[L2]
Or ([T003-Snapshot Table].[L2] Is NULL AND
[Q010-Program-Tracker].[L2])) Is Null
WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;

Or try forcing a a known value (of the proper type) if the field is null.
That means you would have to be sure to force a value that did not exist in
the tables. Perhaps a zero-length string, a date in 1899, and a negative
number smaller than any one in the table(s).

SELECT .*
FROM [T003-Snapshot Table] AS S LEFT JOIN [Q010-Program-Tracker] AS P
ON .[UniqueKey=[P].[UniqueKey]
AND IIF(.[L2] Is Null,"",.[L2])= IIF([P].[L2] Is Null,"",P.L2)

WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;



John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
J

John Spencer

Well if the DATE field is always after Dec 30, 1899, you could plug that in.
If it was a DOB you could plug in a date 200 years ago for nulls.

Perhaps you are missing some parentheses to make sure things are working. I
would have parentheses around each set of fields like so:

AND (([T003-Snapshot Table].[L3]=[Q010-Program-Tracker].[L3])
Or ([T003-Snapshot Table].[L3] is Null AND [Q010-Program-Tracker].[L3]
is NULL))

I would try just joining on Unique Key and L3 field to see if things work as
expected. Then once I get things working, I would add one more field.

You could just join on unique key and then use a complex where clause.

SELECT "Field Changed" AS Reason, .*
FROM [T003-Snapshot Table] as S LEFT JOIN [Q010-Program-Tracker] as P ON
(.[UniqueKey]=[P].[UniqueKey])
WHERE S.L3 <> P.L3
or (S.L3 is Null and P.L3 is Not Null)
Or (S.L3 is Not Null and P.L3 Is Null)
Or S.L4 <> P.L4
or (S.L4 is Null and P.L4 is Not Null)
Or (S.L4 is Not Null and P.L4 Is Null)
OR P.UniqueKey is Null
Or S.PlanDate<> P.PlanDate
or (S.PlanDate is Null and P.PlanDate is Not Null)
Or (S.PlanDate is Not Null and P.PlanDate Is Null)
....

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
Well...I made a run at it, John. Definitely considered "plugging" values
into some of the fields, to eliminate the nulls, but, for example, on Date
fields, what do you plug (and folks make decisions based upon this file).
So, while I was tempted, that approach would have a different set of problems.

Here is where I ended up, querywise. I get strange results, and definitely
not what I am expecting. Ie, my test data where I changed 3 rows in the
snapshot table, are not showing up. Not sure why. DOes it look correctly
structured?

SELECT "Field Changed" AS Reason, [T003-Snapshot Table].*
FROM [T003-Snapshot Table] LEFT JOIN [Q010-Program-Tracker] ON
([T003-Snapshot Table].[UniqueKey]=[Q010-Program-Tracker].[UniqueKey])

AND ([T003-Snapshot Table].[L3]=[Q010-Program-Tracker].[L3])
Or ([T003-Snapshot Table].[L3] is Null AND [Q010-Program-Tracker].[L3]
is NULL)

AND ([T003-Snapshot Table].[L4]=[Q010-Program-Tracker].[L4])
OR ([T003-Snapshot Table].[L4] is NULL AND [Q010-Program-Tracker].[L4]
is NULL)

AND ([T003-Snapshot
Table].[PortfolioID]=[Q010-Program-Tracker].[PortfolioID])
OR ([T003-Snapshot Table].[PortfolioID] is NULL AND
[Q010-Program-Tracker].[PortfolioID] is NULL)

AND ([T003-Snapshot
Table].[Application]=[Q010-Program-Tracker].[Application])
Or ([T003-Snapshot Table].[Application] is NULL AND
[Q010-Program-Tracker].[Application] is NULL)

AND ([T003-Snapshot Table].[PlanDate]=[Q010-Program-Tracker].[PlanDate])
OR ([T003-Snapshot Table].[PlanDate] IS NULL AND
[Q010-Program-Tracker].[PlanDate] is NULL)

AND ([T003-Snapshot Table].[CompDate]=[Q010-Program-Tracker].[CompDate])
OR ([T003-Snapshot Table].[CompDate] IS NULL AND
[Q010-Program-Tracker].[CompDate] IS NULL)

AND ([T003-Snapshot Table].[Mandated?]=[Q010-Program-Tracker].[Mandated?])
OR ([T003-Snapshot Table].[Mandated?] is NULL AND
[Q010-Program-Tracker].[Mandated?] IS NULL)

AND ([T003-Snapshot Table].[Completed?]=[Q010-Program-Tracker].[Completed?])
OR ([T003-Snapshot Table].[Completed?] is NULL AND
[Q010-Program-Tracker].[Completed?] IS NULL)

AND ([T003-Snapshot
Table].[EligibleCount]=[Q010-Program-Tracker].[EligibleCount])
OR ([T003-Snapshot Table].[EligibleCount] IS NULL AND
[Q010-Program-Tracker].[EligibleCount] IS NULL)

AND ([T003-Snapshot
Table].[ScorecardCount]=[Q010-Program-Tracker].[ScorecardCount])
OR ([T003-Snapshot Table].[ScorecardCount] IS NULL AND
[Q010-Program-Tracker].[ScorecardCount] IS NULL)

AND ([T003-Snapshot
Table].[CompleteCount]=[Q010-Program-Tracker].[CompleteCount])
OR ([T003-Snapshot Table].[CompleteCount] IS NULL AND
[Q010-Program-Tracker].[CompleteCount] IS NULL)
WHERE [Q010-Program-Tracker].UniqueKey Is Null;


John Spencer said:
You could try using a non-equi join that would look like

SELECT [T003-Snapshot Table].*
FROM [T003-Snapshot Table] LEFT JOIN [Q010-Program-Tracker]
ON [T003-Snapshot Table].[UniqueKey]=[Q010-Program-Tracker].[UniqueKey]
AND ([T003-Snapshot Table].[L2]=[Q010-Program-Tracker].[L2]
Or ([T003-Snapshot Table].[L2] Is NULL AND
[Q010-Program-Tracker].[L2])) Is Null
WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;

Or try forcing a a known value (of the proper type) if the field is null.
That means you would have to be sure to force a value that did not exist in
the tables. Perhaps a zero-length string, a date in 1899, and a negative
number smaller than any one in the table(s).

SELECT .*
FROM [T003-Snapshot Table] AS S LEFT JOIN [Q010-Program-Tracker] AS P
ON .[UniqueKey=[P].[UniqueKey]
AND IIF(.[L2] Is Null,"",.[L2])= IIF([P].[L2] Is Null,"",P.L2)

WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;



John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County

 
P

PatK

That did the trick..musta been those darned parentheses. Thanks John!

Patk

John Spencer said:
Well if the DATE field is always after Dec 30, 1899, you could plug that in.
If it was a DOB you could plug in a date 200 years ago for nulls.

Perhaps you are missing some parentheses to make sure things are working. I
would have parentheses around each set of fields like so:

AND (([T003-Snapshot Table].[L3]=[Q010-Program-Tracker].[L3])
Or ([T003-Snapshot Table].[L3] is Null AND [Q010-Program-Tracker].[L3]
is NULL))

I would try just joining on Unique Key and L3 field to see if things work as
expected. Then once I get things working, I would add one more field.

You could just join on unique key and then use a complex where clause.

SELECT "Field Changed" AS Reason, .*
FROM [T003-Snapshot Table] as S LEFT JOIN [Q010-Program-Tracker] as P ON
(.[UniqueKey]=[P].[UniqueKey])
WHERE S.L3 <> P.L3
or (S.L3 is Null and P.L3 is Not Null)
Or (S.L3 is Not Null and P.L3 Is Null)
Or S.L4 <> P.L4
or (S.L4 is Null and P.L4 is Not Null)
Or (S.L4 is Not Null and P.L4 Is Null)
OR P.UniqueKey is Null
Or S.PlanDate<> P.PlanDate
or (S.PlanDate is Null and P.PlanDate is Not Null)
Or (S.PlanDate is Not Null and P.PlanDate Is Null)
....

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
Well...I made a run at it, John. Definitely considered "plugging" values
into some of the fields, to eliminate the nulls, but, for example, on Date
fields, what do you plug (and folks make decisions based upon this file).
So, while I was tempted, that approach would have a different set of problems.

Here is where I ended up, querywise. I get strange results, and definitely
not what I am expecting. Ie, my test data where I changed 3 rows in the
snapshot table, are not showing up. Not sure why. DOes it look correctly
structured?

SELECT "Field Changed" AS Reason, [T003-Snapshot Table].*
FROM [T003-Snapshot Table] LEFT JOIN [Q010-Program-Tracker] ON
([T003-Snapshot Table].[UniqueKey]=[Q010-Program-Tracker].[UniqueKey])

AND ([T003-Snapshot Table].[L3]=[Q010-Program-Tracker].[L3])
Or ([T003-Snapshot Table].[L3] is Null AND [Q010-Program-Tracker].[L3]
is NULL)

AND ([T003-Snapshot Table].[L4]=[Q010-Program-Tracker].[L4])
OR ([T003-Snapshot Table].[L4] is NULL AND [Q010-Program-Tracker].[L4]
is NULL)

AND ([T003-Snapshot
Table].[PortfolioID]=[Q010-Program-Tracker].[PortfolioID])
OR ([T003-Snapshot Table].[PortfolioID] is NULL AND
[Q010-Program-Tracker].[PortfolioID] is NULL)

AND ([T003-Snapshot
Table].[Application]=[Q010-Program-Tracker].[Application])
Or ([T003-Snapshot Table].[Application] is NULL AND
[Q010-Program-Tracker].[Application] is NULL)

AND ([T003-Snapshot Table].[PlanDate]=[Q010-Program-Tracker].[PlanDate])
OR ([T003-Snapshot Table].[PlanDate] IS NULL AND
[Q010-Program-Tracker].[PlanDate] is NULL)

AND ([T003-Snapshot Table].[CompDate]=[Q010-Program-Tracker].[CompDate])
OR ([T003-Snapshot Table].[CompDate] IS NULL AND
[Q010-Program-Tracker].[CompDate] IS NULL)

AND ([T003-Snapshot Table].[Mandated?]=[Q010-Program-Tracker].[Mandated?])
OR ([T003-Snapshot Table].[Mandated?] is NULL AND
[Q010-Program-Tracker].[Mandated?] IS NULL)

AND ([T003-Snapshot Table].[Completed?]=[Q010-Program-Tracker].[Completed?])
OR ([T003-Snapshot Table].[Completed?] is NULL AND
[Q010-Program-Tracker].[Completed?] IS NULL)

AND ([T003-Snapshot
Table].[EligibleCount]=[Q010-Program-Tracker].[EligibleCount])
OR ([T003-Snapshot Table].[EligibleCount] IS NULL AND
[Q010-Program-Tracker].[EligibleCount] IS NULL)

AND ([T003-Snapshot
Table].[ScorecardCount]=[Q010-Program-Tracker].[ScorecardCount])
OR ([T003-Snapshot Table].[ScorecardCount] IS NULL AND
[Q010-Program-Tracker].[ScorecardCount] IS NULL)

AND ([T003-Snapshot
Table].[CompleteCount]=[Q010-Program-Tracker].[CompleteCount])
OR ([T003-Snapshot Table].[CompleteCount] IS NULL AND
[Q010-Program-Tracker].[CompleteCount] IS NULL)
WHERE [Q010-Program-Tracker].UniqueKey Is Null;


John Spencer said:
You could try using a non-equi join that would look like

SELECT [T003-Snapshot Table].*
FROM [T003-Snapshot Table] LEFT JOIN [Q010-Program-Tracker]
ON [T003-Snapshot Table].[UniqueKey]=[Q010-Program-Tracker].[UniqueKey]
AND ([T003-Snapshot Table].[L2]=[Q010-Program-Tracker].[L2]
Or ([T003-Snapshot Table].[L2] Is NULL AND
[Q010-Program-Tracker].[L2])) Is Null
WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;

Or try forcing a a known value (of the proper type) if the field is null.
That means you would have to be sure to force a value that did not exist in
the tables. Perhaps a zero-length string, a date in 1899, and a negative
number smaller than any one in the table(s).

SELECT .*
FROM [T003-Snapshot Table] AS S LEFT JOIN [Q010-Program-Tracker] AS P
ON .[UniqueKey=[P].[UniqueKey]
AND IIF(.[L2] Is Null,"",.[L2])= IIF([P].[L2] Is Null,"",P.L2)

WHERE
[Q010-Program-Tracker].[UniqueKey] IS NULL;



John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County


.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top