Best way to unduplicate records

R

Rocky

I'd love some advise on what direction to go with this...
Two tables from two different systems. Basically customer data.
All fields are named different but contain 80% similar data such as
tblA - Name, Zip, Race etc.
tbl-B - CustName, CustZip, CustRace

There should be some duplicate customers in B but there is indeed unique
customers as well.

I need to end up with one list of all uniqe customers. There is no common
CustomerID so we'll have to be happy with Lname,Fname comparison.

Any recommendations would be helpful. I can think of multiple ways to
approach this but wanted to hear some expert opinions. Thanks!
 
J

John W. Vinson

I'd love some advise on what direction to go with this...
Two tables from two different systems. Basically customer data.
All fields are named different but contain 80% similar data such as
tblA - Name, Zip, Race etc.
tbl-B - CustName, CustZip, CustRace

There should be some duplicate customers in B but there is indeed unique
customers as well.

I need to end up with one list of all uniqe customers. There is no common
CustomerID so we'll have to be happy with Lname,Fname comparison.

Any recommendations would be helpful. I can think of multiple ways to
approach this but wanted to hear some expert opinions. Thanks!

A UNION query will do this, and will discard *exact* duplicates in the
process:

SELECT tblA.[Name], tblA.Zip, tblA.Zip
FROM tblA
UNION
SELECT tblB.CustName, tblB.CustZip, tblB.CustRace
FROM tblB

"Close only counts in horseshoes" though - if tblA contains "Bill Jones" and
tblB contains "William Jones" or even "Bill Jones" (with two blanks not one)
it will not be seen as a duplicate.

You can base an Append query on this UNION query to populate a new table,
which will require some cleaning up.
 
J

John Spencer

You could try a union query

SELECT Name
FROM tblA
UNION
SELECT CustName
FROM TblB

That should give you a list of unique name values from the two tables.

If you want to add more fields in the UNION query you can but you will get
unique rows. So if you add the Zip code fields to the above and John Spencer
in Table A has a ZIP of 22121 and John Spencer in Table B has a Zip of 22222,
you will get two rows returned for John Spencer.

John Spencer
Access MVP 2002-2005, 2007-2010
The Hilltop Institute
University of Maryland Baltimore County
 
D

Daryl S

Rocky -

First back up your database before you do anything else.

Then choose which table you would like to keep ALL records from. Set the
Name/CustName field to be the primary key (since that is what you will be
comparing against). If you have separate first and last names, that would be
better - use both in the primary key. Then append the records from the
other table into the one with the primary key. Any records with the same
Name (or Last Name and First Name) will fail to insert, but any records with
unique names will be inserted.

The data still warrants review and cleanup. You can run queries against the
two tables to find other records. Maybe you have two Jane Doe records with
different zip codes? In order to add both, you might need to change the
primary key to name and zip code. It all depends on your data...
 
R

Rocky

I like the idea of this UNION idea but indeed if I use * then it appears that
EVERY SINGLE field is compared, rather than simply the name.

Is there a way to compare the name ONLY and still combine all the data? (I'm
not sure I'm saying that properly.)

Thanks if you have anymore insight.
 
J

John Spencer

Step 1

Well if you don't care which record is returned out of the group of matches

SELECT *
FROM TableA
UNION ALL
SELECT *
FROM TableB

Save that and then use an aggregate (TOTALS) query to return one record per name.

SELECT Name, First(Zip), First(Race), First(City)
FROM SavedUnionQuery
GROUP BY Name

YOU will HAVE TO LIST THE FIELDS in the Group by query and IF the structures
of tableA and TableB are not parallel, you will have to list the fields in the
Union query.

John Spencer
Access MVP 2002-2005, 2007-2010
The Hilltop Institute
University of Maryland Baltimore County
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top