Duplicate Address Deletion

S

Steve Gibbs

I have a table of approximatly 60,000 records. The records were created from
product registrations. I need to create a mailing list, but the same person
may have registered more than one product, and more than one person in a
household may have registered a product. I only want to send one letter to
each address. I made a query with only addresses and then grouped them by
address, city, state. then I joined the table with my grouped query. I
ended up with more records than I started with. The grouped query looks
good. I am green at this, and must be doing the join incorrectly. Can/will
anyone walk me through this, please?
 
R

Roger Carlson

How can anyone tell you what you've done wrong if you don't tell us what
you've done? Can you give specifics: Table names, Field names, SQL of the
Join query, etc?
 
P

PvdG42

Steve Gibbs said:
I have a table of approximatly 60,000 records. The records were created
from
product registrations. I need to create a mailing list, but the same
person
may have registered more than one product, and more than one person in a
household may have registered a product. I only want to send one letter
to
each address. I made a query with only addresses and then grouped them by
address, city, state. then I joined the table with my grouped query. I
ended up with more records than I started with. The grouped query looks
good. I am green at this, and must be doing the join incorrectly.
Can/will
anyone walk me through this, please?

Look into SELECT DISTINCT. Here's a starting point. You can find many other
articles on MSDN.

http://msdn.microsoft.com/en-us/library/bb208930.aspx
 
M

Michael J. Strickland

Steve Gibbs said:
I have a table of approximatly 60,000 records. The records were
created from
product registrations. I need to create a mailing list, but the same
person
may have registered more than one product, and more than one person in
a
household may have registered a product. I only want to send one
letter to
each address. I made a query with only addresses and then grouped
them by
address, city, state. then I joined the table with my grouped query.
I
ended up with more records than I started with. The grouped query
looks
good. I am green at this, and must be doing the join incorrectly.
Can/will
anyone walk me through this, please?


Assume your original table is named ABC and the fields you want to use
to determine duplicates are LAST, ADDRESS, ZIP. I find these work best.

1. Copy your original table (ABC), (using the structure only option) to
a new table to create the empty table ABC_New.

2. Open ABC_New in table design mode. Select the 3 fields LAST, ADDRESS,
ZIP (Hold shift and click on each of these). Then click on the primary
key button to make these fields a primary key. Close design mode.

3. Open the original table (ABC) and select all records (click in upper
left corner). Hit Ctl-C to copy the records.

4. Minimize, but don't close this table.

5. Open the new table (ABC_New). Click the upper left corner and paste
the copied records into it (Ctl-V).

6. During the copy operation, you will receive an error message saying
something to the effect that all records could not be pasted. Answer Ok
to this message.
Another message will come up asking if you want to suppress further
error messages. Answer Yes to this message.

Notes:
After this operation the table "ABC_New" will contain the de-duped
records. The table "Paste Errors" will contain the duplicate records.
The original table (ABC) will remain unchanged.

If your table contains more than 64k records you may have to use an
append query to copy the records into the new table.

This is a quick and dirty de-dupe. It will treat records containing a
blank in any of the key fields (LAST, ADDRESS, ZIP) as duplicates since
a primary key cannot contain blanks. Thus it will treat
John Smith
123 Main St
New York, NY 10000

as a duplicate of:
John Smith
123 Main St
Philadelphia, PA
(missing zip)

To avoid this you may include the City & State in your primary key.


--
 
K

Ken Sheridan

I have a table of approximatly 60,000 records.  The records were created from
product registrations.  I need to create a mailing list, but the same person
may have registered more than one product, and more than one person in a
household may have registered a product.  I only want to send one letter to
each address.  I made a query with only addresses and then grouped themby
address, city, state.  then I joined the table with my grouped query.  I
ended up with more records than I started with.  The grouped query looks
good.  I am green at this, and must be doing the join incorrectly.  Can/will
anyone walk me through this, please?

To get a list of addresses only you can return distinct rows with a
query which includes columns for the address data only (address, city
and state in your case by the sounds of it). The query would be
something like this:

SELECT DISTINCT address, city, state
FROM YourTable;

In query design view you achieve the same thing by setting the query's
UniqueValues property to True (Yes in the properties sheet).

This won't include the addressee's names of course. One option would
be to include one name randomly from each address by grouping the
query on the address columns and returning the MAX name (any
aggregation operator will do in fact as its merely to get one name
arbitrarily for each address). If the name is in a single column,
e.g. customer, you can get the MAX value of that column per address
like so:

SELECT MAX(customer) AS Addressee,
address, city, state
FROM YourTable
GROUP BY address, city, state;

If the names are separated into FirstName and LastName columns (as
they should be) then you cannot independenly get the MAX values of
each as you'd end up in many cases with the wrong first and last name
paired. You have to concatenate each name into a single value and get
the MAX value of that:

SELECT MAX(TRIM(FirstName & " " & LastName)) AS Addressee,
address, city, state
FROM YourTable
GROUP BY address, city, state;

If you want to create return a single row which includes everybody at
that address then it gets more complex as you'd need to call a VBA
function which concatenates all the names into a single string,
separating each with something like a semi-colon or a carriage return/
line feed, e.g. using a semi-colon:

Public Function GetAddressees(strAddress As String) As String

Dim dbs As DAO.Database
Dim rst As DAO.Recordset
Dim strSQL As String
Dim strAddressees As String

strSQL = "SELECT DISTINCT TRIM(FirstName & " " & LastName) " & _
"FROM YourTable " & _
"WHERE Address & city & state = """ & strAddress & """"

Set dbs = CurrentDb
Set rst = dbs.OpenRecordset(strSQL)

With rst
Do While Not .EOF
strAddressees = strAddressees & "; " & _
TRIM(.Fields("FirstName") & " " & .Fields("LastName"))
.MoveNext
Loop
.Close
' remove leading semi colon and space
strAddressees = Mid$(strAddressees, 3)
End With

GetAddressees = strAddressees

End Function

If you want to insert a carriage return/line feed after each name
change it like so:

Do While Not .EOF
strAddressees = strAddressees & vbNewLine & _
TRIM(.Fields("FirstName") & " " & .Fields("LastName"))
.MoveNext
Loop

In the query you'd call the function like so:

SELECT DISTINCT
GetAddressees([address] & [city] & [state]) AS Addressees,
address, city, state
FROM YourTable;

Ken Sheridan
Stafford, England
 
S

Steve Gibbs

This solution worked great. I had not done an append Query before and it
took me several tries before I got it to work properly. Thanks!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top