You can actually use the SOUNDEX function in SQL Server to sweep a lot
of inconsistencies like this under the carpet.
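
For what it's worth, here is a rough Python sketch of the classic American Soundex idea, just to show why it helps with pairs like Sara/Sarah from the quoted post. It is not SQL Server's actual implementation (the built-in may differ in edge cases), and the helper names `soundex` and `name_key` are made up for illustration.

```python
# Letter-to-digit table for classic American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a 4-character Soundex code, or "" if name has no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    code = first
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        # Emit a digit only when it differs from the previous coded sound.
        if digit and digit != prev:
            code += digit
        # H and W do not separate repeated sounds; vowels do.
        if ch not in "HW":
            prev = digit
    return (code + "000")[:4]


def name_key(full_name):
    """Hypothetical matching key: the Soundex code of each word in the name."""
    return tuple(soundex(part) for part in full_name.split())


for person in ("Sara Jones", "Sarah Jones", "Bill Roberts", "William Roberts"):
    print(person, name_key(person))
```

Sara Jones and Sarah Jones end up with the same key, so they would surface as candidate duplicates. Bill Roberts and William Roberts do not, and a key like this says nothing about "5th" versus "Fifth", so it only narrows down the list that still has to be checked by hand.
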
On Dec 28, 7:35 pm, John W. Vinson
<jvinson@STOP_SPAM.WysardOfInfo.com> wrote:
> On Sun, 28 Dec 2008 18:06:00 -0800, Steve Gibbs
> <SteveGi...@discussions.microsoft.com> wrote:
> >I have a mailing list with approximately 90,000 records. The records were
> >created from registrations, so I have a registration number and name and
> >address information. Problem is one person may have registered more than
> >once so now is listed with 2 registration numbers. I also would like to send
> >only one mailing to a household and more than one person may be registered.
> >How can I best eliminate the duplicate addresses?
>
> With considerable difficulty, I fear. There are commercial "list cleaning"
> services, and they're expensive for good reason!
>
> Are
>
> Fred Brown, 123 3rd St., Podunk, OH
> Fred Brown, 123 3rd St., Podunk, OH
>
> the same? Nope, they're father and son and both want to be on the list; Fred
> Jr. lives in the separate entrance apartment in back.
>
> How about
>
> Sara Jones, 321 5th St., Anywhere IA
> Sarah Jones, 321 Fifth St., Anywhere City, IA
>
> No computer program will think so.
>
> Similarly, how about Bill Roberts and William Roberts? Same? Different? Hard
> to tell.
>
> You'll need a USB interface - Using Someone's Brain. A "Find Duplicates" query
> will get you started and eliminate the exact dups (including Fred Brown Jr.
> unfortunately), but then you'll need to manually go through lists sorted every
> which way to trim out the dups.
> --
>
>              John W. Vinson [MVP]