finding duplicates (and not from 1st letter)

  • Thread starter greeknl via AccessMonster.com
  • Start date
G

greeknl via AccessMonster.com

I have a duplicates query like this

SELECT Companies.CompaniesID, Companies.CompanyName
FROM Companies
WHERE (((Left([Companies].[CompanyName],5)) In (SELECT left([CompanyName],5)
FROM [Companies] As Tmp GROUP BY left(CompanyName,5) HAVING Count(*)>1 )))
ORDER BY Companies.CompanyName;

works fine. but I would like to know if it is possible to do the same but
only starting from the second, third on nth place inside the name.
This so it will find e.g. the following duplicates, or mistypes which have
been entered again

Catherine
Katherine

Papadopoulos
pepadopoulos

Thank you
 
J

John W. Vinson

works fine. but I would like to know if it is possible to do the same but
only starting from the second, third on nth place inside the name.
This so it will find e.g. the following duplicates, or mistypes which have
been entered again

Catherine
Katherine

Papadopoulos
pepadopoulos

You can use the Mid() function instead of the Left() function, e.g.

SELECT Companies.CompaniesID, Companies.CompanyName
FROM Companies
WHERE (((Mid([Companies].[CompanyName],2,5)) In (SELECT Mid([CompanyName],2,5)
FROM [Companies] As Tmp GROUP BY Mid(CompanyName,2,5) HAVING Count(*)>1 )))
ORDER BY Companies.CompanyName;

This will match Microsoft to Picrosulfur (matching the five characters
starting with the second).

Matching "almost" matches, with one letter misspelled (or ommitted or
inserted) is excessively difficult though. You might want to Google for
"SOUNDEX" - this will generate a text string which approximates the
pronunciation of English words, and will match your examples. However it will
also find that "Sheep" and "Soap" are the same...

John W. Vinson [MVP]
 
G

greeknl via AccessMonster.com

Thank you for this prompt reply.
I think I will skip the "SOUNDEX" as half my database is in greek
Thanks again
works fine. but I would like to know if it is possible to do the same but
only starting from the second, third on nth place inside the name.
[quoted text clipped - 6 lines]
Papadopoulos
pepadopoulos

You can use the Mid() function instead of the Left() function, e.g.

SELECT Companies.CompaniesID, Companies.CompanyName
FROM Companies
WHERE (((Mid([Companies].[CompanyName],2,5)) In (SELECT Mid([CompanyName],2,5)
FROM [Companies] As Tmp GROUP BY Mid(CompanyName,2,5) HAVING Count(*)>1 )))
ORDER BY Companies.CompanyName;

This will match Microsoft to Picrosulfur (matching the five characters
starting with the second).

Matching "almost" matches, with one letter misspelled (or ommitted or
inserted) is excessively difficult though. You might want to Google for
"SOUNDEX" - this will generate a text string which approximates the
pronunciation of English words, and will match your examples. However it will
also find that "Sheep" and "Soap" are the same...

John W. Vinson [MVP]
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top