duplicates

  • Thread starter Thread starter Jesse
  • Start date Start date
J

Jesse

hello,

when i create a query looking for duplicates, it is looking for
duplicates of exact values. here's an example.
"Metal Industries"
"Metal Industries Inc."

those to me are duplicates, but access is not pointing those out. it
will only point out when two entries are exactly the same. could
anybody help me?

jesse
 
There's no programatically do this since there can be many combinations. You
have to do it multiple times.

You can chose let say if the first five letters are the same, then
considered as duplicates.

Say, you have a query with left functions: left(company,5). then access will
know that
abcdef
abcdef inc.

are duplicates since the first 5 letters are the same.

but what if there are similar companies:
abc
abc inc.

this one access won't tell because the company names on 3 letters instead of
5. This one, you have to create another query with 3 letters limit:
left(company,3).

Thus, if you want to clean your database, you have to do this multiple
times, one by one, see if there are more combinations in the data.

Hope this helps.
 
Thank you. How do I enter in "left(company,5)" in design view. I'm
assuming you have to enter it in design view, as opposed to using the
wizard. I apologize, I am new at this. Thank you.

Jesse
 
Jesse said:
Thank you. How do I enter in "left(company,5)"
in design view. I'm assuming you have to enter it
in design view, as opposed to using the
wizard.

If you have two tables, CompaniesX, where you suspect there will not be any
extra words (though how you'd guess on a full-table basis, I wouldn't know),
and CompaniesY, where you suspect there will be extra words or characters.
Then, in the Criteria line of the corresponding Field "Company" dragged down
from the CompanyX table, use

Like Left([CompaniesY.Company],5) & "*"

But please note, this will not match any Records where the extra information
is in Table CompaniesX, nor will it match on those where the name in
CompanyX is less than 5 characters long... e.g. "Rave" in CompaniesX and
"Rave On" in CompaniesY will not match.

You could perhaps experiment with more complex expressions to improve the
matching. But there are many possible combinations... in the example above,
suppose the matching Company Fields were "The Rave" and "Rave On", which
would need some careful additional analysis.
I apologize, I am new at this.

We were all new at this stuff, sometime. {For some of us, that was a _long
time ago_, though. :-)}
Thank you.

You are welcome.

Larry Linson
Microsoft Access MVP
 
For the special case where either table may have a shorter version of
the name in the other (e.g. one has
Acme Steel
and the other
Acme Steelworks Inc
or vice versa), one can use InStr() to check whether one value is
contained in the other. In the query design grid, this would be two
criteria in two "Or" rows:

Instr(CompaniesX.Company, CompaniesY.Company) > 0
InStr(CompaniesY.Company, CompaniesX.Company) > 0


Jesse said:
Thank you. How do I enter in "left(company,5)"
in design view. I'm assuming you have to enter it
in design view, as opposed to using the
wizard.

If you have two tables, CompaniesX, where you suspect there will not be any
extra words (though how you'd guess on a full-table basis, I wouldn't know),
and CompaniesY, where you suspect there will be extra words or characters.
Then, in the Criteria line of the corresponding Field "Company" dragged down
from the CompanyX table, use

Like Left([CompaniesY.Company],5) & "*"

But please note, this will not match any Records where the extra information
is in Table CompaniesX, nor will it match on those where the name in
CompanyX is less than 5 characters long... e.g. "Rave" in CompaniesX and
"Rave On" in CompaniesY will not match.

You could perhaps experiment with more complex expressions to improve the
matching. But there are many possible combinations... in the example above,
suppose the matching Company Fields were "The Rave" and "Rave On", which
would need some careful additional analysis.
I apologize, I am new at this.

We were all new at this stuff, sometime. {For some of us, that was a _long
time ago_, though. :-)}
Thank you.

You are welcome.

Larry Linson
Microsoft Access MVP
 
I am not using two tables though. I have one table, where I am
checking one column (ie. "A1") for duplicate values. As mentioned in
above post, I am just concerned that values such as "Border Inc." and
"Border" will not be recognized as duplicates.
 
I tried using Left([company],3) in the criteria... (which i
substituted into the criteria of the duplicate query which i created
from the wizard). Here is the criteria line.

In (SELECT Left([company],5) FROM [Addresses] As Tmp GROUP BY [F1]
HAVING Count(*)>1 )

It is not functioning correctly. So here is the original criteria
line as the wizard set up for me...

In (SELECT [F1] FROM [Addresses] As Tmp GROUP BY [F1] HAVING
Count(*)>1 )

How do I modify this criteria line to detect two companies such as
"Border" and "Border Inc." as duplicates?
 
I am not using two tables though. I have one table, where I am
checking one column (ie. "A1") for duplicate values. As mentioned in
above post, I am just concerned that values such as "Border Inc." and
"Border" will not be recognized as duplicates.

Here are a couple of starting points. This one finds pairs of records
where the value of A1 in one record is contained in the value of A1 in
another record:

SELECT A.*
FROM MyTable AS A, MyTable AS B
WHERE A.ID <> B.ID
AND (
Instr(A.A1, B.A1) > 0
OR
InStr(B.A1, A.A1) > 0
)
;

This is a standard "find duplicates" query/subquery modified to look
only at the first five characters of the field:

SELECT *
FROM MyTable
WHERE ((Left(MyTable.A1, 5) IN (
SELECT Left(A1, 5)
FROM MyTable As Tmp
GROUP BY Left(A1, 5)
HAVING Count(*)>1 ))
)
ORDER BY MyTable.A1;
 
Try this:
make sure table name is 'address' and the field is 'company'

SELECT Left([company],5) As TMP, Count([company]) as Dups
FROM [Addresses]
GROUP BY Left([company],5)
HAVING Count([company])>1
 
Thanks John. I created a Duplicate query using the wizard, and
substituted your second method into the SELECT line, leaving the other
lines as the wizard created them. This is the full query code that
worked for me:

SELECT Addresses.[Company Name], Addresses.[Street Address or Box],
Addresses.[City, PR], Addresses.[Postal Code], Addresses.[Category]
FROM Addresses
WHERE ((Left([Addresses].[Company Name],3) IN (SELECT Left([Company
Name],3) FROM [Addresses] As Tmp GROUP BY Left([Company Name],3)
HAVING Count(*)>1 )))
ORDER BY Addresses.[Company Name];

What coding language is all of this anyway? Is this MySQL? Or are
they just similar?

Anyways, thank you ALL for your help.
Take care.

Jesse
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top