Combining fields with similar initial characters

C

carl jordan

I have a database of about 55,000 company names. Different people at
different times and places made these entries. Some have a comma before the
"inc.," some do not. Some have periods between the letters LLC and others to
not; and some of those may or many not have the comma. Of course, similar
ones will constitute the same record. And there are other minor variations
as well. However, they create multiple records for what is in fact the same
company. I know from manually spot checking the list that they are the same
company.

My problem: I want to create a query that will count or combine all of
those that start alike, that will ignore, so to speak, the minor variations
at the end. I can do that individually, by company, if I want to, but I know
I will miss too many of them by having to look at the entire list manually.
For instance, I can run a search for "abc*" and get all of the abc companies,
regardless how they end. But I might miss the "lmn" companies, only because
I had not noticed that there were any.

How can I create a string in my query under "name" (in my db) where similar
names, regardless of how they start, will be joined into a new table?

Is that even possible?

Thanks in advance.
 
E

Evi

You could use
Left$([YourTextField],3)
to get the first 3 characters on their own and do a Find Duplicates query on
that

BTW if your field is really called Name you will have problems at some
stage. It's a reserved word.

Evi
 
C

carl jordan

Thank you. I think that will do the trick.

I can have 85-100 of one name similar to about 20 other similar names, and
so on. Choosing the right number does the trick. I think 15 is the one that
does the best for me.

Carl Jordan

Evi said:
You could use
Left$([YourTextField],3)
to get the first 3 characters on their own and do a Find Duplicates query on
that

BTW if your field is really called Name you will have problems at some
stage. It's a reserved word.

Evi

carl jordan said:
I have a database of about 55,000 company names. Different people at
different times and places made these entries. Some have a comma before the
"inc.," some do not. Some have periods between the letters LLC and others to
not; and some of those may or many not have the comma. Of course, similar
ones will constitute the same record. And there are other minor variations
as well. However, they create multiple records for what is in fact the same
company. I know from manually spot checking the list that they are the same
company.

My problem: I want to create a query that will count or combine all of
those that start alike, that will ignore, so to speak, the minor variations
at the end. I can do that individually, by company, if I want to, but I know
I will miss too many of them by having to look at the entire list manually.
For instance, I can run a search for "abc*" and get all of the abc companies,
regardless how they end. But I might miss the "lmn" companies, only because
I had not noticed that there were any.

How can I create a string in my query under "name" (in my db) where similar
names, regardless of how they start, will be joined into a new table?

Is that even possible?

Thanks in advance.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top