Fuzzy Search Against List Of Company Names?

PeteCresswell

I've got a client that does bond and equity trading on behalf of
various "funds".

Some of these funds are owned by groups that do not care to invest
in certain companies.

Each of those groups supplies an explicit list of companies that they
do not want to invest in.

The traders don't have to make judgement calls. All they have to do
is check to see if a company is on the list before buying into it.

But if there are a lot of lists and/or some lists are very long - and
not always alphabetically sequenced - it becomes a problem.

In addition, the name of the company that the trader wants to buy
might not be spelled/rendered quite the same as it might be on
somebody's list.

What they want is a quick/easy way to check the lists.

Something like:
--------------------------------------------------------
- Trader specifies which group they're buying for.

- Trader enters the name - or some fragment thereof - of the
company they're thinking about buying into.

- The application presents a list of forbidden companies -
hopefully fewer than a dozen - based on some sort of fuzzy
matching against the list.

- The trader eyeballs that short list to see if the company they're
about to trade is on it.
---------------------------------------------------------
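
For illustration only, the lookup step in that workflow could be sketched in Python with the standard library's difflib. This is a sketch under assumptions, not a canned solution: the function name `shortlist`, the cutoff of 0.5, and every company name other than Bank of America are made up for the example.

```python
import difflib

def shortlist(query, forbidden, limit=12, cutoff=0.5):
    """Return up to `limit` forbidden names that resemble `query`."""
    q = query.casefold().strip()
    # Map a case-folded key back to the name as it appears on the list.
    by_key = {name.casefold(): name for name in forbidden}
    # Fragment hits first: any listed name that contains the typed text.
    hits = [name for key, name in by_key.items() if q in key]
    # Then near-spellings, ranked best-first by difflib's similarity ratio.
    for key in difflib.get_close_matches(q, list(by_key), n=limit, cutoff=cutoff):
        if by_key[key] not in hits:
            hits.append(by_key[key])
    return hits[:limit]

# Hypothetical list for one group.
names = ["Bank of America Corp", "Acme Tobacco Inc", "Globex Mining"]
print(shortlist("bank", names))
print(shortlist("Globex Minning", names))
```

The trader would still eyeball the result; the cutoff only controls how generous the short list is, and a lower value gives more rather than fewer candidates.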

Before I run off and develop this as an MS Access application, I
thought I'd ask around to see:
----------------------------------------
- If I'd be re-inventing the wheel

- What the best approach would be if/when I actually do it.
----------------------------------------

Seems like such a common situation that I'd be surprised if there
weren't a number of canned solutions out there.

Anybody have some thoughts on this?
 
John W. Vinson

In addition, the name of the company that the trader wants to buy
might not be spelled/rendered quite the same as it might be on
somebody's list.

Can't they provide the ticker or the CUSIP? That will at least be unique and
unambiguous. I'd hate to try to manage company names if the input is freeform
uncontrolled text input... there are just too many duplicate or near-duplicate
names!

John W. Vinson [MVP]
 
UpRider

Each of those groups supplies an explicit list of companies that they
do not want to invest in.

Tell the dimwits to use the CUSIP (Committee on Uniform Securities
Identification Procedures) number - that's what it's for.

UpRider
 
David Portwood

Perhaps a hierarchical categorization might help.

For instance, suppose the user wants to know if a particular company is on
the "proscribed" list. He looks at an opening list consisting of three very
distinct categories.

The user selects one of the three categories and then sees maybe a shortlist
of five subcategories, also very unambiguous, selects from these, and
finally pulls up a list of maybe nine or ten proscribed companies that the
user can quickly run his eye over.

Of course this depends on being able to unambiguously categorize your
companies. If the user has to scratch his head about which category a
company might belong to, it won't work too well.

Just a suggestion, FWIW.
 
AccessVandal via AccessMonster.com

Hi Pete,

Try looking from another angle.

Use the front end to upload the company table by a query that filters the
wanted list.
Or
Use the query to build a table in the front end.
Or
Just simply use the query to filter in the form’s record source.

Assuming the existing table is on the server or in a folder, and the company
table has a column/field named, let's say, “Reputation”, the input data would
be like “1”, “2”, “3”… and so on, with “1” being the lowest reputation and “3”
a higher reputation… and so on.

Hope it will get you moving along.
 
(PeteCresswell)

Per John W. Vinson:
I'd hate to try to manage company names if the input is freeform
uncontrolled text input.

You and me..... -)

But that's what we've got.

Ticker was the first thing that came to my mind - and they've got
that.... but for some reason it doesn't work for them. I leaned
on one of their team a little bit, but didn't get anywhere.

I think I'll put ticker and CUSIP on the search screen anyhow -
as drop down lists.

Right now, it looks like we'll just concoct some SQL on-the-fly
depending on what the user types into the freeform text search
box.

Bank America ==> LIKE *BANK* OR LIKE *AMERICA* OR LIKE
*BANKAMERICA*

"Bank Of America"? Haven't thought it through yet... but we'll
need to do something with more than two words..... maybe parse
them out and then concatenate them in every possible order.

The desired result is to give them more, rather than fewer
possible hits.
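
A rough sketch of that on-the-fly SQL idea, written in Python only to show the string building (the same logic would port to VBA). The field name `CompanyName`, the function name `build_where`, the small stop-word set, and the four-word cap are all assumptions for illustration; the `*` wildcard and double-quoted strings follow Access's default LIKE syntax.

```python
from itertools import permutations

def build_where(text, field="CompanyName", max_words=4,
                stop_words=("OF", "THE", "AND")):
    """Build an Access-style WHERE fragment from free-form search text."""
    words = []
    for raw in text.upper().split():
        # Keep letters and digits only, so quotes and LIKE wildcards
        # typed by the user cannot leak into the SQL.
        word = "".join(ch for ch in raw if ch.isalnum())
        if word and word not in stop_words and word not in words:
            words.append(word)
    words = words[:max_words]  # orderings grow factorially, so cap them

    patterns = list(words)  # each word on its own
    if len(words) > 1:
        # Every ordering of the words, run together without spaces.
        for combo in permutations(words):
            joined = "".join(combo)
            if joined not in patterns:
                patterns.append(joined)
    return " OR ".join(f'{field} LIKE "*{p}*"' for p in patterns)

print(build_where("Bank Of America"))
```

One thing the sketch makes visible: any name containing BANKAMERICA also contains BANK, so as long as the single-word patterns are OR'ed in, the concatenated patterns add no extra hits. They would only matter if the single words were dropped to tighten the search.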


CUSIP, I've found to be dicey in prior applications - especially
with bonds where apparently the same bonds (i.e. the same issuer
name) have different CUSIPs depending on various properties.
 
bsmith59

I'm not sure if this will work based on the use case you've provided,
but hopefully so. I had a similar problem recently with two huge
lists of customer names I needed to match. I created a function that
essentially walks backward through each name in a given set character
by character and stops when it hits a single match in the other
customer name set. If it gets to zero matches, it backs up to the
previous result set (i.e., if it went from two matches to zero, it goes
back to the two matches to present to the user). I made it to do bulk
processing, but the same logic could be applied to an on-the-fly
transaction as well I think. Performance might depend on where your
data sits. If this sounds helpful, let me know and I'll give you a
copy. I have to take out my data, make sure it's relatively usable
for you, that's why I'll need to know if it sounds like it would
work. I don't want to go to the trouble if it isn't. And if there
are others on this list that would like it, let me know and I'll just
post it for free. It was fun to create, and doesn't have to be for
company names, of course....any two sets of strings would work....
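
The actual function isn't posted in this thread, so the following is only a guess at the general shape of that narrowing logic, sketched in Python. It assumes the walk grows a prefix one character at a time (the description could equally mean working from the other end of the name), and the names `narrow`, `kept`, and `pool`, plus the sample names other than Bank of America, are hypothetical.

```python
def narrow(name, candidates):
    """Narrow `candidates` by matching ever-longer prefixes of `name`."""
    target = name.casefold()
    kept = []               # last non-empty result set seen so far
    pool = list(candidates)
    for end in range(1, len(target) + 1):
        prefix = target[:end]
        pool = [c for c in pool if c.casefold().startswith(prefix)]
        if not pool:
            break           # zero matches: back up to the previous set
        kept = pool
        if len(kept) == 1:
            break           # a single match: stop here
    return kept

# Hypothetical second list to match against.
others = ["Bank of America", "Bank of New York", "Barclays"]
print(narrow("Bank of Amerika", others))   # narrows to one name
print(narrow("Bank of Scotland", others))  # backs up to the two "Bank of" names
```

Prefix matching like this tolerates differences at the end of a name but not at the start, so a name entered as "The Bank of America" would return nothing from this sketch.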

Cheers,
Brandon
http://accesspro.blogspot.com
 
