String matching (Intelligent)

  • Thread starter: anderskj1

anderskj1

Hey

I need a string comparison algorithm which can compare two strings and come
up with a number, e.g. a percentage, which indicates how much the strings are alike.

I could of course just compare the number of occurrences of each letter, but
then aab would be equal to aba. This is not good enough, so the algorithm also
has to compare the closeness of letters etc.
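
To illustrate the letter-count problem, here is a small Python sketch. The function name same_letter_counts is made up for this example and is not from any library; it only shows why counting letters ignores their order.

```python
from collections import Counter


def same_letter_counts(a: str, b: str) -> bool:
    """Return True when both strings contain the same letters the same number of times."""
    # Counter builds a letter -> count mapping, so the order of letters is lost.
    return Counter(a) == Counter(b)


print(same_letter_counts("aab", "aba"))              # True, although the strings differ
print(same_letter_counts("My string", "My strong"))  # False, but with no measure of how close they are
```

So a pure letter count can only answer yes or no on the letter inventory, which is why an order-aware measure is needed.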

I hope I can find a "finished" or skeleton algorithm, since I am convinced
that others, who have the time, could implement this much better than me.

Optimally I want something like this:
double closeness = SuperComparere("My string", "My strong"); // returns e.g. 93%

Has anyone heard of something like the above and know of sources for solutions?

Kind regards
Anders
 
I haven't done anything similar myself, but
http://en.wikipedia.org/wiki/Fuzzy_string_searching
might be useful to you.
 
What you are referring to is a rather difficult topic. I think (but am
not sure) that many SQL database systems provide something similar, but they
need to index the searchable data every once in a while. (PS: I found it
on Wikipedia:
http://en.wikipedia.org/wiki/SQL_Server_Full_Text_Search).

Anyway, the only thing I can really contribute to this thread is that
I think you should search for some existing code (.NET or third-party) that
does this, as it's a difficult task. Perhaps the Regex libraries provide
some help.

Good luck. Let us know what you chose to do.
 
Hi

Take a look on Google for Levenshtein distance calculation. From memory
it sounds remarkably close to what you're after.
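
For reference, a minimal sketch of that distance in Python, using the usual row-by-row dynamic programming. The similarity helper and its percentage formula (one minus distance divided by the longer length) are my own assumption about how to turn the distance into a score; other normalisations are possible, and this one will not necessarily give the 93% from the example in the question.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string so the stored row stays short
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (free if they match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance to a 0..100 score (assumed normalisation)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


print(levenshtein("My string", "My strong"))  # 1
print(levenshtein("aab", "aba"))              # 2
print(round(similarity("My string", "My strong"), 1))
```

Unlike a letter count, this tells aab and aba apart, because turning one into the other takes two edits. Porting it to .NET should be straightforward, since it only needs two integer arrays and two nested loops.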

Cheers
Martin
 