Routine with fuzzy logic to determine the relative comparison of two strings?

  • Thread starter Elmer Smurdley

Elmer Smurdley

Does anyone have a canned subroutine that can determine how close a fit one
string is to another?

For example, if one string has one or more capital letters and the other doesn't,

or one string has an "&" and the other has "and",

or one starts with "The" and the other doesn't,

or one has a "2" and the other has "II",

etc.

Thanks in advance...
 
Those hardly constitute close fits, though, do they?

Sorry, I don't have anything like that.

Pete
 
Hi. Not sure if this is what you want, but in Math, one option is the
"Levenshtein distance."

http://en.wikipedia.org/wiki/Levenshtein_distance

So, for example:

EditDistance["Kitten","sitting"]
3

EditDistance["Saturday","Sunday"]
3

EditDistance["Saturday","sunday",IgnoreCase->False]
4
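
For readers without access to EditDistance, the same measure can be sketched in a few lines. This is a minimal Python sketch of the textbook dynamic-programming approach, not the code behind the function above; the name `levenshtein` and the `ignore_case` parameter are assumptions made for illustration.

```python
def levenshtein(a: str, b: str, ignore_case: bool = False) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if ignore_case:
        a, b = a.lower(), b.lower()
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("Kitten", "sitting"))                     # 3
print(levenshtein("Saturday", "Sunday"))                    # 3
print(levenshtein("Saturday", "sunday"))                    # 4
print(levenshtein("Saturday", "sunday", ignore_case=True))  # 3
```

The comparison is case-sensitive unless `ignore_case=True` is passed, which is why "Saturday" against "sunday" costs one more edit than against "Sunday".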
 
Thanks for that, Dana - I'd never heard of it before. I wonder how one
would cope with the example the OP gave of one string containing 2 and
the other containing ii (or <space>ii<space>)?

Pete
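
One possible way to handle cases like "2" versus "II" is to normalize both strings before measuring any distance, so that known equivalents become identical. The following is only a sketch in Python under that assumption; the `normalize` function and the `ROMAN_NUMERALS` table are hypothetical names, and the table would need extending for real data.

```python
import re

# Hypothetical equivalence table; extend it to suit your own data.
ROMAN_NUMERALS = {"ii": "2", "iii": "3", "iv": "4"}


def normalize(text: str) -> str:
    """Reduce a string to a canonical form before comparing it."""
    text = text.lower()                     # ignore capitalization
    text = text.replace("&", " and ")       # treat "&" as "and"
    words = re.findall(r"[a-z0-9]+", text)  # keep letter/digit runs, drop punctuation
    if words and words[0] == "the":         # ignore a leading "The"
        words = words[1:]
    # Map whole words only, so the "ii" inside "hawaii" is left alone.
    words = [ROMAN_NUMERALS.get(word, word) for word in words]
    return " ".join(words)


print(normalize("The Godfather Part II") == normalize("godfather part 2"))  # True
print(normalize("Simon & Garfunkel") == normalize("Simon and Garfunkel"))   # True
```

Strings that still differ after normalization can then be passed to an edit-distance function to judge how close they are.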

 
...one string containing 2 and the other containing ii.

Hi. Not sure on that one.
Anyway, such algorithms are used in different ways.
One example would be a function that links to a dictionary and uses that
"distance" algorithm as a default.

NearWords = Nearest[DictionaryLookup[__]];
Hence:

NearWords["Excel",10]

{"excel","Axel","Edsel","Ethel","excels","Exocet","expel","Abel","ace","aced"}

--
Dana DeLouis
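
The dictionary idea above can be approximated outside Mathematica by ranking a word list by edit distance. This is a rough Python sketch, not a reproduction of Nearest or DictionaryLookup: the `WORDS` list is a tiny hypothetical stand-in for a real dictionary, `near_words` is an assumed name, and the comparison here ignores case.

```python
import heapq


def levenshtein(a: str, b: str) -> int:
    """Edit distance using insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


# Hypothetical stand-in for a real dictionary word list.
WORDS = ["excel", "expel", "excels", "axel", "apple", "table"]


def near_words(target: str, words: list[str], count: int) -> list[str]:
    """Return the `count` words closest to `target`, ignoring case."""
    target = target.lower()
    return heapq.nsmallest(
        count, words, key=lambda word: levenshtein(target, word.lower())
    )


print(near_words("Excel", WORDS, 4))  # ['excel', 'expel', 'excels', 'axel']
```

Scoring every word is fine for a short list; a full dictionary would likely call for an index or a cutoff on word length to avoid comparing against every entry.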


 
Thanks, Dana.

Both of the algorithms mentioned are very close to what I need and a
very good starting point for resolving some of the other issues.


 
Don't know what happened to my earlier post, but this response was
right on the money.

Thanks, Dana
 