Karch
I need to find the fastest way, in terms of both storage and searching, to
determine whether a given string contains any member of a list of strings.
Think of it this way: I have a string such as the following:
"Smith said the dog belongs (or did belong) to the secretary, an employee of
the company."
Then the list contains the following (and this list is MUCH larger in the
real situation):
Adams
Alan
Jones
Jacobs
Smith
Thaxton
Vela
Zulu
I would need to stop the processing and return (true) as soon as Smith was
found. On the other hand, if the string was changed to the following, there
would be no match and I would return (false):
"Smitherson said the dog belongs (or did belong) to the secretary, an
employee of the company."
In the given string, I do know that a match must begin at a given point
(position zero), but I need to keep processing until I have exhausted the
candidate string from the list and seen what follows it, as shown above, to
prevent a false match.
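To pin down the behaviour described above, here is a minimal sketch in Python (chosen only for brevity). It assumes every list entry is a single alphanumeric word, so the leading token of the input can be cut out and looked up in a hash set. The names `leading_token`, `starts_with_listed_word`, and `NAMES` are hypothetical and exist only for this illustration.

```python
from typing import AbstractSet


def leading_token(text: str) -> str:
    """Return the run of alphanumeric characters at the start of text."""
    end = 0
    while end < len(text) and text[end].isalnum():
        end += 1
    return text[:end]


def starts_with_listed_word(text: str, words: AbstractSet[str]) -> bool:
    """True if the leading token of text is exactly one of the listed words.

    Assumption: list entries are single words with no spaces or punctuation.
    """
    return leading_token(text) in words


# The sample list from the question; the real list is much larger.
NAMES = frozenset(
    ["Adams", "Alan", "Jones", "Jacobs", "Smith", "Thaxton", "Vela", "Zulu"]
)

print(starts_with_listed_word(
    "Smith said the dog belongs (or did belong) to the secretary, "
    "an employee of the company.", NAMES))   # True
print(starts_with_listed_word(
    "Smitherson said the dog belongs (or did belong) to the secretary, "
    "an employee of the company.", NAMES))   # False
```

The work per call depends on the length of the leading token, not on the size of the list. Whether that is fast enough for millions of evaluations would have to be measured.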
I have played around with some different data structures, such as prefix and
suffix trees, but these work when you have a string that you are trying to
look up in a list, not the other way around. The approach must be very
performant because it will be evaluated millions of times. I am also okay
with an unsafe code approach, as long as it works. I just need each
evaluation to terminate as soon as possible rather than looping through
every single item in the list. Even IndexOf is too slow.
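One way a prefix tree could be turned around for this direction of matching is to walk the input string through the tree from position zero and stop at the first character that no list entry continues with. The following is a sketch in Python under stated assumptions, not a measured solution. It assumes a match only counts when the character after it is non-alphanumeric or the input ends. `END`, `build_trie`, and `matches_at_start` are hypothetical names.

```python
END = ""  # sentinel key marking "a listed word ends here"; never a real character


def build_trie(words):
    """Build a nested-dict trie from the list of words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def matches_at_start(text, trie):
    """True if text begins with a listed word followed by a boundary."""
    node = trie
    for ch in text:
        # A listed word has just ended and the next character is a boundary.
        if END in node and not ch.isalnum():
            return True
        node = node.get(ch)
        if node is None:
            return False  # no listed word continues with this character
    # Input exhausted: it matches only if a listed word ends exactly here.
    return END in node


trie = build_trie(
    ["Adams", "Alan", "Jones", "Jacobs", "Smith", "Thaxton", "Vela", "Zulu"]
)
print(matches_at_start("Smith said the dog belongs", trie))       # True
print(matches_at_start("Smitherson said the dog belongs", trie))  # False
```

Each call visits at most one tree node per input character and gives up at the first mismatch, so the list itself is never iterated. Unlike a lookup of a single leading token, this walk would also cope with list entries that contain spaces.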