Find a word in a string

Chris Mahoney

Hi

I have a string containing a phrase, and I want to search for a
particular word and get its index. I've tried a couple of things
already:

InStr(phrase, word) - this doesn't work because it finds "subwords",
e.g. if I search for "colour" in a phrase containing "colourful", it'll
still detect it.

Regex.IsMatch(phrase, "\b" & word & "\b") - this doesn't work because
it only returns True or False, and doesn't actually tell me where in
the phrase the word is.

Basically I seem to be looking for a "hybrid" solution; one that will
find a whole word, and return its index. What should I be doing?

Thanks
Chris
 
Jay Parzych

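Wouldn't you want to search for " colour ", with a space before and after?
Note that this would miss the word if it is the first word of the phrase, or
if it is followed by punctuation at the end of a sentence or question.

' IndexOf returns -1 when nothing is found, and 0 is a valid position
If phrase.IndexOf(" colour ") >= 0 Then
    found = True
End If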
 
rowe_newsgroups

Why not use something like:

' Typed in message
Dim m As Match = Regex.Match(phrase, String.Format("\b{0}\b", word))

Then you can access both the index of the match and its length from
the Match object.
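
To make that concrete, here is a minimal sketch of reading the position from the Match object. The sample phrase and the module name are made up for illustration, and wrapping the word in Regex.Escape is an added precaution (not part of the suggestion above) in case the word contains regex metacharacters.

```vb
Imports System
Imports System.Text.RegularExpressions

Module WholeWordDemo
    Sub Main()
        ' Sample data, assumed for illustration only
        Dim phrase As String = "A colourful use of colour."
        Dim word As String = "colour"

        ' \b on both sides restricts the match to whole words
        Dim m As Match = Regex.Match(phrase, "\b" & Regex.Escape(word) & "\b")

        If m.Success Then
            ' Match.Index is zero-based; add 1 to compare it with InStr
            Console.WriteLine("Found at index {0}, length {1}", m.Index, m.Length)
        Else
            Console.WriteLine("Not found")
        End If
    End Sub
End Module
```

With this sample phrase the "colour" inside "colourful" is skipped and the program prints "Found at index 19, length 6". Checking m.Success matters because a failed match also reports an Index of 0.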

Thanks,

Seth Rowe
 
Chris Mahoney

Thanks Seth, that's got me started. As you can probably tell, I'm new
to Regex :)

Now I'm trying to figure out how to add a "start point" to the Regex
search. With InStr, I could do InStr(10, phrase, word) and it'd start
searching from the 10th position (ignoring anything prior). Is there a
way to accomplish this with Regex too?

Jay, I'd considered using a space but the word I'm searching for may
be at the front of the paragraph or surrounded by punctuation. Using
\b in a Regex appears to allow for this.

Chris
 
rowe_newsgroups

I don't know of a way to define a start point for regex, except by
passing a substring of the phrase into the function, but then you'll
need to adjust the index to find the original position in the string.
You will also take a minor performance hit for creating the substring
(this won't matter unless you do it millions of times back to back).
Besides, regex runs extremely fast, so you probably won't notice the
difference between changing the start points.

Also, you may want to look at downloading Expresso. It's a great tool for
building/evaluating regular expressions, and it is definitely helping
me out with my current project.

http://www.ultrapico.com/Expresso.htm

Thanks,

Seth Rowe
 
Chris Mahoney

It's not the cleanest code ever, but I've written a new InStr that
accepts a Regex and can take indices :)

Function RegexInStr(ByVal Start As Integer, ByVal String1 As String, ByVal String2 As String) As Integer
    ' Like InStr: positions are 1-based and 0 means "not found"
    If Start < 1 OrElse String1.Length < Start Then Return 0
    Dim intCharsToRemove As Integer = Start - 1
    ' Search only the part of the string from Start onwards
    Dim strToTest As String = String1.Substring(intCharsToRemove)
    Dim m As System.Text.RegularExpressions.Match = System.Text.RegularExpressions.Regex.Match(strToTest, String2)
    ' A failed match also has Index = 0, so test Success rather than the index
    If Not m.Success Then Return 0
    ' Convert back to a 1-based position in the original string
    Return m.Index + 1 + intCharsToRemove
End Function
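
For anyone who would rather avoid the substring, a possible alternative is the instance overload Regex.Match(input, startat), which begins searching at a zero-based offset and reports Index relative to the whole string. The sketch below is only an illustration: WholeWordInStr, the module name and the sample phrase are hypothetical, and it takes a plain word (escaped with Regex.Escape) rather than a ready-made pattern. As far as I know, \b can still see the character before the start offset with this overload, so starting in the middle of a word should not produce a false whole-word match the way a substring can.

```vb
Imports System
Imports System.Text.RegularExpressions

Module StartAtDemo
    ' Hypothetical helper: 1-based like InStr, returns 0 when nothing is found
    Function WholeWordInStr(ByVal start As Integer, ByVal phrase As String, ByVal word As String) As Integer
        If start < 1 OrElse start > phrase.Length Then Return 0
        Dim rx As New Regex("\b" & Regex.Escape(word) & "\b")
        ' startat is zero-based, so convert from the 1-based start
        Dim m As Match = rx.Match(phrase, start - 1)
        If m.Success Then Return m.Index + 1
        Return 0
    End Function

    Sub Main()
        ' Sample data, assumed for illustration only
        Dim phrase As String = "colour, then more colour"
        Console.WriteLine(WholeWordInStr(1, phrase, "colour"))
        Console.WriteLine(WholeWordInStr(2, phrase, "colour"))
    End Sub
End Module
```

With this sample phrase the two calls print 1 and 19: the first finds the word at the very start, the second skips it and finds the later occurrence.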

My problem is now solved, and I thought I'd just post my code here in
case someone else runs into a similar problem.

Thanks for all your help!
Chris
 
