Help with highlighting text using RegEx

Samuel Hon

Hi

I've been tearing my hair out over this problem; I'm not particularly
good with more complex regular expressions. Basically, I'm trying to
highlight search keywords.

For example:

http://www.google.co.uk/search?q=regular+expressions

Wherever the phrase "regular expressions" appears in the results, the
text is shown in bold. This I can achieve.

BUT

(and there had to be one) if a word appears in a URL, then I need to
ignore it. One of the results is www.regular-expressions.info (which I
have looked at). Using a simple expression changes the URL to

www.<B>regular</B>-<B>expressions</B>.info

which obviously is bad.

Any help would be appreciated

Thanks

Sam
 
Niki Estner

I think you need a negative lookbehind, something like:

"(?<!www\.\S*)regular"

This will match the word "regular" if it's not preceded by "www." followed
by any number of non-whitespace characters. (I'm not sure if this is enough,
but you should be able to fine-tune it.)

In general, a negative lookbehind "(?<!...)" prevents a match if the
"..." part can be matched immediately to the left of the "real" match.
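
As a rough illustration of how that lookbehind behaves, here is a minimal sketch. The sample sentence, the module name and the extension of the pattern to cover both words are assumptions made for the example, not part of the original problem. Note that .NET accepts \S* inside a lookbehind, which many other regex engines do not.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LookbehindSketch
    Sub Main()
        ' Negative lookbehind: skip a keyword when "www." plus any run of
        ' non-whitespace characters sits directly to its left.
        Dim objRegEx As New Regex("(?<!www\.\S*)(regular|expressions)", _
                                  RegexOptions.IgnoreCase)

        ' Assumed sample text: one plain phrase and one URL.
        Dim sample As String = _
            "Learn regular expressions at www.regular-expressions.info"

        ' Prints: Learn <B>regular</B> <B>expressions</B> at www.regular-expressions.info
        Console.WriteLine(objRegEx.Replace(sample, "<B>$1</B>"))
    End Sub
End Module
```

The words inside the URL are left alone because, at each of them, the lookbehind can match "www." followed by the non-whitespace text in between.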

Hope this helps

Niki
 
Samuel Hon

Thanks Niki

Unfortunately, I can't search for http:// or www because the search
value could be in the querystring or in the page name. I've been
trying to look for href= and then find the closing >.

I'll have a fiddle with your suggestion.

Thanks again
 
Samuel Hon

I have a slightly different approach now, but it is going wrong: the
MatchCollection has 8 values (correct), but they are all blank (incorrect).

Can anyone see what I'm blatantly missing?

Ta in advance


Dim strText As String = "This is my long string with a <a " & _
    "href=""http://localhost/test.aspx"">Test Link using quotes</a>,<BR>a " & _
    "<a href = 'http://localhost/test.aspx'>Test Link using apostrophes</a> " & _
    "<BR> and one with a <A " & _
    "HREF=""http://localhost/test.aspx?test"">querystring</a>."

Public Function Highlight(strText As String) As String

    Dim objRegEx As New Regex("<[^<>]*>", RegexOptions.IgnoreCase)

    'Find all tags and place in collection
    Dim colM As MatchCollection = objRegEx.Matches(strText)

    'Replace all tags with $!$
    Dim strReplaced As String = objRegEx.Replace(strText, "$!$")

    'Create new objRegEx looking for test
    objRegEx = New Regex("(test)", RegexOptions.IgnoreCase)

    'Replace all with highlight
    strText = objRegEx.Replace(strReplaced, _
        "<span style='background-color:#FFFF66;'>$1</span>")

    'Loop through all items in the collection replacing $!$
    For i As Integer = 0 To colM.Count - 1
        Dim m As Match = colM.Item(i)
        strText = strText.Replace("$!$", m.ToString()) & vbCrLf
    Next i

    'Return
    Return strText

End Function
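
One thing that stands out in the loop above, whether or not it explains the blank values: String.Replace swaps every "$!$" in one go, so the first pass puts the first tag into all eight slots and the later passes find nothing left to replace. A possible way around the placeholder step altogether is to match tags and the keyword with a single pattern and only wrap the keyword. The sketch below assumes that approach. The keyword parameter, the HighlightMatch helper and the module name are hypothetical, and the tag pattern is still the simple one from above, so a ">" inside an attribute value would confuse it.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module HighlightSketch
    ' Hypothetical helper: wraps keyword matches in a highlight span and
    ' passes anything that looks like a tag (<...>) through unchanged.
    Public Function Highlight(ByVal strText As String, _
                              ByVal keyword As String) As String
        ' First alternative consumes a whole tag; group 1 captures the
        ' keyword only when it appears outside a tag.
        Dim pattern As String = "<[^<>]*>|(" & Regex.Escape(keyword) & ")"
        Dim objRegEx As New Regex(pattern, RegexOptions.IgnoreCase)
        Return objRegEx.Replace(strText, AddressOf HighlightMatch)
    End Function

    ' Called once per match by Regex.Replace.
    Private Function HighlightMatch(ByVal m As Match) As String
        If m.Groups(1).Success Then
            Return "<span style='background-color:#FFFF66;'>" & _
                   m.Value & "</span>"
        End If
        ' The match was a tag, so return it untouched.
        Return m.Value
    End Function

    Sub Main()
        Dim html As String = _
            "A <a href=""http://localhost/test.aspx?test"">Test Link</a>"
        ' Prints: A <a href="http://localhost/test.aspx?test"><span style='background-color:#FFFF66;'>Test</span> Link</a>
        Console.WriteLine(Highlight(html, "test"))
    End Sub
End Module
```

Because the regex engine scans left to right and the tag alternative comes first, any "test" inside an href is swallowed as part of its tag before the keyword alternative gets a chance to match it.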
 
