[Regular Expression] Match a word with surrounding punctuation

teo

Hello,
I need to build a regular expression
that matches a specified word
when it is preceded or followed by one of a few characters (mainly punctuation).
The code is below.

------------------------

Explanation:

let's assume the word is
baby
these are the 27 cases I need to match:



baby        baby simply standing alone (obviously)



and also these cases:



baby.
baby,
baby;
baby:
baby!
baby?
baby        baby with a following blank space
baby-
baby"
baby'
baby\
baby/
baby)



and also these cases:


..Baby
,baby
;baby
:baby
!baby
?baby
 baby       baby with a preceding blank space
-baby
"baby
'baby
\baby
/baby
(baby



and also
the resulting regexp
must exclude the following cases:


babylon     baby followed by a letter
lonbaby     baby preceded by a letter
baby678     baby followed by a digit
678baby     baby preceded by a digit



-------------------

This is the code:
'I have to count all the occurrences in a long text.
'I tried this code, but it didn't work:



Dim myCounter As Integer = 0

Dim myMatch As Match = Nothing

Dim myLongText As String = "This morning ...blah...blah.... goodnight to everyone"

Dim myPattern As String = "[.;:/,-"!\?'( ]" & "baby" & "[.;:/,-"!\?') ]"




myMatch = Regex.Match(myLongText, myPattern, RegexOptions.Multiline Or RegexOptions.IgnoreCase)


Do While myMatch.Success
    myCounter = myCounter + 1
    myMatch = myMatch.NextMatch()
Loop

Debug.Write(myCounter.ToString())
 
Chris Chilvers

Dim myPattern As String = "[.;:/,-"!\?'( ]" & "baby" & "[.;:/,-"!\?') ]"

Does it specifically have to be that punctuation, or can it be any
non-alphanumeric character?

If any non-alphanumeric character will do, something like:
"\b" & word & "\b" should work.

See "regular expressions, atomic zero-width assertions" in the help.
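
A minimal sketch of the "\b" suggestion, for illustration only. The module name and the sample text are made up, and Regex.Escape is added on the assumption that the word might contain regex metacharacters. Note that \b treats any non-word character as a boundary, so it would also accept characters that are not in the list from the question (for example baby@).

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WordBoundarySketch
    Sub Main()
        Dim word As String = "baby"

        ' \b matches between a word character (letter, digit, underscore)
        ' and a non-word character, or at the start/end of the text.
        Dim pattern As String = "\b" & Regex.Escape(word) & "\b"

        ' Hypothetical sample text covering matching and excluded cases.
        Dim sample As String = "Baby, baby! babylon lonbaby baby678 678baby (baby)"

        ' Regex.Matches returns every match; Count gives the total.
        Dim count As Integer = Regex.Matches(sample, pattern, RegexOptions.IgnoreCase).Count

        Console.WriteLine(count)   ' prints 3: "Baby,", "baby!" and "(baby)"
    End Sub
End Module
```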
 
teo

I specifically have to match that punctuation.
It seems I have solved it
with this regexp:

[^\w] [\\\,;:!&?("'./- ] ? baby [\\\,;:!&?)"'/.- ] ? [^\w]
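
For comparison, here is a hedged sketch of another way to restrict the match to specific punctuation, using lookarounds rather than the pattern above. It assumes the spaces in the posted regexp are only there for readability. The helper name CountDelimited, the sample text, and the choice of \s for "blank space" are assumptions, not part of the original post. Two details matter when writing such a character class in VB.NET: a double quote inside a string literal has to be doubled, and a hyphen between two characters is read as a range (a sequence such as ,-" is a reversed range, which .NET rejects), so the hyphen is escaped here.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PunctuationMatchSketch
    ' Hypothetical helper: counts occurrences of word that are delimited
    ' only by the listed punctuation, whitespace, or the text boundaries.
    Function CountDelimited(source As String, word As String) As Integer
        ' Characters allowed directly before / after the word.
        ' "" is one literal quote in VB; \- is a literal hyphen; \\ is a backslash.
        Dim before As String = "[.,;:!?\-""'\\/(\s]"
        Dim after As String = "[.,;:!?\-""'\\/)\s]"

        ' Lookarounds test the neighbouring character without consuming it,
        ' so two words sharing one separator ("baby baby") are both counted.
        Dim pattern As String = "(?<=^|" & before & ")" &
                                Regex.Escape(word) &
                                "(?=$|" & after & ")"

        Return Regex.Matches(source, pattern, RegexOptions.IgnoreCase).Count
    End Function

    Sub Main()
        Dim sample As String = "Baby, oh (baby) babylon lonbaby baby678 ""baby"""
        Console.WriteLine(CountDelimited(sample, "baby"))   ' prints 3
    End Sub
End Module
```

Because the neighbours are not part of the match, the word is also found at the very start or end of the text, which a pattern requiring a real character on both sides would miss.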

----------------------------

I have another question:

I want to extract a given sequence.

The sequence occurs many times,
but I only need the very first occurrence.

To save computation time,
I'd like to tell the RegExp to stop the search immediately
once it has found that first occurrence, and return it.

Is there any option I can add to the RegExp to do this?
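
A short sketch that may help here, not a definitive answer: in .NET, Regex.Match returns only the first match, and later matches are searched for only when NextMatch is called, so no extra option should be needed. The sample text, the pattern, and the module name below are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FirstMatchSketch
    Sub Main()
        ' Hypothetical text in which the sequence occurs several times.
        Dim source As String = "id=17; id=42; id=99"

        ' Regex.Match stops at the first occurrence; the rest of the
        ' text is scanned only if NextMatch() is called afterwards.
        Dim firstMatch As Match = Regex.Match(source, "id=(\d+)")

        If firstMatch.Success Then
            Console.WriteLine(firstMatch.Groups(1).Value)   ' prints 17
        End If
    End Sub
End Module
```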
 
