RegExp

Arshavir Grigorian

Hi,

I am using Microsoft VBScript RegExp 5.5 within VBA for Excel.

http://msdn2.microsoft.com/en-us/library/f97kw5ka(VS.85).aspx

"Matches pattern and remembers the match. The matched substring can be
retrieved from the resulting Matches collection, using Item [0]...[n].
To match parentheses characters ( ), use "\(" or "\)"." - from the
docs on the above page.

I am wondering if anyone could give me an example of using the Item
array to retrieve clustered matches. Like so:

regEx.Pattern = ".*(\d{1,2})Y.*"

I need to match one or two digits followed by 'Y', but I only need to
get the digits. In regEx.Replace() I would use $1 to access the first
cluster (the match inside the innermost set of parens), but I am not
sure how to do it with regEx.Execute().

Thanks.
 
Ron Rosenfeld

Something like this is one method that will return your group 1. Note that
SubMatches.Count is the number of capture groups, but the SubMatches collection
itself is zero-based, so group 1 is SubMatches(0).

========================
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5".
' SubjectString is assumed to already hold the text to search.
Dim ResultString As String
Dim myMatches As MatchCollection, myMatch As Match
Dim myRegExp As RegExp
Set myRegExp = New RegExp
myRegExp.Pattern = ".*(\d{1,2})Y.*"
Set myMatches = myRegExp.Execute(SubjectString)
If myMatches.Count >= 1 Then
    Set myMatch = myMatches(0)
    If myMatch.SubMatches.Count >= 1 Then
        ' Group 1 lives at index 0 of the SubMatches collection
        ResultString = myMatch.SubMatches(0)
    Else
        ResultString = ""
    End If
Else
    ResultString = ""
End If
==========================

However, your regex will not do what you want. Group 1 will only ever capture
a single digit: the one immediately preceding the Y.

See if you can figure out why, or look below for explanation:















Look up the difference between greedy and lazy quantifiers.













The first part of your regex, ".*", says:

.*

Match any single character that is not a line break character «.»
   Between zero and unlimited times, as many times as possible, giving back as
   needed (greedy) «*»

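So the leading ".*" matches as much as it can, which includes all the digits
except for the single digit immediately preceding the Y. It gives back only that
one digit, because \d{1,2} is already satisfied by a single digit.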
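
To see the effect described above, here is a minimal sketch that runs the greedy pattern and a lazy variant against the same input. FirstGroup, DemoGreedyVsLazy and the sample string are made up for illustration, and late binding via CreateObject is assumed so that no library reference is needed.

```vba
' Hypothetical helper: returns group 1 of the first match, or "" if none.
Function FirstGroup(ByVal pat As String, ByVal subject As String) As String
    Dim re As Object, ms As Object
    Set re = CreateObject("VBScript.RegExp")   ' late-bound RegExp object
    re.Pattern = pat
    Set ms = re.Execute(subject)
    FirstGroup = ""
    If ms.Count > 0 Then
        If ms(0).SubMatches.Count > 0 Then FirstGroup = ms(0).SubMatches(0)
    End If
End Function

Sub DemoGreedyVsLazy()
    Const SAMPLE As String = "abc 12Y def"               ' made-up test string
    Debug.Print FirstGroup(".*(\d{1,2})Y.*", SAMPLE)     ' greedy prefix prints 2
    Debug.Print FirstGroup(".*?(\d{1,2})Y.*", SAMPLE)    ' lazy prefix prints 12
End Sub
```

Run DemoGreedyVsLazy from the Immediate window to compare the two results.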


While you could certainly correct this by making the quantifier lazy (e.g.
".*?"), there's really no need for that at all.

You could accomplish your stated goal of capturing one or two digits, which are
followed by a Y, with the simpler regex:

"(\d{1,2})Y"

If you might have more than one such construct in a line, then with

myRegExp.Global = True

myMatches.Count will give you the number of times that pattern was present in
the line.
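
As a rough sketch of the Global case, the loop below collects group 1 from every match. DemoGlobalMatches and the sample string are invented for illustration, and late binding is assumed.

```vba
Sub DemoGlobalMatches()
    Dim re As Object, ms As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")   ' late-bound RegExp object
    re.Pattern = "(\d{1,2})Y"
    re.Global = True                           ' find every match, not just the first
    Set ms = re.Execute("5Y 12Y 7Y")           ' made-up test string
    Debug.Print ms.Count                       ' prints 3
    For Each m In ms
        Debug.Print m.SubMatches(0)            ' prints 5, then 12, then 7
    Next m
End Sub
```

Without Global = True, Execute stops after the first match and Count would be 1.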
--ron
 
Arshavir Grigorian

Thanks, Ron. That's some elaborate code.

On a related note, the following regex is intended to capture "IN (23,
3454, 354)" or "IN (?)". However, it only captures "IN (23, 3454, 354"
in the first case and "?)" in the second case. Any ideas why?

"IN \(((\w+,?\s*)+)|(\?)\)"

I am accessing the match through

Set myMatch = myMatches(0)
MsgBox ("value: " & myMatch.Value)

By the way, "(IN \(\?\))|(IN \((\w\s*,?\s*)+\))" seems to work fine.
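
A likely explanation, sketched below: "|" has the lowest precedence in a regex, so the first pattern splits into two complete alternatives, "IN \(((\w+,?\s*)+)" and "(\?)\)". The closing "\)" belongs only to the second one. Moving the alternation inside the parentheses is one possible fix. FirstMatchValue, DemoAlternationScope and the GROUPED pattern are illustrative names and assumptions, not something from the posts above.

```vba
' Hypothetical helper: returns the text of the first match, or "" if none.
Function FirstMatchValue(ByVal pat As String, ByVal subject As String) As String
    Dim re As Object, ms As Object
    Set re = CreateObject("VBScript.RegExp")   ' late-bound RegExp object
    re.Pattern = pat
    Set ms = re.Execute(subject)
    FirstMatchValue = ""
    If ms.Count > 0 Then FirstMatchValue = ms(0).Value
End Function

Sub DemoAlternationScope()
    ' Pattern from the post: "|" splits the WHOLE pattern in two
    Const ORIGINAL As String = "IN \(((\w+,?\s*)+)|(\?)\)"
    ' Assumed fix: alternation confined to the inside of the parentheses
    Const GROUPED As String = "IN \(((\w+,?\s*)+|\?)\)"

    Debug.Print FirstMatchValue(ORIGINAL, "IN (23, 3454, 354)")  ' IN (23, 3454, 354
    Debug.Print FirstMatchValue(ORIGINAL, "IN (?)")              ' ?)
    Debug.Print FirstMatchValue(GROUPED, "IN (23, 3454, 354)")   ' IN (23, 3454, 354)
    Debug.Print FirstMatchValue(GROUPED, "IN (?)")               ' IN (?)
End Sub
```

The second pattern in the post works for the same reason: each alternative there is wrapped in its own parentheses and carries its own "IN \(" and "\)".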





 
