Regex hangs

  • Thread starter: Vidar Skjelanger

Vidar Skjelanger

I have a regex for matching VB6-functions, but it hangs on one
specific function.
The regex:

^(Public\s)?(Declare\s|Static\s)?(Function)\s(\S)+\(((Optional\s)?(ByVal\s|ByRef\s)?(ParamArray\s)?(\S)+(\sAs\s(\S)+)?(\s=\s(\S)+)?(\,\s)?)*\)(\sAs\s(\S)+)?

The string on which it hangs:

"Public Function FlaggAsOccupiedEx2(Pid As Long, UserName As String,
Workstation As String, Optional IsReadOnly As Boolean = False,
Optional Path As String = "", Optional Reason As String = "No
reason...") As Boolean"

A similar string, but lacking the last parameter, does not fail:

"Public Function FlaggAsOccupiedEx(Pid As Long, UserName As String,
Workstation As String, Optional IsReadOnly As Boolean = False,
Optional Path As String = "") As Boolean"

Any ideas? Is this a bug?
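
One way to investigate a pattern like this without locking up the process is to give the match a time budget. The following is a minimal sketch, assuming .NET Framework 4.5 or later, where the static Regex methods accept a matchTimeout argument. The class name HangRepro, the helper Probe, and the two-second budget are made up for illustration; the pattern and both declarations are the ones quoted above, joined onto single lines.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HangRepro
{
    // The original pattern from the question, unchanged.
    public const string Pattern =
        @"^(Public\s)?(Declare\s|Static\s)?(Function)\s(\S)+\(((Optional\s)?(ByVal\s|ByRef\s)?(ParamArray\s)?(\S)+(\sAs\s(\S)+)?(\s=\s(\S)+)?(\,\s)?)*\)(\sAs\s(\S)+)?";

    // The declaration that is reported to match without trouble.
    public const string ShortInput =
        "Public Function FlaggAsOccupiedEx(Pid As Long, UserName As String, " +
        "Workstation As String, Optional IsReadOnly As Boolean = False, " +
        "Optional Path As String = \"\") As Boolean";

    // The declaration that is reported to hang.
    public const string LongInput =
        "Public Function FlaggAsOccupiedEx2(Pid As Long, UserName As String, " +
        "Workstation As String, Optional IsReadOnly As Boolean = False, " +
        "Optional Path As String = \"\", Optional Reason As String = \"No reason...\") As Boolean";

    // Returns "match", "no match", or "timeout" if the budget is exceeded.
    public static string Probe(string input, TimeSpan budget)
    {
        try
        {
            return Regex.IsMatch(input, Pattern, RegexOptions.None, budget)
                ? "match"
                : "no match";
        }
        catch (RegexMatchTimeoutException)
        {
            // The engine gave up after the budget instead of running on.
            return "timeout";
        }
    }

    public static void Main()
    {
        TimeSpan budget = TimeSpan.FromSeconds(2); // assumed budget, adjust as needed
        Console.WriteLine("short: " + Probe(ShortInput, budget));
        Console.WriteLine("long:  " + Probe(LongInput, budget));
    }
}
```

A result of "timeout" for one input next to a quick "match" for a near-identical one would point at the pattern's backtracking behaviour on that input rather than at a fault in the regex engine.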
 
Probably more likely that you've got a circular backreference thingy in there or something; can't be arsed to analyse it, though. Probably something to do with "lookbehind" (whatever that does).
 
Hi,
comments inline.

Vidar Skjelanger said:
I have a regex for matching VB6-functions, but it hangs on one
specific function.
The regex:
^(Public\s)?(Declare\s|Static\s)?(Function)\s(\S)+\(((Optional\s)?(ByVal\s|ByRef\s)?(ParamArray\s)?(\S)+(\sAs\s(\S)+)?(\s=\s(\S)+)?(\,\s)?)*\)(\sAs\s(\S)+)?

You probably don't want (\S)+, which repeats a one-character group and so captures every character separately; use (\S+) instead, which captures the whole word.
For the parameter name and the parameter type you want anything except whitespace or a comma: [^\s,]
For the optional default value you do want to allow spaces, but not a comma or ')': [^,\)]
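
The difference between these constructs is easy to see on small inputs. The following is a minimal C# sketch; the class name CaptureDemo and the helper Group1 are made up for illustration, and the test strings are fragments of the declaration from the question. It also hints at why the original pattern struggled: \S cannot cross the space inside "No reason...", so the overall match probably fails at that point, and the engine then retries the many ways a repeated (\S)+ can split each word.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Returns the final value of group 1 after matching pattern against input.
    public static string Group1(string pattern, string input)
    {
        return Regex.Match(input, pattern).Groups[1].Value;
    }

    public static void Main()
    {
        // (\S)+ repeats a one-character group, so group 1 keeps only the last character.
        Console.WriteLine(Group1(@"(\S)+", "UserName"));   // prints: e

        // (\S+) puts the repetition inside the group, so group 1 holds the whole word.
        Console.WriteLine(Group1(@"(\S+)", "UserName"));   // prints: UserName

        // Tail of the last parameter in the declaration from the question.
        string tail = " = \"No reason...\")";

        // \S stops at the space inside the string literal.
        Console.WriteLine(Group1(@"\s=\s(\S+)", tail));      // prints: "No

        // [^,\)] allows spaces and stops at a comma or the closing parenthesis.
        Console.WriteLine(Group1(@"\s=\s([^,\)]+)", tail));  // prints: "No reason..."
    }
}
```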

With some minor changes the regex works again:

^(Public\s)?(Declare\s|Static\s)?(Function)\s(\S+)\(((Optional\s)?(ByVal\s|ByRef\s)?(ParamArray\s)?([^\s,]+)(\sAs\s([^\s,]+))?(\s=\s([^,\)]+))?(\,\s)?)*\)(\sAs\s(\S+))?
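
As a quick check, the corrected pattern can be run against the declaration from the question. The following is a minimal C# sketch; the class and member names are made up for illustration, and the group numbers (4 for the function name, 9 for parameter names, 13 for default values, 16 for the return type) are counted by hand from the opening parentheses of the pattern, so treat them as an assumption to verify.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FixedPatternDemo
{
    // The corrected pattern from the post, joined onto one line.
    public const string Pattern =
        @"^(Public\s)?(Declare\s|Static\s)?(Function)\s(\S+)\(((Optional\s)?(ByVal\s|ByRef\s)?(ParamArray\s)?([^\s,]+)(\sAs\s([^\s,]+))?(\s=\s([^,\)]+))?(\,\s)?)*\)(\sAs\s(\S+))?";

    // The declaration from the question, joined onto one line.
    public const string Input =
        "Public Function FlaggAsOccupiedEx2(Pid As Long, UserName As String, " +
        "Workstation As String, Optional IsReadOnly As Boolean = False, " +
        "Optional Path As String = \"\", Optional Reason As String = \"No reason...\") As Boolean";

    public static Match Run()
    {
        return Regex.Match(Input, Pattern);
    }

    public static void Main()
    {
        Match m = Run();
        Console.WriteLine("Success={0}", m.Success);
        Console.WriteLine("Name={0} ret-type={1}", m.Groups[4].Value, m.Groups[16].Value);

        // A group inside a repeated group records one capture per iteration.
        Console.WriteLine("names captured={0} defaults captured={1}",
            m.Groups[9].Captures.Count,
            m.Groups[13].Captures.Count);
    }
}
```

Groups 9 and 13 each expose their own Captures collection, but the two collections are separate lists, so nothing in the match itself ties a given default value to a parameter name.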


Depending on what you want to do, it may not be useful, because you can't
tell which optional value belongs to which parameter.
If you need more than that, you can do it in two phases: first get the
parameter string, then use Regex.Matches to parse the parameter string into
parameters. This way each parameter has its own Match:

// Phase 1: split the declaration into name, modifiers, raw parameter list and return type.
Match m = Regex.Match(strInput,
    @"(?<public>Public\s)?(?<ds>Declare\s|Static\s)?Function\s(?<ftn>\S+)\((?<params>[^\)]*)\)(\sAs\s(?<ret>\S+))?",
    RegexOptions.ExplicitCapture);

Console.WriteLine("Name={0} public={1} ds={2} ret-type={3}",
    m.Groups["ftn"].Value,
    m.Groups["public"].Success,
    m.Groups["ds"].Value,
    m.Groups["ret"].Value);

// Phase 2: parse parameters; each parameter becomes its own Match.
MatchCollection mc = Regex.Matches(m.Groups["params"].Value,
    @"(?<opt>Optional\s)?(?<mod>ByVal\s|ByRef\s)?(?<pa>ParamArray\s)?(?<name>[^\s,]+)(\sAs\s(?<type>[^\s,]+))?(\s=\s(?<optv>[^,]+))?(,\s)?",
    RegexOptions.ExplicitCapture);

foreach (Match p in mc)
{
    Console.WriteLine("\tName={0} Modifier={1} ParamArray={2} Type={3} Optional={4} OptionalValue={5}",
        p.Groups["name"].Value,
        p.Groups["mod"].Value,
        p.Groups["pa"].Success,
        p.Groups["type"].Value,
        p.Groups["opt"].Success,
        p.Groups["optv"].Value);
}


HTH,
greetings
 
BMermuys said:
You probably don't want (\S)+, which repeats a one-character group and so captures every character separately; use (\S+) instead, which captures the whole word.
For the parameter name and the parameter type you want anything except whitespace or a comma: [^\s,]
For the optional default value you do want to allow spaces, but not a comma or ')': [^,\)]

With some minor changes the regex works again:

^(Public\s)?(Declare\s|Static\s)?(Function)\s(\S+)\(((Optional\s)?(ByVal\s|ByRef\s)?(ParamArray\s)?([^\s,]+)(\sAs\s([^\s,]+))?(\s=\s([^,\)]+))?(\,\s)?)*\)(\sAs\s(\S+))?

It works! Thank you! :)
 