Regex bug?

BMermuys

Hi,

string test="dit is een test";
bool bMatch = Regex.IsMatch( test, "^([^wi]|i)*$" );

After executing this, bMatch is false.

The pattern is not optimized and looks a little silly, but the same problem
could arise when a large regular expression is built up from smaller strings.

First there is a character class that matches every character except 'w' and
'i', followed by an alternation with a literal 'i'.

The net result is that only 'w' should fail to match. The input string doesn't
contain a 'w', so the whole pattern should match.
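
The reasoning above says the pattern should behave like the simpler `^[^w]*$`. Here is a minimal sketch for checking that on your own runtime. It assumes the second branch of the alternation is the literal 'i' described in the post, and the class and method names are illustrative, not from the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexRepro
{
    // Illustrative helper: the pattern as described in the post.
    public static bool MatchesOriginal(string input)
    {
        // "Anything except w or i", or a literal i, repeated to the end.
        return Regex.IsMatch(input, "^([^wi]|i)*$");
    }

    // Illustrative helper: the logically equivalent simplified pattern.
    public static bool MatchesSimplified(string input)
    {
        // Any character except w, repeated to the end.
        return Regex.IsMatch(input, "^[^w]*$");
    }

    public static void Main()
    {
        string test = "dit is een test";
        Console.WriteLine("original:   " + MatchesOriginal(test));
        Console.WriteLine("simplified: " + MatchesSimplified(test));
    }
}
```

If the two lines print different values, the engine is not treating the alternation the way the reasoning above expects.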

In Perl this matches.

Can anyone confirm that this is a bug, or am I missing something?

TIA
Greetings
 
Brian Davis

I have seen this problem with alternation before. To get around it, you can
use grouping parentheses for each part of the alternation:

^(([^wi])|(i))*$

Note that this may change your result if you reference unnamed groups, because
the extra parentheses add capture groups. I don't know why it works that way,
but it definitely appears to be a bug in .NET regular expressions.
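
To illustrate the note about unnamed groups: wrapping each alternative in its own parentheses adds capture groups, so code that refers to groups by number will see different indices. This is a small sketch, assuming the second alternative is the literal 'i' from the original question. The class and method names are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupCountDemo
{
    // Number of groups in a pattern, including group 0 (the whole match).
    public static int CountGroups(string pattern)
    {
        return new Regex(pattern).GetGroupNumbers().Length;
    }

    public static void Main()
    {
        // Original pattern: group 0 plus one capturing group.
        Console.WriteLine(CountGroups("^([^wi]|i)*$"));
        // Workaround pattern: group 0 plus three capturing groups.
        Console.WriteLine(CountGroups("^(([^wi])|(i))*$"));
    }
}
```

Anything that reads `match.Groups[1]` still gets the outer group, but the two inner alternatives now occupy groups 2 and 3.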

Brian Davis
www.knowdotnet.com
 
Jhon

Hi,

Yes, I noticed. The odd part is that the workaround only works with capturing
groups; if you wrap the alternatives in non-capturing groups (?:...) instead,
the match fails again.

I hope it gets fixed.

Thanks for the reply.
Greetings




