Regular expression - misbehaviour

  • Thread starter: Mark

Mark

Hi,

Why oh why doesn't this work the way it does in every other language I
have used regexes with?
--------
Regex r = new Regex("[A-Z]{2}");
bool test = r.IsMatch("ACD");
--------

The boolean 'test' is true, even though the {2} quantifier
specifies 'exactly two'. Why is this?

Perl/Java performs this match correctly, i.e. it returns false. Is
this a bad implementation of regexes? Or am I missing
something?

Mark
 
I disagree. I don't know Perl, but I would be horrified if Java didn't
match this as well.

Yes, your expression states that you want to match exactly two
uppercase letters, but remember that the pattern matcher returns true
if the pattern occurs _anywhere within_ the input string. So, your test
matches two places in the string: the "AC" of "ACD", and the "CD" of
"ACD", so of course it returns true.

To state that you want the pattern to have to match the whole string,
you need to say,

Regex r = new Regex("^[A-Z]{2}$");

do you not?
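
For comparison, here is a minimal sketch of the same distinction in Java, assuming java.util.regex and Matcher.find(), which searches for the pattern anywhere in the input. The class and method names (AnchorDemo, containsPair, isExactlyPair) are made up for illustration and do not come from the original post.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class AnchorDemo {
    // Unanchored: succeeds if two consecutive uppercase letters occur anywhere.
    static final Pattern ANYWHERE = Pattern.compile("[A-Z]{2}");
    // Anchored: the two uppercase letters must span the whole input.
    static final Pattern WHOLE = Pattern.compile("^[A-Z]{2}$");

    static boolean containsPair(String s) {
        return ANYWHERE.matcher(s).find();
    }

    static boolean isExactlyPair(String s) {
        return WHOLE.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsPair("ACD"));   // true: "AC" is found inside "ACD"
        System.out.println(isExactlyPair("ACD"));  // false: three letters, not two
        System.out.println(isExactlyPair("AC"));   // true
    }
}
```

One caveat: by default $ also matches just before a trailing line terminator, so if that matters the stricter \z end anchor may be the safer choice.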
 
Thanks for this. Java 1.4's Pattern object compiles the
expression, and the Matcher object then seems to treat it as a
whole-string match, i.e. it behaves as if the ^ and $ were there
and tests against the entire input regardless. Not sure why
this is, though.

I agree that including ^ and $ is standard regex stuff, and
with them C# works OK.
 
After a long weekend of working with regular expressions and wanting to take
a bottle of pills by the end, I ran across "The Regulator" - you put in your
criteria and it spits out the regular expression.

http://regex.osherove.com/

Seems to me it's very easy to make mistakes with regular expressions
because they are so non-intuitive.
 
That's odd. How would you find out if a pattern matches any part of a
string using Java's pattern matching routines, then? Do you have to put
.* before and after the pattern?
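
For what it's worth, a minimal sketch of how this looks in Java, assuming the standard java.util.regex API: Matcher.matches() requires the pattern to cover the entire input, while Matcher.find() looks for it in any part of the string, so wrapping the pattern in .* should not be necessary. The class and method names (FindVsMatches, wholeInput, anyPart, wrapped) are invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class FindVsMatches {
    static final Pattern PAIR = Pattern.compile("[A-Z]{2}");
    static final Pattern WRAPPED = Pattern.compile(".*[A-Z]{2}.*");

    // matches(): the pattern must consume the entire input.
    static boolean wholeInput(String s) {
        return PAIR.matcher(s).matches();
    }

    // find(): the pattern may match any substring of the input.
    static boolean anyPart(String s) {
        return PAIR.matcher(s).find();
    }

    // The ".*" workaround: matches() on a pattern padded with ".*" on both sides.
    static boolean wrapped(String s) {
        return WRAPPED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(wholeInput("ACD")); // false: three letters cannot be matched by {2}
        System.out.println(anyPart("ACD"));    // true: "AC" occurs inside the input
        System.out.println(wrapped("ACD"));    // true, same answer as find() here
    }
}
```

The padded version works for single-line input, but . does not match line terminators unless Pattern.DOTALL is set, so find() is probably the more direct way to ask "does this occur anywhere?".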
 