Why does this regular expression match?

Tony Johansson

Hi!

Here I say minimum 1 o and maximum three o's, but here I have more than three o's
and this expression gives true, but
according to me it should give false?

bool status = Regex.IsMatch("foooooood", "o{1,3}");

//Tony
 
Willem van Rumpt

Because your criteria has multiple matches:

f[ooo][ooo][o]d
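
To see those three groups directly, here is a minimal sketch that lists every match with its start index. The class and method names (`MatchListing`, `Describe`) are made up for illustration, and the input is built with seven o's, as the thread describes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchListing
{
    // Returns each match of the pattern as "index:value", in scan order.
    public static string[] Describe(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        var result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            result[i] = matches[i].Index + ":" + matches[i].Value;
        }
        return result;
    }

    public static void Main()
    {
        // "f", seven o's, "d" (assumed to be the string from the question).
        string sample = "f" + new string('o', 7) + "d";

        // IsMatch only asks whether at least one match exists.
        Console.WriteLine(Regex.IsMatch(sample, "o{1,3}")); // True

        // Matches reports every non-overlapping match, left to right.
        foreach (string m in Describe(sample, "o{1,3}"))
        {
            Console.WriteLine(m); // 1:ooo, then 4:ooo, then 7:o
        }
    }
}
```

`IsMatch` returns true as soon as the first of these matches is found, which is why the question's call gives true.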
 
Tony Johansson

But according to me, "o{1,3}" means that I must have at least 1 o and a maximum
of three o's.
I mean, what is the purpose of using a max of 3 here then?

//Tony
 
Adhal

Hi Tony.
You have to specify that the previous/next letter can't be an "o".

I am not an expert on regex but this should work

[^o]o{1,3}[^o]
 
Jackie

http://www.regular-expressions.info/reference.html

By the way, a correction for my earlier reply: I said that my example
was the same as "o*?" but it would be the same as "o*" instead.
 
Tony Johansson

Adhal said:
Hi Tony.
You have to specify that the previous/next letter can't be an "o".

I am not an expert on regex but this should work

[^o]o{1,3}[^o]

I don't want a solution; I want to know why this regular expression does not
work as I expect.
According to me, "o{1,3}" means that I must have at least 1 o and a maximum
of three o's.
I mean, what is the purpose of using a max of 3 here then?

//Tony
 
Adhal

Tony Johansson said:
I mean, what is the purpose of using a max of 3 here then?

I think you misunderstood how to use it. It means that if it finds any string that contains 1 to 3
"o" it should return a match,
but then it continues to search the rest of the string to see if there are any other matches. You
have to specify that the next letter can't be an "o". And my previous post is wrong.

It should be

o{1,3}[^o]*

and not

[^o]o{1,3}[^o]

as it's pointless. ^_^ The first [^o] is redundant.

You add the "*" (zero or more times) at the end to specify that it doesn't have to have another
character afterwards.
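
A quick check of the two patterns from this post, as a sketch rather than a definitive test. The class and method names (`NeighborCheck`, `Test`) are hypothetical, and the sample assumes seven o's. Because `[^o]*` is allowed to match zero characters, `o{1,3}[^o]*` still appears to match the long run, while `[^o]o{1,3}[^o]` rejects it but also rejects inputs whose o's sit at the very start or end of the string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NeighborCheck
{
    // Thin wrapper so the calls below read like a table of results.
    public static bool Test(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // "f", seven o's, "d" (assumed to be the string from the question).
        string sample = "f" + new string('o', 7) + "d";

        // [^o]* may match nothing, so "ooo" alone satisfies the pattern.
        Console.WriteLine(Test(sample, "o{1,3}[^o]*"));     // True

        // Requires a non-o character on both sides of the run.
        Console.WriteLine(Test(sample, "[^o]o{1,3}[^o]"));  // False
        Console.WriteLine(Test("food", "[^o]o{1,3}[^o]"));  // True

        // No character follows the o's, so the trailing [^o] cannot match.
        Console.WriteLine(Test("Zoo", "[^o]o{1,3}[^o]"));   // False
    }
}
```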
 
Adhal

Tony Johansson said:
I don't want a solution; I want to know why this regular expression does not
work as I expect.
According to me, "o{1,3}" means that I must have at least 1 o and a maximum
of three o's.
I mean, what is the purpose of using a max of 3 here then?

//Tony

I just corrected myself before reading this post. Check out the other post. But basically, when it
finds a match it starts looking for the next match. You did not specify that the next character can't be
an "o". You have to be more specific.

So it found the first match and then it started again to find the next match. Both matches are valid.

Does that make sense?
 
Alberto Poblacion

Tony Johansson said:
I don't want a solution; I want to know why this regular expression does
not work as I expect.
But according to me, "o{1,3}" means that I must have at least 1 o and a
maximum of three o's.

No, that's not correct. "o{1,3}" means "somewhere in the string there
must be a sequence consisting of at least one and at most three o's. The
rest of the string may be anything (including more o's)".

Tony Johansson said:
I mean, what is the purpose of using a max of 3 here then?

In this particular instance, the max is useless. But it can be useful
when there are more things in the expression. For instance, "Ao{1,3}B" would
match "AoooB", but it would not match "AooooB".
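
A minimal sketch of that example, with hypothetical names (`AnchoredRun`, `Check`). The literal A and B on either side are what make the upper bound matter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchoredRun
{
    // True when the input contains A, then 1 to 3 o's, then B.
    public static bool Check(string input)
    {
        return Regex.IsMatch(input, "Ao{1,3}B");
    }

    public static void Main()
    {
        Console.WriteLine(Check("AoooB"));  // True: three o's between A and B
        Console.WriteLine(Check("AooooB")); // False: four o's separate A from B
    }
}
```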
 
Willem van Rumpt

Tony said:
But according to me, "o{1,3}" means that I must have at least 1 o and a maximum
of three o's.
I mean, what is the purpose of using a max of 3 here then?

//Tony

The purpose of three is "find me a sequence of 'o' characters, at least
1, and at most three long". And that's what it does. You never instruct
the regex to stop searching beyond the first match, so it matches
everything it can find. You also don't tell it that it has to be a
sequence of "o" embedded within non "o"'s (and even then: should
"fooodod" result in two matches?).

I think the regex will look something like

[!^o][o{1,3}][!^o]

but I'm almost totally incompetent with regexes (I usually use an online
reference and regex tester to evaluate what I get). Your best bets for
a really appropriate answer are reading up on the subject or waiting until
a regex guru answers you.
 
Tony Johansson

As you say, the purpose of three is "find me a sequence of 'o' characters,
at least 1, and at most three long".

So can I say that the expression means: find at least one 'o' and at most three
'o's anywhere in the string?
It could be at the very beginning, in the middle, or at the very end; what the
other characters are doesn't matter.

//Tony
 
Jeff Johnson

Tony Johansson said:
Here I say minimum 1 o and maximum three o's, but here I have more than three
o's and this expression gives true, but
according to me it should give false?

bool status = Regex.IsMatch("foooooood", "o{1,3}");

The main problem is that you're not understanding how the {m,n} quantifier
works. Specifically, you're reading too much into its abilities. You
shouldn't think of {m,n} as meaning "AT LEAST m occurrences and AT MOST n
occurrences" but rather "FROM m TO n occurrences." The difference is subtle
but important. The first phrase suggests that there must be some limitation
of the letter "o" in your regex above, which is how you were interpreting
it. The second just says "as long as I can find anywhere from 1 to 3 o's,
I'm happy." And the regex engine can definitely find from 1 to 3 o's. In
fact, it can do it three times: 2 sets of 3 and 1 single o.

In order to do what you thought it would do, your best bet is to use an
advanced regex technique called lookaround. Lookaround allows you to look at
characters before your intended match or after it and require that a pattern
either be present or absent. Requiring a pattern to be present is called
positive lookaround, while requiring it to be absent is negative lookaround.
Actually, you don't usually say "lookaround" but rather you indicate the
direction, so you can have positive or negative lookbehind (searching for a
pattern before your match pattern) and positive and negative lookahead
(searching for a pattern after your match pattern).

To get a run of 1 to 3 o's and no more, you'd need to both look behind and look ahead to
make sure that there isn't another o either before or after your pattern. The
regex would look like this:

(?<!o)o{1,3}(?!o)

This WILL find matches in the following input:

fod
food
foood
Zoo
to
oops
hotfoot

It will NOT find matches in this input:

foooooood
Zooool
ooooooh! aaaaah!
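
The two lists above can be checked with a short sketch like the following. The class, constant, and method names (`LookaroundDemo`, `Pattern`, `HasShortRun`) are made up for illustration; the pattern and the inputs are the ones from this post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // 1 to 3 o's that are neither preceded nor followed by another o.
    public const string Pattern = @"(?<!o)o{1,3}(?!o)";

    public static bool HasShortRun(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    public static void Main()
    {
        string[] expectedMatches = { "fod", "food", "foood", "Zoo", "to", "oops", "hotfoot" };
        string[] expectedMisses = { "foooooood", "Zooool", "ooooooh! aaaaah!" };

        foreach (string s in expectedMatches)
        {
            Console.WriteLine(s + ": " + HasShortRun(s)); // True for each
        }
        foreach (string s in expectedMisses)
        {
            Console.WriteLine(s + ": " + HasShortRun(s)); // False for each
        }
    }
}
```

Unlike `[^o]`, the lookarounds do not consume a character, so runs at the very start or end of the string ("oops", "Zoo") still match.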
 
Willem van Rumpt

Tony said:
As you say, the purpose of three is "find me a sequence of 'o' characters,
at least 1, and at most three long".

So can I say that the expression means: find at least one 'o' and at most three
'o's anywhere in the string?
It could be at the very beginning, in the middle, or at the very end; what the
other characters are doesn't matter.

//Tony

Correct.
Also note that it tries to match as much as it can, thus returning 3
matches in "foooooood" (7 'o's), instead of 7 matches.
 
Jeff Johnson

Willem van Rumpt said:
Correct.
Also note that it tries to match as much as it can, thus returning 3
matches in "foooooood" (7 'o's), instead of 7 matches.

For completeness, this behavior can be altered by adding "?" after the
quantifier to make it lazy:

{1,3}?

Then there would have been seven matches of single o's.
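
A small sketch comparing the two counts. The names (`GreedyVsLazy`, `CountMatches`) are hypothetical, and the input is built with seven o's as described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Number of non-overlapping matches found scanning left to right.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        // "f", seven o's, "d".
        string sample = "f" + new string('o', 7) + "d";

        // Greedy: each match takes as many o's as allowed (3, 3, then 1).
        Console.WriteLine(CountMatches(sample, "o{1,3}"));  // 3

        // Lazy: each match stops at the minimum of one o.
        Console.WriteLine(CountMatches(sample, "o{1,3}?")); // 7
    }
}
```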
 
Harlan Messinger

Tony said:
But according to me, "o{1,3}" means that I must have at least 1 o and a maximum
of three o's.

It means that your string must have a sequence of between 1 and 3 o's.
Even a string of 500 o's *includes a sequence of between 1 and 3 o's*.
The pattern "o{1,3}" means "match 1 through 3 o's", not "match 1 through
3 o's unless they are followed by another o".

Tony said:
I mean, what is the purpose of using a max of 3 here then?

In your example it doesn't serve any purpose. It serves a purpose if
it's followed by something. If your pattern had been "o{1,3}d", then it
would match "fod", "food", and "foood" but not "fooooooood".
 
Harlan Messinger

Harlan said:
In your example it doesn't serve any purpose. It serves a purpose if
it's followed by something. If your pattern had been "o{1,3}d", then it
would match "fod", "food", and "foood" but not "fooooooood".

Forget what I just said. It wouldn't make any difference there either.
For the maximum value to make a difference, there has to be something
else both before AND after the quantified item in the pattern. Pretend
I'd written "fo{1,3}d".
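
A sketch of the difference between the two patterns, using hypothetical names (`BoundedBothSides`, `Test`) and a sample built with seven o's. With only a trailing "d", the engine can still start the match at the last three o's; with "f" in front as well, the whole run has to fit within the limit.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundedBothSides
{
    public static bool Test(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // "f", seven o's, "d".
        string longRun = "f" + new string('o', 7) + "d";

        // Matches the final "oood"; nothing forbids earlier o's.
        Console.WriteLine(Test(longRun, "o{1,3}d"));  // True

        // f must be followed by 1 to 3 o's and then d, which fails here.
        Console.WriteLine(Test(longRun, "fo{1,3}d")); // False
        Console.WriteLine(Test("foood", "fo{1,3}d")); // True
    }
}
```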
 
Jeff Johnson

Harlan Messinger said:
Forget what I just said. It wouldn't make any difference there either. For
the maximum value to make a difference, there has to be something else
both before AND after the quantified item in the pattern. Pretend I'd
written "fo{1,3}d".

It does make a difference, and in certain circumstances (no examples come
immediately to mind) you might want it: This pattern creates multiple
matches with each match being at most three o's.
 
Harlan Messinger

Jeff said:
It does make a difference, and in certain circumstances (no examples come
immediately to mind) you might want it: This pattern creates multiple
matches with each match being at most three o's.

Ah, you're right, in the context of a scan for *all* matching
substrings. I was thinking from the perspective of testing whether the
string has *a* matching substring.
 
Jeff Johnson

Harlan Messinger said:
Ah, you're right, in the context of a scan for *all* matching substrings.
I was thinking from the perspective of testing whether the string has *a*
matching substring.

And in that context you are absolutely right.
 
