regular expression NxM

A

Anees

Hi,

I am using the following regular expression to validate a picture size taken
as input (either an integer or NxM, where N and M are integers).

(\d+)|(\d+\s*[Xx]\s*\d+)

This regular expression works as intended in Perl but not in
.NET. Please tell me how to do this in .NET.


Thank you.



Regards,
Anees Haider
 
A

Anees

Anees said:
I am using the following regular expression to validate a picture size taken
as input (either an integer or NxM, where N and M are integers).
(\d+)|(\d+\s*[Xx]\s*\d+)

This regular expression works as intended in Perl but not in
.NET. Please tell me how to do this in .NET.

Can you be more specific about how it doesn't work for you?  It works
fine for me.

Assuming you are actually using literally that expression, the only
thing that comes to mind is incorrect escaping of the backslash
character.  But that would generate a compiler error, which a) a person
would typically mention in their question (especially if they don't
bother to provide any actual code), and b) should not be hard to fix by
oneself.

See below for a short example of your expression in a
concise-but-complete code example that works correctly for me.

If the problem is in fact not related to incorrect escaping of the
backslash character, and you cannot figure out what the problem actually
is by looking at the code example I've provided, please state your
question more clearly and less ambiguously, with specific details about
what exactly is not working.  You should include a concise-but-complete
code example that reliably demonstrates the problem.

Pete

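using System;
using System.Text.RegularExpressions;

namespace TestRegexIntOrIntXInt
{
    class Program
    {
        static void Main(string[] args)
        {
            Regex regex = new Regex(@"(\d+)|(\d+\s*[Xx]\s*\d+)");
            string strInput;

            // Read lines until an empty line or end of input (null).
            while (!string.IsNullOrEmpty(strInput = Console.ReadLine()))
            {
                // Match(...).Success is true when the pattern is found
                // anywhere in the input, not only when it spans all of it.
                Console.WriteLine("Input \"{0}\" is {1}", strInput,
                    regex.Match(strInput).Success ? "valid" : "INVALID");
            }
        }
    }
}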

Thank you for your response. I was using that expression as is in a
RegularExpressionValidator on an ASP.NET page. It appears that the
regular expression OR operator "|" doesn't work in the
RegularExpressionValidator control, so I changed the regular
expression to the following equivalent:

\d+([Xx]\d+)?


Thank you.

Regards,
Anees Haider
 
K

kndg

[...] I was using that expression as is in a
RegularExpressionValidator on an ASP.NET page. It appears that the
regular expression OR operator "|" doesn't work in the
RegularExpressionValidator control, so I changed the regular
expression to the following equivalent:

\d+([Xx]\d+)?

Hi Anees,

At first I thought this was a bug in the RegularExpressionValidator
control (since it uses the same System.Text.RegularExpressions.Regex
class, there is no way it couldn't handle the "|" operator correctly).
But after checking its source code, I found that the problem comes from
the way it handles matches, which interacts badly with your regex
pattern. Here is the relevant portion of the source code:

protected override bool EvaluateIsValid()
{
    // Always succeeds if input is empty or value was not found
    string controlValue = GetControlValidationValue(ControlToValidate);
    Debug.Assert(controlValue != null, "Should have already been checked");
    if (controlValue == null || controlValue.Trim().Length == 0)
    {
        return true;
    }

    try
    {
        // we are looking for an exact match, not just a search hit
        Match m = Regex.Match(controlValue, ValidationExpression);
        return (m.Success && m.Index == 0 && m.Length == controlValue.Length);
    }
    catch
    {
        Debug.Fail("Regex error should have been caught in property setter.");
        return true;
    }
}

Your regex pattern is

(\d+)|(\d+\s*[Xx]\s*\d+)

Because the control uses Regex.Match, passing the input "100x100"
produces a successful match whose value is only "100": the first
alternative, (\d+), is tried first and succeeds. So m.Success and
m.Index == 0 are both true, but m.Length == controlValue.Length is
false. That makes EvaluateIsValid() return false, so the input fails
validation (which is what led you to think that the control does not
handle "|" correctly).

If you modify your regex to include anchors, it will work just fine.

(^\d+$)|(^\d+\s*[Xx]\s*\d+$)
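
To check this outside ASP.NET, here is a minimal console sketch. It is not the control itself: IsExactMatch is a hypothetical helper that only copies the three conditions from the source excerpt above, and the class name is made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class ExactMatchDemo
{
    // Hypothetical helper mirroring the quoted check: the first match must
    // start at index 0 and span the whole input.
    public static bool IsExactMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success && m.Index == 0 && m.Length == input.Length;
    }

    static void Main()
    {
        const string original = @"(\d+)|(\d+\s*[Xx]\s*\d+)";
        const string anchored = @"(^\d+$)|(^\d+\s*[Xx]\s*\d+$)";

        // The first alternative wins, so only the leading digits are matched.
        Console.WriteLine(Regex.Match("100x100", original).Value);  // 100
        Console.WriteLine(IsExactMatch("100x100", original));       // False
        Console.WriteLine(IsExactMatch("100x100", anchored));       // True
        Console.WriteLine(IsExactMatch("100", original));           // True
    }
}
```

With the anchors in place, the first alternative can no longer succeed on just the leading "100", so the engine falls through to the second alternative and the match covers the whole input.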

Regards.
 
A

Anees

Thanks, kndg, for the clarification. But shouldn't the control be written
in such a way that we don't have to go through its code whenever we
use it :), or shouldn't it conform to regular expression
standards, i.e. validate what is represented by the regular expression?
 
K

kndg

Yes, agreed, and I think this is a situation where the MS developers were
trying to be overly helpful. And since the RegularExpressionValidator
control has existed since framework 1.0, I think they are unable to
change the logic now (otherwise it would probably break many web
applications that use it). But the good thing is that you can always
derive your own class from RegularExpressionValidator and override the
EvaluateIsValid() method (for example, to make it fail when the input is
empty, or to make it follow standard regex matching).
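
A rough sketch of such a derived control might look like the following. The class name WholeInputRegexValidator is hypothetical, and wrapping the pattern in \A(?:...)\z is my own choice for illustration, not something the framework does; it assumes the ValidationExpression is an ordinary pattern that can safely sit inside a group. It only changes the server-side check, and it needs a project that references System.Web.

```csharp
using System.Text.RegularExpressions;
using System.Web.UI.WebControls;

// Hypothetical control; not part of ASP.NET.
public class WholeInputRegexValidator : RegularExpressionValidator
{
    protected override bool EvaluateIsValid()
    {
        string value = GetControlValidationValue(ControlToValidate);

        // Unlike the base class, treat empty or missing input as invalid.
        if (value == null || value.Trim().Length == 0)
        {
            return false;
        }

        // Group and anchor the pattern so that every alternative has to
        // cover the whole input, whatever order the alternatives are in.
        string anchored = @"\A(?:" + ValidationExpression + @")\z";
        return Regex.IsMatch(value, anchored);
    }
}
```

With this version the original (\d+)|(\d+\s*[Xx]\s*\d+) would accept "100x100" without any anchors written in the page markup.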

Anyway, since MS has published the .NET Framework source code, I will
take advantage of it whenever I face a problem like this.

Regards.
 
A

Anees

I would argue that there is an implicit requirement that the control do
exactly what it's doing.  The control has no way to know whether your
regex expression must match the entire string or simply satisfy a normal
regex match.

If it's sufficient for the text to simply contain the pattern presented
in your regex expression, then your original expression would be fine.
But if you expect the entire text of the control to match the expression
exactly (as I believe you do in this case), then you need to write the
expression precisely to indicate that.

In other words, the control _is_ "conforming to regular expression
standards".  It's validating the text exactly as you've asked it to, and
your expression does not require that there not be anything else in the
text except what the expression indicates.

To do it otherwise would create a situation in which someone trying to
write a validating expression that simply needed to match some text
within the string, rather than force the entire text to match, would be
unable to accomplish that.  The current design and implementation of the
control is more flexible, and more importantly does not arbitrarily
exclude certain use cases.

Just because _you_ want the expression to validate only if it matches
the entire string, that doesn't mean that _everyone_ needs the control
to work that way.  And it's simple enough to create a proper regex
expression that does what you want.

Pete

If it conforms to the regular expression standard, then why does it work
with one representation and not with another, longer representation of
the same thing? Shouldn't the behavior be consistent?

\d+([Xx]\d+)?
(\d+)|(\d+[Xx]\d+)


Regards,
Anees Haider
 
A

Anees

Thanks, kndg, for the nice and helpful suggestions of inheritance and the
code walk-through. I know they will help me a lot with such issues in the future.

Regards,
Anees Haider
On 7/29/2010 3:41 PM, Anees wrote:

[...] I was using that expression as is in a
RegularExpressionValidator on an ASP.NET page. It appears that the
regular expression OR operator "|" doesn't work in the
RegularExpressionValidator control, so I changed the regular
expression to the following equivalent:

\d+([Xx]\d+)?

 
A

Anees

Anees said:
If it conforms to the regular expression standard, then why does it work
with one representation and not with another, longer representation of
the same thing? Shouldn't the behavior be consistent?
\d+([Xx]\d+)?
(\d+)|(\d+[Xx]\d+)

I have no idea what you mean.  The expression you posted works fine, but
is simply too permissive.  But it's too permissive because that's how
you wrote it.  If you use the expression "kndg" provided, it's not as
permissive (and presumably behaves as you like), because that expression
is the actual correct expression of your intent (no pun intended).

As for the expression language itself, if you are concerned about some
deviation of .NET regex syntax from other regex languages, well for
better or worse, that's just how the world is.  There is no "standard
regex language", so you'll have to learn the nuances found in each
implementation of regex in order to use them effectively.

Pete

What I mean is that the same class of strings is validated by the
following two regexes:

a) \d+([Xx]\d+)?
b) (\d+)|(\d+[Xx]\d+)

Shouldn't the RegularExpressionValidator behavior be consistent?
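
For what it's worth, the difference between a) and b) can be reproduced with plain Regex.Match, without the control. The sketch below is only an illustration: PassesValidator is a hypothetical helper that repeats the exact-match test from the source excerpt quoted earlier in the thread, and the third pattern (b with its alternatives swapped) is added just for comparison.

```csharp
using System;
using System.Text.RegularExpressions;

static class AlternationOrderDemo
{
    // Hypothetical helper: the same exact-match test the control applies
    // to the first match returned by Regex.Match.
    public static bool PassesValidator(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success && m.Index == 0 && m.Length == input.Length;
    }

    static void Main()
    {
        string[] patterns =
        {
            @"\d+([Xx]\d+)?",       // a) optional suffix, matched greedily
            @"(\d+)|(\d+[Xx]\d+)",  // b) short alternative listed first
            @"(\d+[Xx]\d+)|(\d+)",  // b) with the alternatives swapped
        };

        foreach (string pattern in patterns)
        {
            Console.WriteLine("{0}  first match: {1}  valid: {2}",
                pattern,
                Regex.Match("100x100", pattern).Value,
                PassesValidator("100x100", pattern));
        }
    }
}
```

For "100x100", a) and the swapped form both return the whole input as the first match, while b) returns only "100", because alternatives are tried left to right and the first one that succeeds is taken. The three patterns accept the same strings when anchored, but they do not produce the same first match, and the first match is what the control compares against the input length.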
 
