Best approach to validating "set" property ..

Anders Borum

Hello!

I have a class that needs to validate the input value, when a programmer
changes a specific property on a class. The input should only accept the
following pattern [a-zA-Z0-9]{1,n} (alpha-numeric with at least one entry).

I'm a little resistant to implementing a regular expression validation,
because of the overhead involved (not because I can't :). Obviously, it's
possible to implement the pattern validation using regular string ops ..

What should I choose? I imagine the access frequency is rather low, but I
don't want to go the wrong direction - also in terms of future API
development.

// Class definition omitted for brevity..

/// <summary>
/// Gets or sets the Area name. The Area name must be a fully
/// qualified ASP.NET control id in the form of [a-zA-Z0-9].
/// </summary>
public string AreaName
{
    get { return this.areaName; }
    set
    {
        // include validation here ..
        // RegEx or string-ops?
        this.areaName = value;
    }
}

Thanks in advance!
 
Jon Skeet [C# MVP]

Anders Borum said:
I have a class that needs to validate the input value, when a programmer
changes a specific property on a class. The input should only accept the
following pattern [a-zA-Z0-9]{1,n} (alpha-numeric with at least one entry).

I'm a little resistant to implementing a regular expression validation,
because of the overhead involved (not because I can't :). Obviously, it's
possible to implement the pattern validation using regular string ops ..

What should I choose? I imagine the access frequency is rather low, but I
don't want to go the wrong direction - also in terms of future API
development.

If the regular expression is simple to read, but the "hand-crafted"
code wouldn't be very simple, go with the regular expression until you
have evidence that it's too slow. If you end up with a monster
regular expression that can easily be expressed in actual code, leave
it as code.
 
Anders Borum

Hello Jon

(and thanks for the quick reply)

I was thinking of the RegEx approach because of the clean code (and
incredible flexibility in terms of changing the RegEx pattern), compared to
the more verbose (and harder to read) string ops.

As I said, the property is unlikely to be used extensively, so the RegEx is
probably the best approach at the moment. Thanks for your thoughts.
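
For reference, the RegEx approach discussed above could be sketched roughly as follows. This is only an illustration under assumptions: the class name Area and the choice of exceptions are not from the thread (the class definition was omitted in the original post), and the pattern is the [a-zA-Z0-9] one from the question, anchored with \z so that a trailing newline is rejected as well.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical class name; the original class definition was omitted.
public class Area
{
    // Built once and reused by every call to the setter.
    // Pattern from the question: one or more characters from [a-zA-Z0-9].
    private static readonly Regex AreaNamePattern =
        new Regex(@"^[a-zA-Z0-9]+\z");

    private string areaName;

    public string AreaName
    {
        get { return this.areaName; }
        set
        {
            // Assumption: invalid input is reported via exceptions.
            if (value == null)
            {
                throw new ArgumentNullException("value");
            }
            if (!AreaNamePattern.IsMatch(value))
            {
                throw new ArgumentException(
                    "Area name must be one or more characters from [a-zA-Z0-9].",
                    "value");
            }
            this.areaName = value;
        }
    }
}
```

Keeping the Regex in a static field means the pattern is parsed once per process rather than on every assignment, which may matter if the property later turns out to be set more often than expected.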
 
David Browne

Anders Borum said:
Hello Jon

(and thanks for the quick reply)

I was thinking of the RegEx approach because of the clean code (and
incredible flexibility in terms of changing the RegEx pattern), compared to
the more verbose (and harder to read) string ops.

This one is easy.

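private static bool IsAlphaNumeric(string s)
{
    // The pattern requires at least one character.
    if (s.Length == 0)
    {
        return false;
    }

    foreach (char c in s)
    {
        // Reject anything outside 0-9, a-z and A-Z.
        if (!((c >= '0' && c <= '9') ||
              (c >= 'a' && c <= 'z') ||
              (c >= 'A' && c <= 'Z')))
        {
            return false;
        }
    }
    return true;
}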

And much faster than a regex, which can be important when you are designing a
library and you don't know how often each method will be executed.

David
 
Anders Borum

Hello David

Actually, as I indicated the property is unlikely to be used extensively,
but I'm naturally always on the lookout for the best approach.

If you examine the following RegEx, you'll see that the length needs to be
one or more, and always start with one of the following chars "a-z_", then
followed by zero or more alpha-numeric chars (including underscore).

if (Regex.IsMatch(value, @"^[a-z_]\w*$", RegexOptions.IgnoreCase) == false)

How about that one? ;-)
 
Anders Borum

Hello Jon

In trying to develop strong coding skills (and patterns), I'm asking for a
constructive discussion on the following two code examples with equal
functionality, but different implementation.

If you were a programmer looking over the API, would you find it difficult
to understand the second example? I'm not sure if the output IL would yield
a big performance difference (comments please).

Which would you choose?

/// <summary>
/// Determines if the Area has any child Areas. The disconnected state
/// does not reflect changes to the persisted Area (e.g. if other users have
/// changed the child Area after the current object was initialized).
/// </summary>
public bool HasAreas
{
    get
    {
        bool retVal;

        // Collection not cached (initialized from db)
        if (areas == null)
        {
            // Use the Area reference count from db
            retVal = (areaRefs > 0);
        }
        else
        {
            // Use cached collection (previously initialized from db)
            retVal = (areas.Count > 0);
        }

        return retVal;
    }
}

In contrast to this ..

/// <summary>
/// Determines if the Area has any child Areas. The disconnected state
/// does not reflect changes to the persisted Area (e.g. if other users have
/// changed the child Area after the current object was initialized).
/// </summary>
public bool HasAreas
{
    get { return (this.areas == null) ? (this.areaRefs > 0) : (this.areas.Count > 0); }
}

Again - thank you!
 
David Browne

Anders Borum said:
Hello David

Actually, as I indicated the property is unlikely to be used extensively,
but I'm naturally always on the lookout for the best approach.

If you examine the following RegEx, you'll see that the length needs to be
one or more, and always start with one of the following chars "a-z_", then
followed by zero or more alpha-numeric chars (including underscore).

if (Regex.IsMatch(value, @"^[a-z_]\w*$", RegexOptions.IgnoreCase) == false)

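private static bool IsAlphaNumeric(string s)
{
    // The pattern requires at least one character.
    if (s.Length == 0)
    {
        return false;
    }

    // First character: a letter or underscore (the regex uses IgnoreCase).
    char c = s[0];
    if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_'))
    {
        return false;
    }

    // Remaining characters: letters, digits or underscore.
    for (int i = 1; i < s.Length; i++)
    {
        c = s[i];
        if (!((c >= '0' && c <= '9') ||
              (c >= 'a' && c <= 'z') ||
              (c >= 'A' && c <= 'Z') ||
              c == '_'))
        {
            return false;
        }
    }
    return true;
}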

But notice how this program "looks" like the regex. A Regex is just a
recipe for a program. In fact a regex IS a program. You just have to
decide whether to implement it in "regex" or in C#.

For trivial programs, stick with C#. It's faster and you need a compelling
reason to mix programming languages in a project. The more complicated the
pattern, the better off you are with a Regex. But on the other hand, regexes
lack any mechanism for encapsulation, so as the regex gets even more
complicated, hand coding can be simpler.

David
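
To illustrate the point that a regex and a hand-written loop are two spellings of the same program, here is a rough sketch that puts both side by side and checks that they agree on some sample inputs. The class and method names (IdentifierCheck, ByRegex, ByHand, Agree) are made up for this example. One assumption to note: the regex here spells out [A-Za-z0-9_] instead of using \w, because in .NET \w also matches non-ASCII word characters unless RegexOptions.ECMAScript is specified, so the \w version and the ASCII-only loop would not be strictly equivalent.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; names are illustrative only.
public static class IdentifierCheck
{
    // Letter or underscore first, then letters, digits or underscores.
    // \z anchors at the very end, so a trailing newline is rejected.
    private static readonly Regex Pattern =
        new Regex(@"^[A-Za-z_][A-Za-z0-9_]*\z");

    public static bool ByRegex(string s)
    {
        return s != null && Pattern.IsMatch(s);
    }

    public static bool ByHand(string s)
    {
        if (string.IsNullOrEmpty(s))
        {
            return false;
        }
        if (!IsLetterOrUnderscore(s[0]))
        {
            return false;
        }
        for (int i = 1; i < s.Length; i++)
        {
            char c = s[i];
            if (!(IsLetterOrUnderscore(c) || (c >= '0' && c <= '9')))
            {
                return false;
            }
        }
        return true;
    }

    private static bool IsLetterOrUnderscore(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    // True when both implementations give the same answer for every sample.
    public static bool Agree(string[] samples)
    {
        foreach (string s in samples)
        {
            if (ByRegex(s) != ByHand(s))
            {
                return false;
            }
        }
        return true;
    }
}
```

A check like Agree(new[] { "", "_a1", "1abc", "a b" }) is a cheap way to keep the two versions honest if one of them is later swapped for the other on performance grounds.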
 
Jon Skeet [C# MVP]

Anders Borum said:
In trying to develop strong coding skills (and patterns), I'm asking for a
constructive discussion on the following two code examples with equal
functionality, but different implementation.

If you were a programmer looking over the API, would you find it difficult
to understand the second example? I'm not sure if the output IL would yield
a big performance difference (comments please).

Which would you choose?

I would either choose the latter, or the following:

// Is the collection cached (previously initialized from db)?
bool cached = (areas!=null);

return ( (cached && areas.Count > 0) || (!cached && areaRefs > 0) );

Ultimately either is pretty readable - although I'd comment the second
version slightly more, I think.
 
Anders Borum

Hello Jon

Thanks for the efforts so far. It's been quite interesting.

Jon Skeet [C# MVP] said:
I would either choose the latter, or the following:

// Is the collection cached (previously initialized from db)?
bool cached = (areas!=null);

return ( (cached && areas.Count > 0) || (!cached && areaRefs > 0) );

Isn't the following code faster? You're making use of a variable, and I was
under the impression that the "? :" syntax is a highly optimized pattern?

public bool HasAreas
{
    get { return (areas == null) ? (areaRefs > 0) : (areas.Count > 0); }
}

Jon Skeet [C# MVP] said:
Ultimately either is pretty readable - although I'd comment the second
version slightly more, I think.

Yes, I'm a big fan of commenting source code. You never know when (or by whom)
it's going to be reviewed.
 
Jon Skeet [C# MVP]

Anders Borum said:
Isn't the following code faster? You're making use of a variable, and I was
under the impression that the "? :" syntax is a highly optimized pattern?

public bool HasAreas
{
    get { return (areas == null) ? (areaRefs > 0) : (areas.Count > 0); }
}

I would certainly hope that they'd be optimised to roughly the same
code, and I'd want to see *very* definite proof that the optimisation:

a) worked
b) was absolutely necessary

before sacrificing even a *jot* of readability.

Anders Borum said:
Yes, I'm a big fan of commenting source code. You never know when (or by whom)
it's going to be reviewed.

Indeed.
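
Since the thread compares three spellings of HasAreas, a small sketch that confirms they give the same answers may be useful before anyone picks one on readability alone. Everything here is illustrative: the class name HasAreasVariants, the method names, and the parameter types (a List<string> standing in for the cached collection, an int for the reference count) are assumptions, since the original field declarations were not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical harness; the real fields live on the Area class.
public static class HasAreasVariants
{
    // First version from the thread: explicit if/else.
    public static bool IfElse(List<string> areas, int areaRefs)
    {
        bool retVal;
        if (areas == null)
        {
            retVal = (areaRefs > 0);     // not cached: use the count from db
        }
        else
        {
            retVal = (areas.Count > 0);  // cached: use the collection
        }
        return retVal;
    }

    // Second version: the conditional operator.
    public static bool Ternary(List<string> areas, int areaRefs)
    {
        return (areas == null) ? (areaRefs > 0) : (areas.Count > 0);
    }

    // Jon's alternative: a named flag combined with && and ||.
    public static bool BoolFlag(List<string> areas, int areaRefs)
    {
        bool cached = (areas != null);
        return (cached && areas.Count > 0) || (!cached && areaRefs > 0);
    }

    // True when all three variants agree on every sample case.
    public static bool AllAgree()
    {
        List<string> empty = new List<string>();
        List<string> one = new List<string>();
        one.Add("child");

        List<string>[] collections = { null, empty, one };
        int[] refCounts = { 0, 3 };

        foreach (List<string> areas in collections)
        {
            foreach (int refs in refCounts)
            {
                bool expected = IfElse(areas, refs);
                if (Ternary(areas, refs) != expected ||
                    BoolFlag(areas, refs) != expected)
                {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With equivalence established, the choice between the variants comes down to the readability argument made above, unless a measurement shows otherwise.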
 
