More regular expression woes

Mark Rae

Hi,

This time, I'm looking for a regular expression which says "the string must
contain exactly seven or exactly eight digits" e.g.

123456 fails
1234567 passes
12345678 passes
123456789 fails

I've tried this:

\d{7,8}

but that allows 123456789 to pass, presumably because it contains a string
of seven or eight digits...

Is there any way to specify a fixed length to validate?

Any assistance gratefully received.

Mark
 
Oliver Sturm

Hello Mark,
I've tried this:

\d{7,8}

but that allows 123456789 to pass, presumably because it contains a string
of seven or eight digits...

Right. To do what you want you'll have to use a delimiter surrounding the
expression. ^ (start of line) and $ (end of line) could do, like this:

^\d{7,8}$
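
To make the difference concrete, here is a minimal C# sketch that runs the unanchored and the anchored pattern over the four sample inputs from the question. The class and method names (AnchorDemo, ContainsSevenOrEightDigits, IsSevenOrEightDigits) are made up for illustration and are not from the thread.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class AnchorDemo
{
    // Unanchored: succeeds if ANY run of 7 or 8 digits occurs in the input,
    // which is why "123456789" slips through.
    public static bool ContainsSevenOrEightDigits(string s)
    {
        return Regex.IsMatch(s, @"\d{7,8}");
    }

    // Anchored: the input as a whole must consist of 7 or 8 digits.
    public static bool IsSevenOrEightDigits(string s)
    {
        return Regex.IsMatch(s, @"^\d{7,8}$");
    }
}
```

With these, IsSevenOrEightDigits rejects "123456" and "123456789" and accepts "1234567" and "12345678", while ContainsSevenOrEightDigits also accepts "123456789". One thing to watch for: in .NET, $ also matches just before a trailing newline, so "1234567\n" passes the anchored check too. If that matters, \z in place of $ matches only at the very end of the string.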

Of course, if you have strings that contain nothing but that number you're
looking at, it may be considerably more performant to look at the length
of the string combined with a simpler check that the characters are all
digits (leaving out regular expressions entirely).
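
A rough sketch of that regex-free check, assuming "digit" means the ASCII characters 0 to 9; the class and method names are invented for this example.

```csharp
// Hypothetical helper class, for illustration only.
public static class DigitCheck
{
    // True when s is exactly 7 or 8 characters long and every
    // character is an ASCII digit ('0' through '9').
    public static bool IsSevenOrEightDigits(string s)
    {
        if (s == null || (s.Length != 7 && s.Length != 8))
        {
            return false;
        }

        foreach (char c in s)
        {
            if (c < '0' || c > '9')
            {
                return false;
            }
        }

        return true;
    }
}
```

The explicit range comparison is deliberate: char.IsDigit also accepts decimal digits from other scripts, as does \d in .NET unless RegexOptions.ECMAScript is set, which may or may not be what you want.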

Another delimiter that could be useful to you would be \b, which denotes a
"word boundary". For the exact definition you'd best look at the regular
expression docs on MSDN.
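
As an illustration of where \b helps, the following sketch pulls a standalone 7- or 8-digit number out of longer text. The class name, method name and sample strings are assumptions for the example only.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class WordBoundaryDemo
{
    private static readonly Regex Pattern = new Regex(@"\b\d{7,8}\b");

    // Returns the first run of 7 or 8 digits that has a word boundary
    // on both sides, or null if the text contains none.
    public static string FindNumber(string text)
    {
        Match m = Pattern.Match(text);
        return m.Success ? m.Value : null;
    }
}
```

FindNumber("Order 1234567 shipped") returns "1234567", while "123456789" and "ref1234567" yield null: letters, digits and the underscore all count as word characters, so there is no boundary between "f" and "1". Punctuation does create a boundary, so "12345678-9" still yields "12345678".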

Yet another idea could be to delimit by characters that are not digits -
this makes sense and can be very helpful in more complex situations. Like
this:

[^\d]\d{7,8}[^\d]
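
One caveat with this form is that [^\d] has to consume a real character on each side, so it cannot match a number at the very start or end of the input, and the delimiters become part of the match. A possible alternative, not mentioned in the thread, is to use lookarounds, which only assert that no digit is adjacent. The sketch below compares the two; the class and field names are made up.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class DelimiterDemo
{
    // Pattern from the post: consumes one non-digit character on each side.
    public static readonly Regex Consuming =
        new Regex(@"[^\d]\d{7,8}[^\d]");

    // Lookaround variant (an assumption of this sketch, not from the thread):
    // asserts "no digit before" and "no digit after" without consuming anything.
    public static readonly Regex Lookaround =
        new Regex(@"(?<!\d)\d{7,8}(?!\d)");
}
```

For the input "x1234567y", Consuming matches "x1234567y" whereas Lookaround matches just "1234567". For the bare input "1234567", only Lookaround matches. Both reject "123456789".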



Oliver Sturm
 
Brad Prendergast

Regex r;
r = new Regex("^\\d{7,8}$");
 
Mark Rae

Right. To do what you want you'll have to use a delimiter surrounding the
expression. ^ (start of line) and $ (end of line) could do, like this:

^\d{7,8}$

Supoib!

Of course, if you have strings that contain nothing but that number you're
looking at, it may be considerably more performant to look at the length
of the string combined with a simpler check that the characters are all
digits (leaving out regular expressions entirely).

Of course you're right about that. I'm finally (and I mean after nearly 20
years of programming!) trying to actually learn regular expressions
properly...

I'm using this: http://www.ultrapico.com/Expresso.htm which seems about the
best I've seen so far...

Another delimiter that could be useful to you would be \b, which denotes a
"word boundary". For the exact definition you'd best look at the regular
expression docs on MSDN.

Cool - thanks.

Yet another idea could be to delimit by characters that are not digits -
this makes sense and can be very helpful in more complex situations. Like
this:

[^\d]\d{7,8}[^\d]

Excellent - thanks again.
 
Oliver Sturm

Hello Mark,
Of course you're right about that. I'm finally (and I mean after nearly 20
years of programming!) trying to actually learn regular expressions
properly...

A very good idea, if you ask me :) Of course, learning the right
situations where to use them is just as important :) I think the context
of your use case is probably very important to make that decision.

I'm using this: http://www.ultrapico.com/Expresso.htm which seems about
the best I've seen so far...

There's also Regulator (http://sourceforge.net/projects/regulator/) which
is very good (better than Expresso IMO), but I've had pretty bad problems
in the past with the editor component used in that program and the author
Roy Osherove hasn't been responsive at all when I sent him bug reports.


Oliver Sturm
 
Laurent Bugnion [MVP]

Hi,

Oliver said:
Hello Mark,


A very good idea, if you ask me :) Of course, learning the right
situations where to use them is just as important :) I think the
context of your use case is probably very important to make that decision.


There's also Regulator (http://sourceforge.net/projects/regulator/)
which is very good (better than Expresso IMO), but I've had pretty bad
problems in the past with the editor component used in that program and
the author Roy Osherove hasn't been responsive at all when I sent him
bug reports.


Oliver Sturm

I like Regulator too. Didn't have issues so far (but I don't use it very
intensively).

Greetings,
Laurent
 
Oliver Sturm

Hello Laurent,
I like Regulator too. Didn't have issues so far (but I don't use it very
intensively).

The issues I mean are related to two things: (a) Changing regex options
while there's already an expression in the editor. This often makes sudden
changes to the expression for no apparent reason. (b) Cutting and pasting
text from/to the editor window. This makes changes to the text in places
that shouldn't be affected by the operation.

Not sure if there were more... I haven't used it recently and it's been a
while since I tried reporting the problems I saw. If somebody's
interested, I can dig up my original email(s?) and see what other
information I might have.


Oliver Sturm
 
John Kn [MS]

Brad,

Try this:

public static Regex regex = new Regex(
    @"\b\d{7}\b|\b\d{8}\b",
    RegexOptions.IgnoreCase
    | RegexOptions.CultureInvariant
    | RegexOptions.IgnorePatternWhitespace
    | RegexOptions.Compiled
);

Do not let the \b expressions throw you. Additionally, you may want to
download Expresso from http://www.ultrapico.com/. It is the best regex
tutorial/utility I have found over the years.

Cheers,

johnKn [MS-SDK]

-Please do not send email directly to this alias. This alias is for
newsgroup purposes only

-This posting is provided "AS IS" with no warranties, and confers no
rights.
 
Oliver Sturm

Hello John,
@"\b\d{7}\b|\b\d{8}\b",

An interesting additional idea, although obviously not really flexible - I
would opt for the flexibility of \d{7,8} as long as I can make it work.
Furthermore, your example could be less confusing, IMO, as

\b(\d{7}|\d{8})\b

Or of course, as

\b\d{7,8}\b

which brings us back to what has already been mentioned. Maybe I'm missing
the point of your post?


Oliver Sturm
 
John Kn [MS]

Oliver,

\b\d{7,8}\b works for me.

Thanks,

John
 