Newbie question about Regex

Big Daddy

I am new at regular expressions and can't get this to work quite right
for some reason. I want to check if a string matches the pattern
"asdf*.zip" where * is a wildcard and could be anything. But it
should start with "asdf" and end with ".zip". For example, this code
snippet shows what I mean:

Regex regex = new Regex(@"something");
bool b;
b = regex.IsMatch("asdf.zip"); // b would be true
b = regex.IsMatch("3asdf.zip"); // b would be false
b = regex.IsMatch("asdf.zip3"); // b would be false
b = regex.IsMatch("asdf!@#$%^&*().zip"); // b would be true
b = regex.IsMatch("asdfwwwwwwwww.zip"); // b would be true

In the first line of code, what would I replace "something" with in
order for the examples to work out?

thanks in advance,
John
 
mp

Big Daddy said:
[...]

I too am a beginner at regex and not at all good at it yet... :-(
The following works for all but the first test, and I don't understand why
that one doesn't work too...
Maybe this will help you get started?
Also, do you know about http://www.regular-expressions.info/tutorial.html?

using System.Diagnostics;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Regex regex = new Regex(@"(asdf).+?\.(zip)");
            bool b;

            // b would be true <<<< but it fails and returns false...
            // don't understand why
            b = regex.IsMatch("asdf.zip");
            Debug.Print(b.ToString());

            b = regex.IsMatch("3asdf.zip"); // b would be false
            Debug.Print(b.ToString());
            b = regex.IsMatch("asdf.zip3"); // b would be false
            Debug.Print(b.ToString());
            b = regex.IsMatch("asdf!@#$%^&*().zip"); // b would be true
            Debug.Print(b.ToString());
            b = regex.IsMatch("asdfwwwwwwwww.zip"); // b would be true
            Debug.Print(b.ToString());
        }
    }
}

my (mis) understanding of this pattern is:
new Regex(@"(asdf).+?\.(zip)" );
(asdf) should match "asdf" and nothing else as the first 4 characters of the
string
.+? should match zero or more of any character
\. should match the dot character
(zip) should match zip

apparently something is wrong with my assumption about .+?
as it does not appear to match <nothing> --- thus the first test fails where
it should succeed....

hopefully one of the regex experts here will chime in with the correct
answer
mark
 
mp

Peter Duniho said:
[...]
my (mis) understanding of this pattern is:
new Regex(@"(asdf).+?\.(zip)" );
(asdf) should match "asdf" and nothing else as the first 4 characters of
the string
.+? should match zero or more of any character

See http://msdn.microsoft.com/en-us/library/az24scfc.aspx#quantifiers

The quantifier +? causes the previous pattern to be matched at least once
(not "zero or more") and to match as few times as possible.

With that in mind, it should be obvious why your pattern fails to match
"asdf.zip". In particular, the + part of the quantifier is requiring at
least one character between "asdf" and ".zip"; since none is present, it
fails to match.

Note that due to that mistake in your expression, you get false positives
in your tests of the expression. That is, you get the correct, expected
return value, but for the wrong reasons. Once you have fixed the
incorrect quantifier, you'll find that "3asdf.zip" and "asdf.zip3" match,
even though they shouldn't.

Note also that it's not necessary to create match groups around the
literal text portions ("asdf" and ".zip"). Even if it were necessary, I
don't see why one would separate the "zip" from the leading period.

Hope that helps.

Pete
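
The difference between the two quantifiers can be checked directly. This is only a minimal sketch: the class name QuantifierDemo and the helper Found are made up for illustration, and the patterns drop the unneeded groups, as suggested above.

```csharp
using System;
using System.Text.RegularExpressions;

static class QuantifierDemo
{
    // True when the pattern is found anywhere in the input
    // (Regex.IsMatch does not anchor by itself).
    public static bool Found(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    static void Main()
    {
        // .+? requires at least one character between "asdf" and ".zip".
        Console.WriteLine(Found(@"asdf.+?\.zip", "asdf.zip"));   // False
        Console.WriteLine(Found(@"asdf.+?\.zip", "asdfx.zip"));  // True

        // .* also accepts zero characters in that position.
        Console.WriteLine(Found(@"asdf.*\.zip", "asdf.zip"));    // True

        // With the quantifier fixed but no anchors, the two inputs that
        // should be rejected now match as well.
        Console.WriteLine(Found(@"asdf.*\.zip", "3asdf.zip"));   // True
        Console.WriteLine(Found(@"asdf.*\.zip", "asdf.zip3"));   // True
    }
}
```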

well, then this appears to work, but probably for the wrong reasons again
:)
Regex regex = new Regex(@"\basdf.*\.zip\b" );
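
A rough check of why \b is not the same as anchoring to the start and end of the string: it only asks for a word boundary, so the pattern can still match in the middle of a longer input. The class name WordBoundaryDemo and the two extra inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class WordBoundaryDemo
{
    public static bool Found(string input)
    {
        return Regex.IsMatch(input, @"\basdf.*\.zip\b");
    }

    static void Main()
    {
        // Inputs from the original question behave as wanted...
        Console.WriteLine(Found("asdf.zip"));     // True
        Console.WriteLine(Found("3asdf.zip"));    // False ('3' and 'a' are both word characters)
        Console.WriteLine(Found("asdf.zip3"));    // False ('p' and '3' are both word characters)

        // ...but a non-word character on either side is enough for a boundary.
        Console.WriteLine(Found("my-asdf.zip"));  // True
        Console.WriteLine(Found("asdf.zip.bak")); // True
    }
}
```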
 
Big Daddy

Peter Duniho said:
I would think that a basic @"^asdf.*\.zip$" would work fine.  What did
you try?

Pete, that worked like a charm. You da man!
thanks
John
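
For reference, a small sketch that runs the anchored pattern against the five inputs from the original question. The class name AnchoredDemo and the method IsAsdfZip are hypothetical names chosen for this example only.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchoredDemo
{
    // ^ and $ tie the match to the start and end of the input.
    static readonly Regex Anchored = new Regex(@"^asdf.*\.zip$");

    public static bool IsAsdfZip(string s)
    {
        return Anchored.IsMatch(s);
    }

    static void Main()
    {
        Console.WriteLine(IsAsdfZip("asdf.zip"));            // True
        Console.WriteLine(IsAsdfZip("3asdf.zip"));           // False
        Console.WriteLine(IsAsdfZip("asdf.zip3"));           // False
        Console.WriteLine(IsAsdfZip("asdf!@#$%^&*().zip"));  // True
        Console.WriteLine(IsAsdfZip("asdfwwwwwwwww.zip"));   // True
    }
}
```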
 
Jeff Johnson

my (mis) understanding of this pattern is:
new Regex(@"(asdf).+?\.(zip)" );
(asdf) should match "asdf" and nothing else as the first 4 characters of
the string
.+? should match zero or more of any character
\. should match the dot character
(zip) should match zip

apparently something is wrong with my assumption about .+?

There's more wrong than that. The first thing you have to understand is that
a RegEx pattern will match ANY part of a string unless you take steps to
make it behave otherwise. So your RegEx would match

asdf1.zip
asdf123.zip

which you want, but it would ALSO match

zzzzzasdf1.zipqqqqq

because you haven't explicitly stated that asdf MUST come at the beginning
and .zip MUST come at the end. This is what the ^ and $ symbols do (although
the definition of "beginning" and "end" can vary based on SingleLine vs.
MultiLine in the global options--whee!).
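
To see which part of the string gets matched, one can look at Match.Value instead of just the boolean. This is a sketch under the assumption that the groups are dropped from the pattern; UnanchoredDemo and FirstMatch are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

static class UnanchoredDemo
{
    // Returns the matched substring, or null when the pattern does not match.
    public static string FirstMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        // Unanchored: the engine finds "asdf1.zip" inside the longer string.
        Console.WriteLine(FirstMatch(@"asdf.+?\.zip", "zzzzzasdf1.zipqqqqq"));
        // prints: asdf1.zip

        // Anchored with ^ and $: the surrounding characters now prevent a match.
        Console.WriteLine(FirstMatch(@"^asdf.+?\.zip$", "zzzzzasdf1.zipqqqqq") ?? "(no match)");
        // prints: (no match)

        // A string that really starts with asdf and ends with .zip still matches.
        Console.WriteLine(FirstMatch(@"^asdf.+?\.zip$", "asdf123.zip"));
        // prints: asdf123.zip
    }
}
```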
 
Arne Vajhøj

Big Daddy said:
[...]

Others have helped you with the Regex.

I would suggest that maybe:

b = Path.GetFileNameWithoutExtension(fnm).StartsWith("asdf") &&
Path.GetExtension(fnm) == ".zip";

would fit better with what you want to do.

Arne
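
A sketch of the Path-based check wrapped in a method, for comparison with the regex. The names PathCheckDemo and IsAsdfZip are made up, and the StringComparison.Ordinal argument is an addition of this example (a plain StartsWith("asdf") uses a culture-sensitive comparison). One behavioral difference from @"^asdf.*\.zip$" is that a leading directory is ignored here.

```csharp
using System;
using System.IO;

static class PathCheckDemo
{
    public static bool IsAsdfZip(string fnm)
    {
        return Path.GetFileNameWithoutExtension(fnm).StartsWith("asdf", StringComparison.Ordinal)
            && Path.GetExtension(fnm) == ".zip";
    }

    static void Main()
    {
        Console.WriteLine(IsAsdfZip("asdf.zip"));          // True
        Console.WriteLine(IsAsdfZip("3asdf.zip"));         // False
        Console.WriteLine(IsAsdfZip("asdf.zip3"));         // False (extension is ".zip3")
        Console.WriteLine(IsAsdfZip("asdfwwwwwwwww.zip")); // True

        // Only the file name part is examined, so a directory prefix is accepted.
        Console.WriteLine(IsAsdfZip("backup/asdf1.zip"));  // True
    }
}
```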
 
Jeff Johnson

[...] (although
the definition of "beginning" and "end" can vary based on SingleLine vs.
MultiLine in the global options--whee!).

Er, not quite. It is correct that RegexOptions.Multiline modifies the
definition of the ^ and $ anchors. But RegexOptions.Singleline has no
effect on that, nor are those two options mutually exclusive (as the
phrase "SingleLine vs. MultiLine" implies).

Okay, so amend that to "based on whether the MultiLine flag is set or not." Come
to think of it, I don't think I've ever used either of those two options
outside of testing their effect while following a tutorial (and then,
obviously, promptly forgetting what they were for).
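
For anyone else who forgets what the two options do, here is a small sketch. MultilineDemo and Test are illustrative names, and the multi-line inputs are invented for the example. The last two lines show a related detail of .NET: without Multiline, $ still matches just before one trailing newline, while \z matches only at the very end.

```csharp
using System;
using System.Text.RegularExpressions;

static class MultilineDemo
{
    const string Pattern = @"^asdf.*\.zip$";

    public static bool Test(string input, RegexOptions options)
    {
        return Regex.IsMatch(input, Pattern, options);
    }

    static void Main()
    {
        // Multiline changes ^ and $ so they also match at line breaks.
        string threeLines = "3\nasdf.zip\n3";
        Console.WriteLine(Test(threeLines, RegexOptions.None));      // False
        Console.WriteLine(Test(threeLines, RegexOptions.Multiline)); // True

        // Singleline leaves ^ and $ alone; it only lets . match a newline.
        string split = "asdf\n.zip";
        Console.WriteLine(Test(split, RegexOptions.None));           // False
        Console.WriteLine(Test(split, RegexOptions.Singleline));     // True

        // $ tolerates a single trailing newline; \A ... \z does not.
        Console.WriteLine(Regex.IsMatch("asdf.zip\n", @"^asdf.*\.zip$"));   // True
        Console.WriteLine(Regex.IsMatch("asdf.zip\n", @"\Aasdf.*\.zip\z")); // False
    }
}
```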
 
Jeff Johnson

I would suggest that maybe:

b = Path.GetFileNameWithoutExtension(fnm).StartsWith("asdf") &&
Path.GetExtension(fnm) == ".zip";

would fit better with what you want to do.

Just to save a few nanoseconds I'd use GetFileName() so that the system
doesn't do the extra work of stripping off the extension, which is
unnecessary when all you're doing is testing the start of the string. And
remember, after BILLIONS of calls, those nanoseconds add up!
 
Arne Vajhøj

Just to save a few nanoseconds I'd use GetFileName() so that the system
doesn't do the extra work of stripping off the extension, which is
unnecessary when all you're doing is testing the start of the string. And
remember, after BILLIONS of calls, those nanoseconds add up!

I am not so worried about those nanoseconds - it does
not sound like something that will be called billions
of times.

But it works fine and it will save typing, so why not.

Arne
 
