regex

  • Thread starter: ohmmega

ohmmega

hi,

i've got a simple question (for somebody who already knows the answer)
about regex:
i have a string like bla@bla@bla or bla@@bla.
i'd like to check the @'s, but couldn't figure out how to match zero or
more chars (zero or one was easy).

thanks
rené
 
rene,

Are you sure that a regular expression is the best option here? Why not
just call the Split method, passing the '@' character as the delimiter?
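
A minimal sketch of what the Split approach could look like, assuming the goal is simply to count the '@' characters. The class and method names are made up for illustration:

```csharp
using System;

public static class AtSplitter
{
    // Hypothetical helper: splitting on '@' yields one more piece than there
    // are '@' characters, because empty pieces are kept by default.
    public static int CountAts(string input)
    {
        return input.Split('@').Length - 1;
    }
}
```

Note that Split allocates an array plus one string per piece, so it does more work than a plain count strictly needs.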


--
- Nicholas Paldino [.NET/C# MVP]
- (e-mail address removed)
 
Hello ohmmega,

What do you mean by 'I like to check the @'s'?

Since the rest of that sentence seems to be about quantifiers, here's a
short overview of the different quantifiers available:
- ? - Zero or One
- * - Zero or More
- + - One or more
- {0,n} - Zero to n
- {n,} - n or more
- {n,m} - n to m

Take your pick ;)
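
A rough sketch of a few of these quantifiers applied to the '@' in the sample strings. The class, the method names and the test patterns are assumptions for illustration only:

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Each pattern is anchored with ^ and $ so the whole input has to match.
    public static bool ZeroOrOne(string s)  { return Regex.IsMatch(s, "^bla@?bla$"); }
    public static bool ZeroOrMore(string s) { return Regex.IsMatch(s, "^bla@*bla$"); }
    public static bool OneOrMore(string s)  { return Regex.IsMatch(s, "^bla@+bla$"); }
    public static bool TwoToThree(string s) { return Regex.IsMatch(s, "^bla@{2,3}bla$"); }
}
```

For example, "bla@@bla" fails ZeroOrOne but passes ZeroOrMore, OneOrMore and TwoToThree.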

Other than this being a question about regular expressions, you've not explained
what you want to do with the end result. Regex is a pretty expensive tool
to use in terms of CPU power and, in some scenarios, memory consumption. Are
you sure it's the tool for the job? If you could explain a little about what
you're trying to achieve, we could potentially help you with a better solution.

Jesse
 
Hi,

I'm not sure what you mean by "check". What do you want to do when you
find a @?

 
i need to know if there are exactly 5 @'s, with or without text in
between.
i thought a compiled regex would be faster than splitting and checking .Length.
nevertheless, if you guys say "OH NO!!!", i've no reason to insist
on it.
 
How much do you care about performance in this case? How often are you
likely to call this? Have you measured the performance of Split and
found that it doesn't meet your requirements?

Jon
 
i need this about 200 times in a time critical application, so i just
want to have the best option.
i've not measured the time yet, but that's a good point - i will try
this next.

thanks so far
rené
 
Hello ohmmega,

If you still want to go the regex way, you basically have two options:

See if you can find a match for this:
^[^@]*(@[^@]*){5}$

Or do a Regex.Replace, replace everything that's not a @ with nothing,
and check the length of the text afterwards:

Regex.Replace(inputstring, "[^@]", "").Length == 5

When using a regular expression, make sure you're using a static instance
with the option RegexOptions.Compiled set for performance reasons.

Like this

private static Regex rx = new Regex(pattern, RegexOptions.Compiled);

then reference this instance when using the expression.

Also add a static constructor to the class which calls rx.Match(""); that
way your performance-critical code will not suffer the one-time compilation
of the regex.
You can also use a tool like The Regulator to generate an assembly with the
compiled regex in it. This would give you the performance boost of not
having to compile the regex at run time at all.
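
Putting the pieces above together, a minimal sketch could look like this. The class and member names are hypothetical, and whether the warm-up call actually helps should be measured:

```csharp
using System.Text.RegularExpressions;

public static class AtCounter
{
    // Built once and reused by every call; RegexOptions.Compiled trades
    // a slower construction for faster matching.
    private static readonly Regex ExactlyFiveAts =
        new Regex("^[^@]*(@[^@]*){5}$", RegexOptions.Compiled);

    static AtCounter()
    {
        // Warm-up match so that any one-time setup cost is paid
        // before the first real call.
        ExactlyFiveAts.IsMatch("");
    }

    // True only when the input contains exactly five '@' characters.
    public static bool HasExactlyFiveAts(string input)
    {
        return ExactlyFiveAts.IsMatch(input);
    }
}
```

Calling AtCounter.HasExactlyFiveAts("a@b@c@d@e@f") returns true, while "bla@bla@bla" and "@@@@@@" both return false.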

Even though this is a nice exercise in regular expressions, I think that
simple string manipulation would be much faster...

public bool HasFiveAts(string input)
{
    int count = 0;
    foreach (char c in input)
    {
        if (c == '@') { count++; }
        // Might even bail out early with if (count > 5) { return false; },
        // but you'd have to test whether that actually helps performance.
    }
    return count == 5;
}
 
A string is an array of char. I believe the fastest way would be to loop
through the chars in the string, and count the '@' chars. Once you reach 5,
you return true. If you reach the end of the string without reaching 5, you
return false.

--
HTH,

Kevin Spencer
Microsoft MVP

Printing Components, Email Components,
FTP Client Classes, Enhanced Data Controls, much more.
DSI PrintManager, Miradyne Component Libraries:
http://www.miradyne.net
 
Just to be clear: a string *isn't* an array of chars. It's a sequence
of characters, and there's a readonly indexer (as well as implementing
IEnumerable<char>) but it's not actually an array.

For example, you couldn't pass a string as an argument to a method
with a System.Array parameter.
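
A small sketch of the difference, with made-up class and method names. Getting a real array out of a string means copying it first:

```csharp
using System;

public static class StringVsArray
{
    // Hypothetical method with a System.Array parameter; passing a string
    // here directly does not compile.
    public static int LengthOf(Array items)
    {
        return items.Length;
    }

    public static string ReplaceFirstChar(string s, char c)
    {
        // s[0] = c; would not compile, because the string indexer is read-only.
        char[] chars = s.ToCharArray(); // copies the characters into a real array
        chars[0] = c;
        return new string(chars);
    }
}
```

Here LengthOf("bla".ToCharArray()) compiles and returns 3, whereas LengthOf("bla") is rejected by the compiler.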

Jon
 
> i need this about 200 times in a time critical application, so i just
> want to have the best option.

200 times in what space of time?

> i've not measured the time yet, but that's a good point - i will try
> this next.

That's definitely worth doing. In a little test application I wrote,
it took less than a second to check 2 *million* strings on my laptop.
Now, they were fairly short strings - you should test with real data -
but it's an indication.

Jon
 
Correct as usual, Jon. I should have said "a string encapsulates an array of
char, and can be treated just like one."

--
HTH,

Kevin Spencer
Microsoft MVP
 
I think I'd still have quibbled: internally there isn't an array of
chars (unlike in Java, for instance) and you can't treat it as an
array of chars because you can always modify the contents of arrays,
whereas the string indexer is readonly.

There's always room for pedantry :)

Jon
 
Hi Jon,

I suppose pedantry has its place. These details are always useful to know.
In my case, I was simply talking about looping through the characters in the
string as an efficient way of counting the occurrences of a character.
However, it might have led to someone getting the wrong impression
concerning what ways one might be able to treat a string as if it were an
array of char. So, I don't take offense. I do think my solution was probably
the most efficient method for achieving the OP's goal, though. I hope that
the original problem has been solved.

--
HTH,

Kevin Spencer
Microsoft MVP
 