Slow performance after converting a C# application to VS 2005


Jack

I have a C# console application, written in Visual Studio 2003, that
processes a lot of data. The application therefore uses .NET 1.1
and normally runs for 28 minutes. After converting the project to a Visual
Studio 2005 project (the same happens if I create a new project and add the
existing class files), the same code runs for 10 hours. The only
difference should be that .NET Framework 2.0 is now used instead of 1.1.

1. Who knows what causes the decreased performance?
2. Who else has the same performance issue?
3. How can I fix this issue?

Hope somebody can help me...
 

jan.hancic

That's impossible to answer if you don't tell us what you are doing, which
classes you are using, etc.
 

Jon Skeet [C# MVP]

Jack said:
I have a C# console application, written in Visual Studio 2003, that
processes a lot of data. The application therefore uses .NET 1.1
and normally runs for 28 minutes. After converting the project to a Visual
Studio 2005 project (the same happens if I create a new project and add the
existing class files), the same code runs for 10 hours. The only
difference should be that .NET Framework 2.0 is now used instead of 1.1.

1. Who knows what causes the decreased performance?
2. Who else has the same performance issue?
3. How can I fix this issue?

Hope somebody can help me...

You'll need to give more information about what it's doing, and
preferably some analysis of where the time is being spent. Have you
run a profiler over the code, or done anything else to work out where
the time goes?
 

Jack

Well, I thought there might be an overall problem with 2.0
performance, but I think I have to look at specific aspects instead. My
application processes a lot of input data that we receive as XML. In the
part where all the data is read into memory and stored in variables,
there's a routine called ValidateRecords. It basically performs a lot of
checks, such as all kinds of string comparisons and regular expressions. I
didn't write this code myself, so I didn't know where to look
earlier.

Now I have looked at that part more specifically and tested all
kinds of things. In general I have looked at Array.Sort and
string.Compare, which cause slower performance on .NET Framework 2.0,
but those are not my problem. I have just managed to write a
test which shows that I do have a problem with regular expressions in .NET
Framework 2.0.

This is my test code:

int testNumber = 1000;

MatchCollection matches;

for (int i = 0; i < testNumber; i++)
{
    Regex RegexWord = new Regex(@"\b(\d+[\.,](\d+|-)|\w+)\b",
                                RegexOptions.IgnoreCase |
                                RegexOptions.Compiled);
    matches = RegexWord.Matches("njdjjd"); // GetDuur(oRegexWord);
    int count = matches.Count;
}

On .NET 1.1 this loop takes 16 milliseconds, while on .NET 2.0 it takes
about 7500 milliseconds.

If I move the Regex variable declaration and instantiation out of the
loop, it only takes 300 milliseconds.
If I leave it in the loop but don't assign count (by commenting out the last
line), it also takes around 300 milliseconds. Why does the combination of
these two within the loop take so long?

I took these lines from an object somebody else wrote, and these 3
lines are executed for every record.
The Regex is created in the constructor of the object and it's
always the same, so is making it static my solution (I don't really
understand Regex well enough to know whether that is safe for me), or is
there a better one? A better one would be an alternative where I can achieve
the 15 milliseconds, so that we don't lose time (20 times longer) if we
use .NET 2.0.

I also want to understand why the 3 lines together take so long and
why it goes fast if line 1 or line 3 is taken out of the loop. Maybe somebody
can explain it to me? I saw that the code uses a lot more Regex objects
in several ways, so fully understanding what's wrong and how to
fix it would help me a great deal.
 

Jack

I found out that using the static Regex.Matches within the loop doesn't
have this performance issue.

int testNumber = 1000;

MatchCollection matches;

for (int i = 0; i < testNumber; i++)
{
    matches = Regex.Matches("njdjjd", @"\b(\d+[\.,](\d+|-)|\w+)\b",
                            RegexOptions.IgnoreCase |
                            RegexOptions.Compiled);
    int count = matches.Count;
}

So there's a big difference between declaring and instantiating a Regex
variable and just using the static method.
Does anybody know why?
 

Renze de Waal

On 30 Mar 2006 00:38:08 -0800, Jack wrote:
I found out that using the static Regex.Matches within the loop doesn't
have this performance issue.

int testNumber = 1000;

MatchCollection matches;

for (int i = 0; i < testNumber; i++)
{
    matches = Regex.Matches("njdjjd", @"\b(\d+[\.,](\d+|-)|\w+)\b",
                            RegexOptions.IgnoreCase |
                            RegexOptions.Compiled);
    int count = matches.Count;
}

So there's a big difference between declaring and instantiating a Regex
variable and just using the static method.
Does anybody know why?

Jack,

The Regex class keeps a cache of compiled regular expressions. With the
static Matches method, the regular expression is only compiled once, at the
first call. Later calls will see that the regular expression is already
in the cache. If you set Regex.CacheSize to 0 (this disables the cache),
your program will take a lot longer.

I cannot find it in the documentation, but if you use a Regex instance, the
compiled expression is not cached in the class. It is only cached in the
instance. If the instance disappears, the compiled regular expression
disappears with it. Because of that, code that repeatedly instantiates a
Regex variable causes the expression to be compiled every time.

Another solution would be to use one Regex instance that is created outside
the loop.
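
A minimal sketch of that last option, for the case where the Regex lives inside a custom object: the pattern is held in a static readonly field, so it is constructed once and shared by every instance. The class name RecordValidator and the method CountWords are hypothetical stand-ins, not names from the original code. Sharing the instance should be safe because the documentation describes the Regex class as immutable and thread safe.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the custom object that owns the Regex.
class RecordValidator
{
    // Constructed (and compiled) once, then shared by all instances,
    // instead of being rebuilt in every constructor call.
    private static readonly Regex WordRegex = new Regex(
        @"\b(\d+[\.,](\d+|-)|\w+)\b",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Hypothetical helper: returns the number of matches in the input.
    public int CountWords(string input)
    {
        return WordRegex.Matches(input).Count;
    }
}

class Program
{
    static void Main()
    {
        int total = 0;
        for (int i = 0; i < 1000; i++)
        {
            // Creating a new validator per record no longer recompiles the pattern.
            RecordValidator validator = new RecordValidator();
            total += validator.CountWords("njdjjd");
        }
        Console.WriteLine(total);
    }
}
```

With this layout the per-record cost is only the match itself, and the object can still be created for every record without touching the pattern again.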

Renze de Waal.
 

Jack

Renze,

Thanks for your explanation. That does explain why I get the performance
issue.

But imo the caching with the static method is a strange bonus compared to
using an instance. And this recompiling doesn't happen when I read the Count
property, right? It should happen when the instance is created, or at least
when the Matches method is called.

Anyway, I think I'll just use the static Regex.Matches method.

And yes, if I take the instance out of the loop I can use the
instance, but I would need to rewrite someone else's code since the Regex is
part of a custom object.

Thanks!
 
