collection class performance seems SLOOOWWWWWW

ian.smith

I have a small selection of code here. When button1_Click is called,
the performance is quite poor (~0.5 sec), even though I am not really
doing anything! In C++ the same thing takes ~0 seconds to execute.
List<Pumpicle> is faster than ArrayList, and I convert the List to an
array before access. It seems as if something like unboxing the doubles
out of the Pumpicle class is going on. Any ideas?

PumpicleContainer pc = null;

private void Form1_Load(object sender, EventArgs e)
{
    pc = new PumpicleContainer();

    for (int i = 0; i < 10000; i++)
    {
        pc.Add(new Pumpicle());
    }

    pc.ToArray();
}

public class DaublePoint
{
    public double m_x;
    public double m_y;
    public double m_z;
}

public class Pumpicle
{
    public DaublePoint m_pos;
    public double m_mass;

    public Pumpicle()
    {
        m_pos = new DaublePoint();
    }
}

class PumpicleContainer
{
    public List<Pumpicle> m_pList = new List<Pumpicle>();

    // Snapshot of the list, refreshed by ToArray()
    Pumpicle[] m_arr = null;

    public void Add(Pumpicle p)
    {
        m_pList.Add(p);
    }

    public void ToArray()
    {
        m_arr = m_pList.ToArray();
    }

    public Pumpicle GetParticle(int i)
    {
        return m_arr[i];
    }
}

private void button1_Click(object sender, EventArgs e)
{
    pc.ToArray();

    long start = System.DateTime.Now.Ticks;

    int np = 0;
    for (int idx = 0; idx < 20; idx++)
    {
        np += 22;
        for (int i = -33; i <= 33; i++)
        {
            for (int j = -33; j <= 33; j++)
            {
                for (int l = 0; l < 5; l++)
                {
                    for (int ps = 0; ps < np; ps++)
                    {
                        Pumpicle p = pc.GetParticle(ps);

                        DaublePoint pt = p.m_pos;
                        double x = pt.m_x;
                        double y = pt.m_y;
                        double z = pt.m_z;
                    }
                }
            }
        }
    }

    long end = System.DateTime.Now.Ticks;

    // Ticks are 100 ns units, so dividing by 1e7 gives seconds
    MessageBox.Show("Finished = " + ((end - start) / 1e7));
}
 
jeremiah johnson

I notice a couple of things.

You're doing 448,900 iterations through those nested for loops (and I'm
not even counting the ps < np loop; who knows what that adds on).
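
The 448,900 figure is 20 x 67 x 67 x 5, the four outer loops only. As a rough sketch (the class and method names here are made up for illustration), the ps loop can be folded in without touching any particles:

```csharp
using System;

static class IterationCount
{
    // Counts how many times the innermost loop body of button1_Click runs.
    public static long CountInnerBodyExecutions()
    {
        long total = 0;
        int np = 0;
        for (int idx = 0; idx < 20; idx++)
        {
            np += 22;                            // np = 22, 44, ..., 440
            long outerPasses = 67L * 67L * 5L;   // i, j and l loops: 22,445 passes per idx
            total += outerPasses * np;           // each pass runs the ps loop np times
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(CountInnerBodyExecutions()); // prints 103695900
    }
}
```

So the loop body runs roughly 104 million times. At the reported ~0.5 sec, that works out to about 5 ns per pass.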

You're measuring time incorrectly. Instead of
System.DateTime.Now.Ticks, use System.DateTime.Now in both places. Then
subtract them to get a TimeSpan, and read TotalMilliseconds from the
TimeSpan object. The way you measure it now, if you happen to cross a
second boundary, you could wind up with very inconsistent execution times.

DateTime then = System.DateTime.Now;
// your loop here.
DateTime now = System.DateTime.Now;
TimeSpan ts = now - then;
Console.WriteLine(ts.TotalMilliseconds);
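
Another option, assuming .NET 2.0 or later, is System.Diagnostics.Stopwatch. Subtracting two Ticks values and subtracting two DateTime values should give the same interval, and DateTime.Now can have a fairly coarse resolution, so a dedicated timer is probably the bigger improvement. A minimal sketch follows; TimingSketch and Measure are illustrative names, and the lambda syntax assumes C# 3 or later:

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Runs the supplied work once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action work)
    {
        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        TimeSpan ts = Measure(() =>
        {
            // your loop here
        });
        Console.WriteLine(ts.TotalMilliseconds + " ms");
    }
}
```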

Jeremiah
 
Bruce Wood

| Only thing is I am not really doing anything!! In C++ the same thing
| takes ~ 0 seconds to execute!!.
|
| int np = 0;
| for (int idx = 0; idx < 20; idx++)
| {
|     np += 22;
|     for (int i = -33; i <= 33; i++)
|     {
|         for (int j = -33; j <= 33; j++)
|         {
|             for (int l = 0; l < 5; l++)
|             {
|                 for (int ps = 0; ps < np; ps++)
|                 {
|                     Pumpicle p = pc.GetParticle(ps);
|
|                     DaublePoint pt = p.m_pos;
|                     double x = pt.m_x;
|                     double y = pt.m_y;
|                     double z = pt.m_z;
|                 }
|             }
|         }
|     }
| }

Be very, very careful about comparisons like this! You are right: the
code isn't doing anything useful. The C++ compiler may be smart enough
to detect that only the outermost loop has any lasting effect: it
changes np. If the compiler is clever enough to know that
pc.GetParticle(ps) has no side-effects, then it may simply eliminate
the inner loops altogether.

Are you sure that in C++ pc.GetParticle is being called the number of
times you expect? Or is it never called at all?

Then again it may simply be that the C++ compiler produces much better
code. However, you can't tell just from doing a benchmark like this
without further checks.
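
One common guard against this in either language is to make the loop produce a value that is actually used. The sketch below is an illustration, not the original code: Point3 stands in for DaublePoint, and the checksum is a hypothetical addition.

```csharp
using System;

class Point3 { public double X, Y, Z; }

static class ChecksumSketch
{
    // Sums every coordinate read, so the loop has an observable result.
    public static double SumCoordinates(Point3[] points, int passes)
    {
        double checksum = 0.0;
        for (int pass = 0; pass < passes; pass++)
        {
            for (int i = 0; i < points.Length; i++)
            {
                Point3 pt = points[i];
                checksum += pt.X + pt.Y + pt.Z;
            }
        }
        return checksum;
    }

    static void Main()
    {
        Point3[] points = new Point3[440];
        for (int i = 0; i < points.Length; i++)
        {
            points[i] = new Point3 { X = 1.0, Y = 2.0, Z = 3.0 };
        }
        // Printing the result means the reads cannot simply be discarded.
        Console.WriteLine(SumCoordinates(points, 1000)); // prints 2640000
    }
}
```

If both the C# and the C++ (or FORTRAN) version print the same checksum, there is at least some evidence that both did the same work.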
 
Joanna Carter [TeamB]

ian.smith wrote:

| I have a small selection of code here. When the "button1_Click" is
| called the performance is quite poor (~ 0.5 sec). Only thing is I am
| not really doing anything!! In C++ the same thing takes ~ 0 seconds to
| execute!!. The List<Pumpicle> is faster than ArrayList and I convert
| the List to an array before access. It just seems to be something like
| unboxing the doubles out of the Pumpicle class. Any ideas??

1. Why convert the list to an array before access? This takes time!

2. You are not unboxing anything, as you are holding doubles and retrieving
doubles.

Instead of your PumpicleContainer class, just use a List<Pumpicle> as it is,
without any wrapper class, and see if that makes any difference.
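
For contrast, here is a small sketch of where boxing does occur. It is an illustration only (BoxingSketch and its methods are made-up names): storing doubles in a non-generic ArrayList boxes them, while List<double> does not. In the posted code the doubles are plain fields of a class, so reading m_x, m_y and m_z involves no boxing.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class BoxingSketch
{
    // ArrayList stores object references, so each double is boxed on Add
    // and has to be unboxed with a cast on the way out.
    public static double SumArrayList(ArrayList values)
    {
        double sum = 0.0;
        foreach (object boxed in values)
        {
            sum += (double)boxed; // unboxing cast
        }
        return sum;
    }

    // List<double> stores the doubles directly: no boxing, no cast.
    public static double SumList(List<double> values)
    {
        double sum = 0.0;
        foreach (double v in values)
        {
            sum += v;
        }
        return sum;
    }

    static void Main()
    {
        ArrayList boxedValues = new ArrayList();
        List<double> plainValues = new List<double>();
        for (int i = 0; i < 4; i++)
        {
            boxedValues.Add(0.5 * i);   // boxes each double
            plainValues.Add(0.5 * i);   // stored as a plain double
        }
        Console.WriteLine(SumArrayList(boxedValues)); // prints 3
        Console.WriteLine(SumList(plainValues));      // prints 3
    }
}
```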

Joanna
 
ian.smith

| 1. Why convert the list to an array before access ? this takes time !!

It takes time, but nothing compared to accessing the List directly
several times over. The array is faster to access than the list.

| 2. You are not unboxing anything as you are holding doubles and retrieving
| doubles.

You are right; I was only suggesting that something like that was going on.

| Instead of your PumpicleContainer class, just use a List<Pumpicle> as it is
| without any wrapper class and see if that makes any difference.

I have made the ToArray function return the array, and the code now
uses that directly. That is faster (thanks). Again, an array is faster
than List<>, which is way faster than ArrayList.
 
ian.smith

| Be very, very careful about comparisons like this! You are right: the
| code isn't doing anything useful. The C++ compiler may be smart enough
| to detect that only the outermost loop has any lasting effect: it
| changes np. If the compiler is clever enough to know that
| pc.GetParticle(ps) has no side-effects, then it may simply eliminate
| the inner loops altogether.

The problem is that I am actually comparing C# with FORTRAN. I ported
the code over from FORTRAN and then proceeded to tell my colleagues how
much faster the C# would be. Then I ran the code and got my ass smacked!!
The bottleneck was in this loop. Remember that FORTRAN was used by
dinosaurs, how can it be faster than C#!!!
 
ian.smith

| DateTime then = System.DateTime.Now;
| // your loop here.
| DateTime now = System.DateTime.Now;
| TimeSpan ts = now - then;
| Console.WriteLine(ts.TotalMilliseconds);

With the World Cup on, I would say "back of the net". Completely correct
:). The timing was meant to be figurative, but it pays to be precise.
Thanks.
 
Jon Skeet [C# MVP]

| It takes time but nothing compared to accessing the List directly
| several times over. The array is faster to access than the list.

If you rearranged your loops so that what is currently the inner loop
became the outer loop, you'd be accessing the list much, much, much less
often, however. Of course, it's difficult to know whether or not that's
feasible when your code doesn't do anything useful with idx, i, j, or l.
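
A rough sketch of what such a rearrangement could look like is below. It assumes the real work per particle does not depend on the order in which idx, i, j and l are visited, which the posted code does not tell us; Particle and the counters are illustrative names only.

```csharp
using System;

class Particle { public double X, Y, Z; }

static class LoopInterchangeSketch
{
    // Particle loop outermost: each particle is fetched once, then reused
    // for every (idx, i, j, l) combination that visited it in the original.
    public static void Rearranged(Particle[] particles, out long fetches, out long bodyRuns)
    {
        fetches = 0;
        bodyRuns = 0;
        for (int ps = 0; ps < 440; ps++)
        {
            Particle p = particles[ps];   // the only fetch for this particle
            fetches++;

            // Originally particle ps is visited whenever ps < np = 22 * (idx + 1),
            // that is, for every idx >= ps / 22 (integer division).
            for (int idx = ps / 22; idx < 20; idx++)
                for (int i = -33; i <= 33; i++)
                    for (int j = -33; j <= 33; j++)
                        for (int l = 0; l < 5; l++)
                        {
                            bodyRuns++;   // real work with p.X, p.Y, p.Z would go here
                        }
        }
    }

    static void Main()
    {
        Particle[] particles = new Particle[440];
        for (int k = 0; k < particles.Length; k++) particles[k] = new Particle();

        long fetches, bodyRuns;
        Rearranged(particles, out fetches, out bodyRuns);
        Console.WriteLine(fetches + " fetches, " + bodyRuns + " body runs");
        // prints "440 fetches, 103695900 body runs"
    }
}
```

The body still runs the same number of times as in the original loop order, but the container is read 440 times instead of about 104 million.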
 
Bruce Wood

| The problem is actually I am comparing C# with FORTRAN. I have ported
| the code over from FORTRAN and then proceeded to tell my colleagues how
| much faster the C# would be. Then I ran the code and got my ass smacked!!
| The bottleneck was in this code loop. Remember that FORTRAN was used by
| dinosaurs, how can it be faster than C#!!!

It wouldn't surprise me at all that FORTRAN / COBOL / C might be
_faster_ than C#. If anyone claimed to me that C# is _faster_ then I
would immediately be suspicious.

C# comes with tremendous advances in the ability to organize code (O-O
/ visual designers, etc), advances in security, interoperability with
databases, etc.... but speed? No, I don't think so.

If we're writing a sophisticated multi-cultural / multi-language
WinForms application that calls Web Services or reads from a database,
and I'm using C# and you're using C, I'm sure that I could _write_ the
code much, much faster than you could, but would my code _run_ faster
in the end? After my two weeks of coding and your ten months of
coding, your code would probably run faster, but that's not the
"why" of using C# in that scenario, is it?
 
ian.smith

| If you rearranged your loops

Thanks, but that wasn't the point of the exercise. The point is to
demonstrate how seemingly inefficient C# can be at times, especially
concerning the collection classes.
 
ian.smith

| If anyone claimed to me that C# is _faster_ then I
| would immediately be suspicious.

Many times C# IS faster. Remember that the IL is compiled (on the fly)
down to specific native code. This isn't for a generic x86 processor,
but for your processor (with all extensions). I have done many
comparisons and sometimes C# actually beats C++ for performance. If you
look at the example code, I create 10000 objects and put them into
the container. Raise that to 100000 and you would have no problems. Do
the same in C++ and you would want to do a SetSize() before you tried,
otherwise you may be waiting some time.
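
For what it's worth, List<T> also lets you reserve space up front through its capacity constructor, much like SetSize(). The sketch below (CapacitySketch and CountGrowths are illustrative names) counts how often the backing array grows with and without that hint. The exact growth policy is an implementation detail, so only the preallocated case has a predictable count.

```csharp
using System;
using System.Collections.Generic;

static class CapacitySketch
{
    // Fills a list created with the given initial capacity and reports
    // how many times its backing array had to grow along the way.
    public static int CountGrowths(int initialCapacity, int itemCount)
    {
        List<int> list = new List<int>(initialCapacity);
        int growths = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < itemCount; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                growths++;
                lastCapacity = list.Capacity;
            }
        }
        return growths;
    }

    static void Main()
    {
        Console.WriteLine(CountGrowths(0, 100000));      // several reallocations
        Console.WriteLine(CountGrowths(100000, 100000)); // prints 0
    }
}
```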
 
Jon Skeet [C# MVP]

| Thanks, but that wasn't the point of the exercise. It is to demonstrate
| how seemingly inefficient C# can be at times especially concerning the
| collection classes.

But the point is that in the real world, such efficiency is very rarely
an issue. Yes, list access can be slower than array access. *Most* (not
all, but most) applications aren't going to have performance
bottlenecks due to this. They'll probably have bottlenecks due to
database access, IO, etc - or architectural issues which can be solved
by doing less work rather than doing the same work more efficiently.
Where there *are* problems like this, they can often be worked round
with micro-optimisation (such as rearranging loops) where it's been
proven to be a problem.

If I set up a microbenchmark to test simple object construction on the
heap, I suspect C# would beat C++ due to the way the managed heap works
- does that mean C++ object creation is "SLOOOWWWWWW"? No - it just
means that different environments have different strong points. Most of
the time, these strong points won't be a significant issue.
 
Xavier Jorge Cerdá

I agree. In addition, I want to highlight that the IL code is compiled
only the first time you use it, so later calls to the same code don't
need to recompile it and pay that cost again.
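
A quick way to see this for yourself is to time the same method twice. This is only a sketch with made-up names, and the numbers will vary by machine and runtime, so no output is shown; the first call would normally include the one-time JIT cost for Work.

```csharp
using System;
using System.Diagnostics;

static class JitWarmupSketch
{
    // Some arbitrary numeric work to be compiled by the JIT on first use.
    public static double Work(int n)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
        {
            sum += Math.Sqrt(i);
        }
        return sum;
    }

    // Times a single call to Work and returns the elapsed Stopwatch ticks.
    public static long TimeCall(int n)
    {
        Stopwatch sw = Stopwatch.StartNew();
        Work(n);
        sw.Stop();
        return sw.ElapsedTicks;
    }

    static void Main()
    {
        long first = TimeCall(1000);   // includes JIT compilation of Work
        long second = TimeCall(1000);  // Work is already compiled
        Console.WriteLine("first call:  " + first + " ticks");
        Console.WriteLine("second call: " + second + " ticks");
    }
}
```

For benchmarks like the one in this thread, that suggests running the loop once untimed before taking the measurement.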

Xavier Jorge
 
Bruce Wood

All well and good. Nonetheless, I wouldn't _choose_ C# for its speed.
It has lots of other wonderful advantages, and it may run faster
sometimes, but when I think of reasons why I would want to use C#,
speed is not the first thing that comes to mind.

Don't forget that even if the JITter generates better code, you have to
contend with the garbage collector, which can kick in any time it likes,
and there goes a bunch of that time you may have gained with quicker
code.

If I were worried about raw speed, I probably wouldn't choose a garbage
collected language.
 
Jon Skeet [C# MVP]

Bruce said:

| All well and good. Nonetheless, I wouldn't _choose_ C# for its speed.
| It has lots of other wonderful advantages, and it may run faster
| sometimes, but when I think of reasons why I would want to use C#,
| speed is not the first thing that comes to mind.
|
| Don't forget that even if the JITter generates better code, you have to
| contend with the garbage collector, which can kick in any time it likes
| and there goes a bunch of that time you may have gained with quicker
| code.
|
| If I were worried about raw speed, I probably wouldn't choose a garbage
| collected language.

Well, garbage collection incurs performance hits at unpredictable times
- but it can often actually give a "win" in terms of overall
performance compared with explicit malloc/free. The managed heap allows
for incredibly quick allocation, and if a whole block of objects can be
freed in one go, that can be quicker than lots of calls to free.

To my mind, the problem with garbage collection isn't the overall speed
but the fact that it can occasionally "stop the world" for significant
amounts of time.
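
If you want to see how often the collector actually runs during a piece of code, GC.CollectionCount gives a rough picture. The sketch below uses illustrative names, and its output depends on the runtime and heap settings, so none is shown. It allocates many short-lived arrays and reports how many generation-0 collections happened meanwhile.

```csharp
using System;

static class GcSketch
{
    // Holding the latest array in a static field keeps each allocation on the heap.
    static byte[] s_last;

    // Allocates many short-lived objects and returns how many
    // generation-0 collections ran while doing so.
    public static int CountGen0Collections(int allocations)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < allocations; i++)
        {
            s_last = new byte[128];   // the previous array becomes garbage
        }
        return GC.CollectionCount(0) - before;
    }

    static void Main()
    {
        int collections = CountGen0Collections(10000000);
        Console.WriteLine("gen-0 collections: " + collections);
    }
}
```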

Jon
 
