Performance in C#, scientific simulation

  • Thread starter: Michael Gorbach

Michael Gorbach

I was asked this summer to write a Monte Carlo code to simulate
magnetic nanoparticles. For the non-physicists: basically it is a
simulation where most of the time is taken up by looping through each
pair in an array of 500 or so particles in order to calculate the
interaction potential. I wrote what I have so far in C# because I
wanted to learn it and thought it would give me some good experience. I
am now beginning to understand that .NET and managed code in general
lag far behind performance-wise. Eventually I will probably have to
port to unmanaged C++, because the simulation code will need to run on
Linux as well.
My question is this:
What can I do in my C# code right now to speed up the performance?
The main method, which is run hundreds of times a second, basically
involves calculating a vector dot product (using my own vector class)
and an exponential. Would marking the method unsafe speed anything up?
 
I believe before jumping to conclusions you must profile your application.
Aside from memory use, you can find out which methods take up most of the
time / calls and check whether the code in them is optimal.

As you might know, ideal performance is achieved by fetching the result by
argument. Simple iteration is not always the best approach. This includes
names :-)

HTH
Alex
 
How do I go about doing this application profiling? I'm more or less new
to serious programming, so any help would be appreciated.
Also, what do you mean by "fetching result by argument"?

Are there any references on performance you could suggest?
 
Michael, in most testing scenarios the performance of well-written (and
I stress "well-written") C# and C++ is comparable. Both are eventually
run as machine code. In the case of C#, where intensive math
computations are being made, performance can be increased by the
judicious and careful use of pointer-based arithmetic. Alex's
comment about profiling is right on the money, especially if this is a
new language you are just learning.

Compuware has an excellent freeware profiler (a "Community" edition)
and there are others. These can help you tremendously.
 
I would suggest having a look at math books dealing with optimal
algorithms.

My comment is about the most efficient computation, which is not always
achievable. Take y = f(x), where x is the argument and f is some function.
If you can deliver y for any given x, for example by using a lookup table,
you probably won't be able to create anything more efficient in terms of
speed. But you will pay with memory use.
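
For example, applied to the exponential Michael mentioned, the idea might
look roughly like this sketch (the range, table size, and linear
interpolation are only placeholder choices you would have to tune for your
own accuracy needs):

using System;

// Precompute exp(x) on a fixed grid and look it up instead of calling
// Math.Exp every time. Range and table size here are illustrative only.
class ExpTable
{
    const double XMin = -20.0, XMax = 0.0;
    const int Size = 4096;
    static readonly double Step = (XMax - XMin) / (Size - 1);
    static readonly double[] Table = BuildTable();

    static double[] BuildTable()
    {
        var t = new double[Size];
        for (int i = 0; i < Size; i++)
            t[i] = Math.Exp(XMin + i * Step);
        return t;
    }

    // Linear interpolation between the two nearest grid points.
    public static double Exp(double x)
    {
        if (x <= XMin) return Table[0];
        if (x >= XMax) return Table[Size - 1];
        double p = (x - XMin) / Step;
        int i = (int)p;
        return Table[i] + (p - i) * (Table[i + 1] - Table[i]);
    }
}

Whether this actually beats Math.Exp depends on the accuracy you can
tolerate and on cache behaviour, so measure it.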

That's why I suggest profiling. You can start with also free MS CLRProfiler,
which you can download at
http://www.microsoft.com/downloads/...52-d7f4-4aeb-9b7a-94635beebdda&DisplayLang=en
Source code is quite good in demonstrating some of common optimization
techniques, by the way

HTH
Alex
 
Last I checked CLR Profiler does *not* profile speed but only memory
usage. If the simulation is using fixed arrays, as I suspect, it
won't give any results.

Besides, the question was which is faster for this task -- C# or C++?
Just profiling C# evidently won't give any answer to this question.
 
If I understand you correctly, you already have a working C# program,
and you don't do any tricky stuff that might be hard to port.

So the solution is very simple: get a C++ compiler, convert your
program to C++ (or even plain C if you can turn your vector into a
simple struct), and compare the execution times for both program
versions. There you have your result -- no profiler needed.

Whether the C++ version is faster will depend mostly on how good the
C++ compiler's optimizer is. Numerical code can be optimized very
well, but .NET doesn't do any of those optimizations. This can lead
to substantial performance benefits for unmanaged code -- assuming, of
course, that your C++ compiler actually does such optimizations.

As for speeding up the C# code...

Merely *marking* a method unsafe does nothing at all. The keyword
only *allows* you to use pointer operations, which *might* be faster,
but even that is not guaranteed.

Using pointers to address array elements might speed up C#, but if
you're iterating over an array the range checks are already optimized
away by the JIT compiler. Also, make sure that C# overflow checking
is disabled -- I think it's off with /optimize but I'm not sure.
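
For what it's worth, a pointer-based version of a dot product might look
roughly like the sketch below (the method name and signature are made up;
it has to be compiled with /unsafe, and you should measure it against the
plain indexed loop because the JIT may already do just as well):

// Sketch: dot product over two double[] arrays using fixed pointers.
// The fixed statement pins the arrays so the GC cannot move them while
// the raw pointers are in use.
static unsafe double Dot(double[] a, double[] b)
{
    double sum = 0.0;
    fixed (double* pa = a, pb = b)
    {
        for (int i = 0; i < a.Length; i++)
            sum += pa[i] * pb[i];
    }
    return sum;
}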

Microsoft has a free C/C++ compiler download somewhere on MSDN but
since your code must run on Linux you'll probably use gcc anyway.
 
Thanks everyone for the great responses. I love this newsgroup!
Steve, yes, I do know about Mono and I will use it if I don't port, but
its speed is questionable. At best it will run as fast as Microsoft
.NET; at worst there will be a performance hit.
I will take a look at both profilers that have been suggested. Thanks
for the references.
My question is to Peter: what exactly is pointer-based arithmetic, and
where can I find algorithms/books or other help on the subject? I think
this may be the best short-term solution to my problem. Also, I may
think about using a table to look up the exponential function values.
 
Michael Gorbach said:
Thanks everyone for the great responses. I love this newsgroup!
Steve, yes, I do know about Mono and I will use it if I don't port, but
its speed is questionable. At best it will run as fast as Microsoft
.NET; at worst there will be a performance hit.

Sniffing round the net, Mono currently appears to be a little slower
than Microsoft's implementation, but I'd expect the gap to close.
 
Michael Gorbach said:
Thanks everyone for the great responses. I love this newsgroup!
Steve, yes, I do know about Mono and I will use it if I don't port, but
its speed is questionable. At best it will run as fast as Microsoft
.NET; at worst there will be a performance hit.

That depends on what you do with it. When some colleagues were
investigating performance comparisons, they found that for many things
.NET was faster than Mono, but for some other things Mono was faster
than .NET.
 
I am learning C# and I am writing scientific software. What I have
found so far is that when the data is in arrays, the differences between
C# and the equivalent C++/C code are very small.

I tried (1) a Monte Carlo simulation, and (2) translating from C a
routine that solves a system of linear equations (simq from the CEPHES
package), and benchmarked a large system with the coefficients
initialized by a random number generator. The C/C++ and C# versions
ended with the same result and nearly the same time.

However, when I translated part of the Stepanov benchmark (a C++
benchmark that measures abstraction penalty) I did find a major
performance hit. I suspect that the current C# optimizer is not yet
developed enough to deal with high levels of abstraction. Of course,
I am a C# beginner and perhaps I can learn to translate better.

I think that if you are a little careful in performance-critical
parts of your code, you can achieve nearly C speeds.

Dov
 
Dov, that's a very interesting post you've made.
You said you tried a Monte Carlo simulation? What kind of times are you
getting? I'm doing MC of hard disks (2D), and the performance is not
great... especially for 500 particles.
 
Why don't you post the code that does your vector dot-product
calculations? Perhaps we can make specific suggestions to improve its
speed.
 
Hi Michael,

The MC I tried it on is for Lennard-Jones particles. It is more
complicated because there is some dynamics too. There are 38 particles
and about 20 time slices, which makes it roughly equivalent to about
38*20 = 760 interacting particles. This is actually a simulated
annealing code, and MC at each temperature takes about 10 minutes of
CPU time. Most of the computation time (nearly 80%) is spent on the
Lennard-Jones calculation.

In the C# code the coordinates are stored in a 3D jagged array
([nt][np][i]), where nt is the time slice index, np the particle number,
and i = 0, 1, 2 for x, y, or z.

In the C++ code the coordinates are encapsulated in a small class.
For example, the coordinates of N particles are stored in something
like:

#include <vector>

class coordinates {
    std::vector<double> v_;
public:
    coordinates( int N ) : v_(3*N, 0.0) {}
    double& operator()( int i, int j )       { return v_.at(3*j + i); }
    double  operator()( int i, int j ) const { return v_.at(3*j + i); }
};

In my experience, trying to encapsulate the coordinates in C# in a
similar way degrades performance. So I keep them in arrays as above.

Also, what floating point optimization are you using in the C++? There is
a compiler option /Op (called "Improved Consistency" in Visual .NET)
which is off by default ("Default Consistency" in Visual .NET). I
normally turn it on for long MC runs. Have you tried comparing the timing
and results of your code with /Op on and off?

Dov
 
My particles are stored in a class Nanoparticle, which contains a
member of a Vector class I created. The dot product is done by the *
operator, which I overloaded for the Vector class. It simply uses
a for loop over the coordinates to do the dot product.
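Roughly, it looks like this (a simplified sketch, with placeholder field
names rather than my exact code):

// Simplified sketch of my setup, not the exact code.
class Vector
{
    public double[] Coords = new double[3];

    // Overloaded * gives the dot product via a loop over the coordinates.
    public static double operator *(Vector a, Vector b)
    {
        double sum = 0.0;
        for (int i = 0; i < a.Coords.Length; i++)
            sum += a.Coords[i] * b.Coords[i];
        return sum;
    }
}

class Nanoparticle
{
    public Vector Moment = new Vector();   // placeholder member name
}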
 
And that is probably the source of your slowdown - you have created at
least two levels of abstraction above your particles: (1) The Vector
class, and (2) overloading the Vector operations (e.g. the * operator).

Dov
 
Where performance is critical, try to use built-in data structures. For
example, if you have 3D particles, use two-dimensional jagged arrays
(e.g. double[][]) to store the coordinates. The size of the first
dimension is the number of particles and the size of the second
dimension is 3 (for x, y, z).
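
A minimal sketch of that layout with a pair loop is shown below; the names
are placeholders, and PairPotential stands in for whatever interaction you
actually compute:

// r[p] holds {x, y, z} for particle p.
static double TotalEnergy(double[][] r)
{
    int n = r.Length;
    double total = 0.0;
    for (int p = 0; p < n; p++)
    {
        double[] rp = r[p];
        for (int q = p + 1; q < n; q++)
        {
            double[] rq = r[q];
            double dx = rp[0] - rq[0];
            double dy = rp[1] - rq[1];
            double dz = rp[2] - rq[2];
            double r2 = dx * dx + dy * dy + dz * dz;
            total += PairPotential(r2);   // placeholder for the real interaction
        }
    }
    return total;
}

// Placeholder only; substitute your actual pair interaction.
static double PairPotential(double r2) { return 1.0 / r2; }

Pulling r[p] and r[q] into locals keeps the inner loop working on plain
double[] accesses, which the JIT handles well.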

Dov
 
