C# math functions much slower than C++ equivalents??!!

Ingmar

Simple comparison tests we have performed show that System.Math
functions in C# are much slower than corresponding functions in C++.
Extreme examples are System.Math.Exp() and System.Math.Tan(), which
both seem to average 50 times slower than their C++ counterparts.
(Tests are done using optimized Release configurations outside the
Visual Studio environment.)

Our explanation for this large discrepancy is that C# is an
interpreted language. Perhaps there are other explanations. In any
case, does anyone have suggestions as to how the C# math functions can
be evaluated quicker? MUCH quicker? We'd rather not have to write our
own.

Thanks in advance.
 
Chad Myers

C# is just a language, you're talking about .NET.

And .NET is NOT interpreted! :) Very important point!!

Anyhow, please post your code if you can, because you
might be doing something inappropriate.
-c
 
craig

Are you accounting for the JIT compile time that occurs the very first time
that a method is called?
 
memememe

Chad Myers said:
And .NET is NOT interpreted! :) Very important point!!

Pardon my ignorance here, but I always thought .NET had a "virtual machine"
at its core, some engine that interpreted the bytecode and managed the
garbage collection? I could be wrong, and keep in mind I come from a Java
environment.
 
Chad Myers

memememe said:
Pardon my ignorance here, but I always thought .NET had a "virtual machine"
at its core, some engine that interpreted the bytecode and managed the
garbage collection? I could be wrong, and keep in mind I come from a Java
environment.

That's how Java started out. There was bytecode which was
interpreted at runtime into the actual instructions.

Both .NET and Java are now JIT compiled. Code is compiled
on-the-fly into machine code the first time it is needed
and then executed natively. Code in newly loaded/referenced
assemblies is JIT'd the same way.

This is a big difference from being interpreted.

Java compilers produce "bytecode", .NET compilers
produce IL.

Imagine if you had a 2-3 stage C++ compiler
(pre-processor, compiler, post-processor, etc).

Java and .NET compile to an intermediate state,
and then the runtime (JavaVM, .NET CLR, etc)
will finish the compilation at runtime since it
can make a better determination of which optimizations
to make RIGHT THEN as opposed to guessing at
compile time like C/C++, Delphi, and other
old-fashioned :))) compiled languages.

It's all a matter of when/where the actual machine code
is produced. On the dev box? Or on the user's
machine as they execute the app?

-c
 
Robert Hooker

For your tests, are you perhaps doing something like this?

ArrayList list = new ArrayList();
for (int i = 0; i < 10000; i++)
{
    list.Add((float)i); // each Add boxes the float into an object
}
foreach (float f in list) // each iteration unboxes back to float
{
    Math.Exp(f);
}

If so, then probably most of the time in the loop is spent not in Math.Exp but in
boxing/unboxing your floats (value types) in a reference-type collection.
 
Frank X

It will be C# that is slow, not the functions.

Write your intensive math stuff, valuation models etc as a separate library
in C++.

Write the screen management, database, IO stuff in C#.
 
Andre

Frank X wrote:

...snip..
What?! C# is *not* interpreted! It's a *compiled* language! Where did
you get that bit from?

And I believe there must be something about your tests. Please share
your code with us.

-Andre
 
craig

Jeffrey Richter's book rocks. It's one of my all-time favorite .NET books.
Packed with great info.
 
Ingmar

Thanks for all the replies so far. I'm particularly interested in any
possible solutions or ways 'round this problem; the issue of whether
or not C# is interpreted, etc., I find somewhat confusing. The best
suggestion so far (assuming we continue with C# and don't just simply
write everything in C++ instead!) is to perform all our number
crunching in DLLs not written in C#.

Of course, there is still the issue of the correctness of our test.
For this I've listed the C# and C++ code below. Both programs evaluate
the functions Tan, Sin, Cos, Exp, Log and a simple multiplication 1
million times each. This is repeated 100 times, and the average time
required is calculated.

******************************************
The output from the test for C# is:
******************************************
PI 3.1415926535897900000000000
100.00 run avg 1,000,000.00 x tanges 219.86 ms
100.00 run avg 1,000,000.00 x sinus 15.00 ms
100.00 run avg 1,000,000.00 x cosinus 15.16 ms
100.00 run avg 1,000,000.00 x exp 253.61 ms
100.00 run avg 1,000,000.00 x log 180.64 ms
100.00 run avg 1,000,000.00 x theta * theta 15.94 ms
Total execution time 70.04 s

******************************************
The output from the test for C++ is:
******************************************
PI 3.1415926535897931000000000
100 run avg 1000000 x tanges 4.06 ms
100 run avg 1000000 x sinus 4.06 ms
100 run avg 1000000 x cosinus 4.06 ms
100 run avg 1000000 x exp 4.07 ms
100 run avg 1000000 x log 4.21 ms
100 run avg 1000000 x theta * theta 4.07 ms
Total execution time 2.45 s

******************************************
The C# program code
******************************************
using System;

namespace CSharpSpeed
{
/// <summary>
/// Summary description for Class1.
/// </summary>
class Speed
{
/// <summary>
/// The main entry point for the application.
/// </summary>
static void Main(string[] args)
{
DateTime start, end;
DateTime tmp1, tmp2;
double tmp;
TimeSpan time;

double pi = 2 * Math.Asin(1.0);
Console.Out.WriteLine("PI {0:F25}", pi);

start = DateTime.Now;

double twoPI = Math.PI * 2.0;
double stepCnt = 1000000.0;
double step = twoPI / stepCnt;
double theta;
double runs = 100;
int cntr;

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = Math.Tan(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:F} run avg {1:N} x tanges {2:F} ms",
runs, stepCnt, tmp);

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = Math.Sin(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x sinus {2:F} ms",
runs, stepCnt, tmp);

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = Math.Cos(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x cosinus {2:F} ms",
runs, stepCnt, tmp);

step = 2.0 / stepCnt;

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < 2.0; theta += step)
{
tmp = Math.Exp(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x exp {2:F} ms",
runs, stepCnt, tmp);

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = step; theta < 2.0; theta += step)
{
tmp = Math.Log(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x log {2:F} ms",
runs, stepCnt, tmp);

step = 1.0;

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < stepCnt; theta += step)
{
tmp = theta * theta;
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x theta * theta {2:F} ms",
runs, stepCnt, tmp);

end = DateTime.Now;
time = end - start;
Console.Out.WriteLine("Total execution time {0:F} s",
time.TotalSeconds);

Console.In.ReadLine();
}
}
}

******************************************
The C++ program code
******************************************
#define _USE_MATH_DEFINES

#include <iostream>
#include <stdio.h> // printf, getchar
#include <time.h>
#include <math.h>

using namespace std;

int main(int argc, char* argv[])
{
clock_t start, end;
clock_t tmp1, tmp2;
double tmp;
double time;

double pi = 2 * asin(1.0);
printf("PI %25.25f\n", pi);

start = clock();

double twoPI = M_PI * 2.0;
double stepCnt = 1000000;
double step = twoPI / stepCnt;
double theta;
double runs = 100;
int cntr;

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = tan(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x tanges %.2f ms\n", runs, stepCnt, time);

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = sin(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x sinus %.2f ms\n", runs, stepCnt, time);

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = cos(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x cosinus %.2f ms\n", runs, stepCnt,
time);

step = 2.0 / stepCnt;

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < 2.0; theta += step)
{
tmp = exp(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x exp %.2f ms\n", runs, stepCnt, time);

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = step; theta < 2.0; theta += step)
{
tmp = log(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x log %.2f ms\n", runs, stepCnt, time);

step = 1.0;

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < stepCnt; theta += step)
{
tmp = theta * theta;
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x theta * theta %.2f ms\n", runs, stepCnt,
time);

end = clock();
time = (end - start) / 1000.0;
printf("Total execution time %.2f s\n", time);

getchar();

return 0;
}
 
Ingmar

Before anyone delves into the lengthy code I posted earlier, it seems
we have resolved the issue.

It turns out that the C++ compiler recognized the function calls
as redundant and therefore did not execute them. After we corrected
this, we found a much more reasonable average factor of 1.6
(the C# function calls take approximately 1.6 times longer
than the C++ function calls), averaged over the functions we considered.
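
For readers wondering what such a correction might look like: the sketch below is not the code used for these measurements, just one common way to stop an optimizer from discarding the calls. It assumes the results are summed and the sum is handed back to the caller (for example, to be printed). The names sum_tan and time_sum_tan_ms are made up for illustration, and std::chrono::steady_clock is used here instead of clock().

```cpp
#include <chrono>
#include <cmath>

// Hypothetical helper: evaluates tan() at `steps` points in [0, 2*pi)
// and returns the sum, so the calls cannot be dropped as unused.
double sum_tan(int steps)
{
    const double two_pi = 2.0 * std::acos(-1.0);
    const double step = two_pi / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; ++i)
    {
        sum += std::tan(i * step); // result feeds into the returned value
    }
    return sum;
}

// Hypothetical helper: times one call of sum_tan() in milliseconds and
// stores the checksum through `checksum` so the caller can print it.
double time_sum_tan_ms(int steps, double* checksum)
{
    const auto t0 = std::chrono::steady_clock::now();
    *checksum = sum_tan(steps);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling time_sum_tan_ms(1000000, &checksum) from main and printing both the time and the checksum should keep the work observable, although an aggressive optimizer is still free to rearrange it.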

For anyone interested, the results were as follows. Again, we
evaluated the functions 1 million times each, calculated the CPU time
required, and averaged this over 10 runs. The ratios of CPU times (C# to C++) were

Tan: 1.309
Sin: 1.064
Cos: 1.056
Exp: 1.578
Log: 2.057
multiplication: 2.502
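
As a quick sanity check of the quoted average factor, the six ratios above can simply be averaged. The snippet below is only an illustration; the function names reported_ratios and mean are made up here.

```cpp
#include <numeric>
#include <vector>

// Ratios of C# time to C++ time, copied from the list above
// (Tan, Sin, Cos, Exp, Log, multiplication).
std::vector<double> reported_ratios()
{
    return {1.309, 1.064, 1.056, 1.578, 2.057, 2.502};
}

// Arithmetic mean; assumes the vector is not empty.
double mean(const std::vector<double>& values)
{
    return std::accumulate(values.begin(), values.end(), 0.0) / values.size();
}
```

The sum of the six ratios is 9.566, so mean(reported_ratios()) comes out at roughly 1.594, which rounds to the factor of 1.6 quoted above.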
 
Peter N Roth

Wow that is very counterintuitive...

Since theta is changing at each step, and since
tan is (usually) calculated as a series sum or
as the result of a Newton's approximation,
how does the 'for' loop optimize away that
calculation?

That is more than magic...
--
Grace + Peace,
Peter N Roth
Engineering Objects International
http://engineeringobjects.com


Willy Denoyette said:
What you see here is just a piece of C++ optimizer magic.
Try to run the same with all math calls commented out


for (theta = 0.0; theta < twoPI; theta += step)
{
// tmp = tan(theta);
}

and you will see the results are the same, so what you are measuring is
simply the time taken by the for loop evaluation logic.
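
The experiment described in the quote can be put into one small harness that times the same loop twice, once with an empty body and once with a tan() call whose result is thrown away. This is only a sketch: time_loop_ms, LoopTimes and compare_empty_and_tan are made-up names, and std::chrono::steady_clock stands in for clock(). With optimizations enabled a compiler may (but is not required to) produce nearly identical times for both.

```cpp
#include <chrono>
#include <cmath>

// Hypothetical helper: runs body(theta) for theta stepping through
// [0, limit) and returns the elapsed time in milliseconds.
template <typename Body>
double time_loop_ms(double limit, double step, Body body)
{
    const auto t0 = std::chrono::steady_clock::now();
    for (double theta = 0.0; theta < limit; theta += step)
    {
        body(theta);
    }
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

struct LoopTimes
{
    double empty_ms; // loop with nothing in the body
    double tan_ms;   // loop calling tan() and discarding the result
};

// Times both variants over the same range so they can be compared.
LoopTimes compare_empty_and_tan(double limit, double step)
{
    LoopTimes times;
    times.empty_ms = time_loop_ms(limit, step, [](double) {});
    times.tan_ms = time_loop_ms(limit, step, [](double theta) {
        double tmp = std::tan(theta); // result never used, as in the test
        (void)tmp;
    });
    return times;
}
```

If the two numbers are close in an optimized build, the benchmark is mostly timing the loop itself rather than the math function.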
 
Jon Skeet

Peter N Roth said:
Wow that is very counterintuitive...

Since theta is changing at each step, and since
tan is (usually) calculated as a series sum or
as the result of a Newton's approximation,
how does the 'for' loop optimize away that
calculation?

That is more than magic...

Not really - all it's got to know is that tan isn't going to have any
side effects, and the return value is never used - it's therefore a
no-op, and can be removed.
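
To make that concrete, here is a small C++ sketch of the two situations. The names g_sink, tan_unused and tan_observed are invented for the example. Storing the result in a volatile variable is one common way to make the value count as "used", because writes to volatile objects have to be performed.

```cpp
#include <cmath>

// Hypothetical sink: a write to a volatile object is observable
// behaviour, so the compiler has to perform it.
volatile double g_sink = 0.0;

// The result is never used. An optimizer that knows tan() has no side
// effects it needs to preserve may remove the call entirely.
void tan_unused(double theta)
{
    double tmp = std::tan(theta);
    (void)tmp;
}

// The result is stored in the volatile sink, so the value has to be
// produced and the store has to happen.
void tan_observed(double theta)
{
    g_sink = std::tan(theta);
}
```

A timing loop built around tan_observed should keep the math in the measurement, at the cost of one extra store per iteration.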
 
Ingmar

Willy Denoyette said:
What you see here is just a piece of C++ optimizer magic.
Try to run the same with all math calls commented out


for (theta = 0.0; theta < twoPI; theta += step)
{
// tmp = tan(theta);
}

and you will see the results are the same, so what you are measuring is simply the time taken by the for loop evaluation logic.
Note that the C# compiler/.NET Jitter can play the same tricks.... :)
Willy.

Thanks for that, well spotted.
We'd corrected the problem and found a more reasonable average speed
difference factor of 1.6 (haven't seen the newsgroup posting yet).
 
Willy Denoyette [MVP]

Ingmar wrote:
Willy Denoyette said:
What you see here is just a piece of C++ optimizer magic.
Try to run the same with all math calls commented out

for (theta = 0.0; theta < twoPI; theta += step)
{
// tmp = tan(theta);
}

and you will see the results are the same, so what you are measuring is simply the time taken by the for loop evaluation logic.
Note that the C# compiler/.NET Jitter can play the same tricks.... :)
Willy.

Thanks for that, well spotted.
We'd corrected the problem and found a more reasonable average speed
difference factor of 1.6 (haven't seen the newsgroup posting yet).


Would you mind posting your code?
I've done the same test and some iterations tend to be a little faster in C#, others are a little slower.

Willy.
 
