C# math functions much slower than C++ equivalents??!!

Ingmar · Aug 26, 2003

Simple comparison tests we have performed show that System.Math
functions in C# are much slower than corresponding functions in C++.
Extreme examples are System.Math.Exp() and System.Math.Tan(), which
both seem to average 50 times slower than their C++ counterparts.
(Tests are done using optimized Release configurations outside the
Visual Studio environment.)

Our explanation for this large discrepancy is that C# is an
interpreter language. Perhaps there are other explanations. In any
case, does anyone have suggestions as to how the C# math functions can
be evaluated quicker? MUCH quicker? We'd rather not have to write our
own.

Thanks in advance.

Chad Myers · Aug 26, 2003

Ingmar said:
Simple comparison tests we have performed show that System.Math
functions in C# are much slower than corresponding functions in C++.
Extreme examples are System.Math.Exp() and System.Math.Tan(), which
both seem to average 50 times slower than their C++ counterparts.
(Tests are done using optimized Release configurations outside the
Visual Studio environment.)

Our explanation for this large discrepancy is that C# is an
interpreter language. Perhaps there are other explanations. In any
case, does anyone have suggestions as to how the C# math functions can
be evaluated quicker? MUCH quicker? We'd rather not have to write our
own.

C# is just a language, you're talking about .NET.

And .NET is NOT interpreted!

Very important point!!

Anyhow, please post your code if you can, because you
might be doing something inappropriate.
-c

craig · Aug 26, 2003

Are you accounting for the JIT compile time that occurs the very first time
a method is called?

memememe · Aug 26, 2003

And .NET is NOT interpreted! Very important point!!

Pardon my ignorance here, but I always thought .NET had a "virtual machine"
at its core, some engine that interpreted the bytecode and managed the
garbage collection? I could be wrong, and keep in mind I come from a Java
environment.

Chad Myers · Aug 26, 2003

memememe said:
Pardon my ignorance here, but I always thought .NET had a "virtual machine"
at its core, some engine that interpreted the bytecode and managed the
garbage collection? I could be wrong, and keep in mind I come from a Java
environment.

That's how Java started out. There was bytecode which was
interpreted at runtime into the actual instructions.

Both .NET and Java are now JIT compiled. At startup,
the necessary code is compiled on-the-fly into machine
code and executed. As new assemblies are loaded/referenced,
they will be JIT'd as well.

This is very different from being interpreted.

Java compilers produce "bytecode", .NET compilers
produce IL.

Imagine if you had a 2-3 stage C++ compiler
(pre-processor, compiler, post-processor, etc).

Java and .NET compile to an intermediate state,
and then the runtime (JavaVM, .NET CLR, etc)
will finish the compilation at runtime since it
can make a better determination of which optimizations
to make RIGHT THEN as opposed to guessing at
compile time like C/C++, Delphi, and other
old-fashioned ;)) compiled languages.

It's all a matter of when and where the actual machine code
is produced. On the dev box? Or on the user's
machine as they execute the app?

-c

Robert Hooker · Aug 26, 2003

For your tests, are you perhaps doing something like this?

ArrayList list = new ArrayList();
for (int i = 0; i < 10000; i++)
    list.Add((float)i);      // each Add boxes a float
foreach (float f in list)    // each iteration unboxes one
{
    Math.Exp(f);
}

If so, then probably most of the time in the loop is spent not in Math.Exp but
in boxing/unboxing your floats (value types) in a reference-type collection.

Frank X · Aug 26, 2003

Ingmar said:
Simple comparison tests we have performed show that System.Math
functions in C# are much slower than corresponding functions in C++.
Extreme examples are System.Math.Exp() and System.Math.Tan(), which
both seem to average 50 times slower than their C++ counterparts.
(Tests are done using optimized Release configurations outside the
Visual Studio environment.)

Our explanation for this large discrepancy is that C# is an
interpreter language. Perhaps there are other explanations. In any
case, does anyone have suggestions as to how the C# math functions can
be evaluated quicker? MUCH quicker? We'd rather not have to write our
own.

It will be C# that is slow, not the functions.

Write your intensive math stuff, valuation models etc as a separate library
in C++.

Write the screen management, database, IO stuff in C#.

Andre · Aug 26, 2003

Frank X wrote:

...snip..
What?! C# is *not* interpreted! It's a *compiled* language! Where did
you get that bit from?

And I believe there must be something about your tests. Please share
your code with us.

-Andre

Joe Mayo · Aug 27, 2003

Hi Chad,

According to this it appears that the JITter works via methods.

http://msdn.microsoft.com/msdnmag/issues/0900/Framework/default.aspx

I also checked Jeffrey Richter's Applied Microsoft .NET Programming (page
16), which confirms that the JITter operates on methods.

Joe

craig · Aug 27, 2003

Jeffrey Richter's book rocks. It's one of my all-time favorite .NET books.
Packed with great info.

Ingmar · Aug 27, 2003

Thanks for all the replies so far. I'm particularly interested in any
possible solutions or ways 'round this problem; the issue of whether
or not C# is interpreted etc I find somewhat confusing. The best
suggestion so far (assuming we continue with C# and don't just simply
write everything in C++ instead!) is to perform all our number
crunching in DLLs not written in C#.

Of course, there is still the issue of the correctness of our test.
For this I've listed the C# and C++ codes below. They both evaluate
the functions Tan, Sin, Cos, Exp, Log and simple multiplication each 1
million times. This is repeated 100 times, and the average time
required is calculated.

******************************************
The output from the test for C# is:
******************************************
PI 3.1415926535897900000000000
100.00 run avg 1,000,000.00 x tanges 219.86 ms
100.00 run avg 1,000,000.00 x sinus 15.00 ms
100.00 run avg 1,000,000.00 x cosinus 15.16 ms
100.00 run avg 1,000,000.00 x exp 253.61 ms
100.00 run avg 1,000,000.00 x log 180.64 ms
100.00 run avg 1,000,000.00 x theta * theta 15.94 ms
Total execution time 70.04 s

******************************************
The output from the test for C++ is:
******************************************
PI 3.1415926535897931000000000
100 run avg 1000000 x tanges 4.06 ms
100 run avg 1000000 x sinus 4.06 ms
100 run avg 1000000 x cosinus 4.06 ms
100 run avg 1000000 x exp 4.07 ms
100 run avg 1000000 x log 4.21 ms
100 run avg 1000000 x theta * theta 4.07 ms
Total execution time 2.45 s

******************************************
The C# program code
******************************************
using System;

namespace CSharpSpeed
{
/// <summary>
/// Summary description for Class1.
/// </summary>
class Speed
{
/// <summary>
/// The main entry point for the application.
/// </summary>
static void Main(string[] args)
{
DateTime start, end;
DateTime tmp1, tmp2;
double tmp;
TimeSpan time;

double pi = 2 * Math.Asin(1.0);
Console.Out.WriteLine("PI {0:F25}", pi);

start = DateTime.Now;

double twoPI = Math.PI * 2.0;
double stepCnt = 1000000.0;
double step = twoPI / stepCnt;
double theta;
double runs = 100;
int cntr;

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = Math.Tan(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:F} run avg {1:N} x tanges {2:F} ms",
runs, stepCnt, tmp);

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = Math.Sin(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x sinus {2:F} ms",
runs, stepCnt, tmp);

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = Math.Cos(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x cosinus {2:F} ms",
runs, stepCnt, tmp);

step = 2.0 / stepCnt;

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < 2.0; theta += step)
{
tmp = Math.Exp(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x exp {2:F} ms",
runs, stepCnt, tmp);

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = step; theta < 2.0; theta += step)
{
tmp = Math.Log(theta);
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x log {2:F} ms",
runs, stepCnt, tmp);

step = 1.0;

time = new TimeSpan();
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = DateTime.Now;
for (theta = 0.0; theta < stepCnt; theta += step)
{
tmp = theta * theta;
}
tmp2 = DateTime.Now;
time += tmp2 - tmp1;
}
tmp = time.TotalMilliseconds / runs;
Console.Out.WriteLine("{0:N} run avg {1:N} x theta * theta {2:F} ms",
runs, stepCnt, tmp);

end = DateTime.Now;
time = end - start;
Console.Out.WriteLine("Total execution time {0:F} s",
time.TotalSeconds);

Console.In.ReadLine();
}
}
}

******************************************
The C++ program code
******************************************
#define _USE_MATH_DEFINES

#include <iostream>
#include <time.h>
#include <math.h>
#include <stdio.h>

using namespace std;

int main(int argc, char* argv[])
{
clock_t start, end;
clock_t tmp1, tmp2;
double tmp;
double time;

double pi = 2 * asin(1.0);
printf("PI %25.25f\n", pi);

start = clock();

double twoPI = M_PI * 2.0;
double stepCnt = 1000000;
double step = twoPI / stepCnt;
double theta;
double runs = 100;
int cntr;

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = tan(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x tanges %.2f ms\n", runs, stepCnt, time);

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = sin(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x sinus %.2f ms\n", runs, stepCnt, time);

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < twoPI; theta += step)
{
tmp = cos(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x cosinus %.2f ms\n", runs, stepCnt,
time);

step = 2.0 / stepCnt;

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < 2.0; theta += step)
{
tmp = exp(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x exp %.2f ms\n", runs, stepCnt, time);

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = step; theta < 2.0; theta += step)
{
tmp = log(theta);
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x log %.2f ms\n", runs, stepCnt, time);

step = 1.0;

time = 0.0;
for (cntr = 0; cntr < runs; cntr++)
{
tmp1 = clock();
for (theta = 0.0; theta < stepCnt; theta += step)
{
tmp = theta * theta;
}
tmp2 = clock();
time += tmp2 - tmp1;
}
time /= runs;
printf("%.0f run avg %.0f x theta * theta %.2f ms\n", runs, stepCnt,
time);

end = clock();
time = (end - start) / 1000.0;
printf("Total execution time %.2f s\n", time);

getchar();

return 0;
}
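
A side note on the timing in the C++ listing: it prints raw clock() tick differences as milliseconds. That matches only where CLOCKS_PER_SEC is 1000, which is the case for the Microsoft C runtime but not everywhere. A minimal sketch of a portable conversion follows; the helper name ticks_to_ms is hypothetical and not part of the original program.

```cpp
#include <ctime>

// Hypothetical helper (not in the original listing): convert a clock()
// tick difference to milliseconds. CLOCKS_PER_SEC is implementation-defined,
// so raw ticks equal milliseconds only when it happens to be 1000.
double ticks_to_ms(std::clock_t begin, std::clock_t end)
{
    return 1000.0 * static_cast<double>(end - begin) / CLOCKS_PER_SEC;
}
```

With this, the accumulation in each loop would read time += ticks_to_ms(tmp1, tmp2); instead of time += tmp2 - tmp1;.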

Ingmar · Aug 27, 2003

Before anyone delves into the lengthy code I posted earlier, it seems
we have resolved the issue.

It turns out that the C++ compiler has recognized the function calls
to be redundant, and has therefore decided not to execute them. This
has been corrected, and a much more reasonable average factor of 1.6
was found (the C# function calls take approximately 1.6 times longer
than C++ function calls) averaged over the functions we considered.

For anyone interested, the results were as follows. Again, we
evaluated the functions 1 million times each, calculated the CPU time
required, and averaged this over 10 goes. The ratios of CPU times
(C# time divided by C++ time) were

Tan: 1.309
Sin: 1.064
Cos: 1.056
Exp: 1.578
Log: 2.057
multiplication: 2.502
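
As a quick check of the quoted average, the six ratios above sum to 9.566, and 9.566 / 6 is roughly 1.594, which rounds to the stated factor of 1.6. A minimal sketch of that arithmetic, assuming a plain unweighted mean (the names mean and reported_ratios are illustrative only):

```cpp
#include <numeric>
#include <vector>

// Unweighted arithmetic mean. The caller must pass a non-empty vector;
// an empty one would divide by zero.
double mean(const std::vector<double>& xs)
{
    return std::accumulate(xs.begin(), xs.end(), 0.0) / xs.size();
}

// The six C#/C++ CPU-time ratios listed in the post:
// Tan, Sin, Cos, Exp, Log, multiplication.
const std::vector<double> reported_ratios = {
    1.309, 1.064, 1.056, 1.578, 2.057, 2.502
};
```

Calling mean(reported_ratios) gives a value just under 1.6.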

Peter N Roth · Aug 27, 2003

Wow that is very counterintuitive...

Since theta is changing at each step, and since
tan is (usually) calculated as a series sum or
as the result of a Newton's approximation,
how does the 'for' loop optimize away that
calculation?

That is more than magic...
--
Grace + Peace,
Peter N Roth
Engineering Objects International
http://engineeringobjects.com

Willy Denoyette said:
What you see here is just a piece of C++ optimizer magic.
Try to run the same with all math calls commented out

for (theta = 0.0; theta < twoPI; theta += step)
{
// tmp = tan(theta);
}

and you will see the results are the same, so what you are measuring is
simply the time taken by the for loop evaluation logic.

Jon Skeet · Aug 27, 2003

Peter N Roth said:
Wow that is very counterintuitive...

Since theta is changing at each step, and since
tan is (usually) calculated as a series sum or
as the result of a Newton's approximation,
how does the 'for' loop optimize away that
calculation?

That is more than magic...

Not really - all it's got to know is that tan isn't going to have any
side effects, and the return value is never used - it's therefore a
no-op, and can be removed.
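
One common way to keep an optimizer from dropping the calls is to make every result contribute to a value the program actually uses. The following is a minimal sketch of that idea, not the corrected code Ingmar ran (which was not posted); RunResult and time_function are hypothetical names, and std::chrono::steady_clock is swapped in for clock().

```cpp
#include <chrono>
#include <cmath>

// Elapsed time of one run plus a checksum of every value computed,
// so the function calls feed into an observable result.
struct RunResult {
    double milliseconds;
    double checksum;
};

// Time `steps` evaluations of f at evenly spaced points in [lo, hi).
template <typename F>
RunResult time_function(F f, double lo, double hi, int steps)
{
    const double step = (hi - lo) / steps;
    double checksum = 0.0;
    const auto begin = std::chrono::steady_clock::now();
    for (int i = 0; i < steps; ++i) {
        checksum += f(lo + i * step);  // result is accumulated, not discarded
    }
    const auto end = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = end - begin;
    return {elapsed.count(), checksum};
}
```

A call such as time_function([](double x) { return std::tan(x); }, 0.0, 6.283185307179586, 1000000) would then time the tangent loop. The caller should print or otherwise use the returned checksum; if it is ignored, the compiler may again be free to treat the work as dead code.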

Ingmar · Aug 28, 2003

Willy Denoyette said:
What you see here is just a piece of C++ optimizer magic.
Try to run the same with all math calls commented out

for (theta = 0.0; theta < twoPI; theta += step)
{
// tmp = tan(theta);
}

and you will see the results are the same, so what you are measuring is
simply the time taken by the for loop evaluation logic.
Note that the C# compiler/.NET Jitter can play the same tricks....
Willy.

Thanks for that, well spotted.
We'd corrected the problem and found a more reasonable average speed
difference factor of 1.6 (haven't seen the newsgroup posting yet).

Willy Denoyette [MVP] · Aug 28, 2003

Ingmar wrote:
||| What you see here is just a piece of C++ optimizer magic.
||| Try to run the same with all math calls commented out
|||
||| for (theta = 0.0; theta < twoPI; theta += step)
||| {
||| // tmp = tan(theta);
||| }
|||
||| and you will see the results are the same, so what you are
||| measuring is simply the time taken by the for loop evaluation
||| logic. Note that the C# compiler/.NET Jitter can play the same
||| tricks.... :-)
||| Willy.
||
|| Thanks for that, well spotted.
|| We'd corrected the problem and found a more reasonable average speed
|| difference factor of 1.6 (haven't seen the newsgroup posting yet).

Would you mind posting your code?
I've done the same test, and some iterations tend to be a little faster in C#, while others are a little slower.

Willy.
