Comparing release builds of vc6 & vc7

Brian

Hi,

I have been trying to tune my vc7-compiled application to run at the same
speed as, or (preferably) faster than, the same application built with vc6.
Both versions of my code are compiled with optimization, but the vc7 build
is quite slow in comparison to the vc6 one.

My timing test took roughly 71 seconds for vc6 and 103 seconds for vc7.
After 13 seconds of run time, the total number of calculations was roughly
191 million for vc6 and 89 million for vc7.

I'm a bit confused at the results. Shouldn't the two compilers be at
least comparable?

If it matters, I'm using Visual Studio .Net Enterprise Architect 2002.

I've provided the test code, compiler options, and linker options used for
both compilers.

Compiler options for vc6:
/nologo /ML /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS"
/Fp"Release/VSTest.pch" /Yu"stdafx.h" /Fo"Release/" /Fd"Release/" /FD /c

Linker (vc6):
kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib
shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib
shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
/nologo /subsystem:console /incremental:no /pdb:"Release/VSTest.pdb"
/machine:I386 /out:"Release/VSTest.exe"


Compiler options for vc7:
/O2 /Ob1 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /FD /EHsc /MT
/Yu"stdafx.h" /Fp".\Release/VSTest.pch" /Fo".\Release/" /Fd".\Release/"
/W3 /nologo /c /TP

Linker (vc7):
/OUT:".\Release/VSTest2b.exe" /INCREMENTAL:NO /NOLOGO
/PDB:".\Release/VSTest.pdb" /SUBSYSTEM:CONSOLE odbc32.lib
odbccp32.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib
uuid.lib odbc32.lib odbccp32.lib /MACHINE:I386


Test source code (console application):

// VSTest.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"


// comment out the next line to perform a looping test.
#define __USETHREAD

#include "windows.h"
#include <math.h>
#include <stdlib.h> // srand, rand

#include "stdio.h"
#include "string.h"

DWORD g_TotalCalculations = 0;


DWORD __stdcall TestThreadFunction(void *ptr)
{
int num = 0;
int numSeconds = 0; // the thread wakes once per second, so this counts seconds

while (1)
{
Sleep(1000);

num = g_TotalCalculations;
numSeconds++;

printf("%d seconds and %d calculations\n", numSeconds, num);
}
}

int main(int argc, char* argv[])
{
int i = 0;

DWORD dwStart = GetTickCount();
srand(dwStart);

DWORD threadId = 0;


#ifdef __USETHREAD

// create a test thread
CreateThread(NULL, 0, TestThreadFunction, 0, 0, &threadId);

while (1)
#else

// set up a large for loop
for (i=0; i<1000000; i++)

#endif
{
double percentDone = (double)i / 1000000.0 * 100.0;

#ifndef __USETHREAD

// if no thread is used, set up a smaller inner loop.
for (int j=0; j<1000; j++)
{

#endif

// regardless of the method (threaded or looping)
// perform some basic math tests.
double a = 0.0;

a = cos((double)(rand() % 360)) * sin((double)(rand() % 360));
a += ((double)(rand() % 6000) / ((double)(rand() % 3000) + 1));

#ifndef __USETHREAD

} // small inner for loop

#endif

// increment the total number of calculation passes
// this is printed out by the separate thread.
g_TotalCalculations++;

#ifndef __USETHREAD

// More or less the next lines just let the user know that
// something really is happening.

int b = (int)(percentDone * 10000.0);

if (!(b % 10000))
{
// this isn't always called the way that I'd like,
// but that's not the issue.
printf("completed... (%0.2f%%)\r", percentDone);
}

#endif
}

printf("\n\nCompleted test in %lu ms.\n", (unsigned long)(GetTickCount() - dwStart));

return 0;
}
 
Stephan Schaem

Can you try to force the /fp:fast compile flag and see what result you get?

Stephan
 
Brian

This compile flag isn't recognized by my compiler (cl version 13.00.9466).
cl : Command line warning D4002 : ignoring unknown option '/fp:fast'

The closest I can find is '/Fp', but that is used for specifying a
precompiled header. Is this something that I need to add as a pragma?
I've searched MSDN, but I'm not seeing this option. :(

Thanks,
Brian
 
Ronald Laeremans [MSFT]

It is a new option for the Whidbey version that is under development.

Ronald Laeremans
Visual C++ team
 
Brian

Hi Ronald,

Do you have any ideas/suggestions as to why there is such a discrepancy
between the run times? I'd hoped that maybe the newer compiler 13.10.3077
would have provided better results, but they're still not as good as VC6.
In fact, they aren't too far off from the 13.00.9466 version of the
compiler.

I'm trying to justify the upgrade to VS .Net to my managers, but unless I
can get similar if not superior results, they're going to just stick with
v6.

Thanks for your help,
Brian

 
Brian

It's all in the switches.... My tests, as it turns out, were not fair
tests. I had inadvertently left the compiler settings for use with
single-threaded runtime (/ML) in the release options for VC6 and
multithreaded runtime (/MT) in the release options for VC7.

When I set them both to /MT, the results were what I'd expect them to be.
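
One way to catch this kind of mismatch early is to have each build report which C runtime it was compiled against. The sketch below is not part of the original test. It relies on the _MT and _DLL macros that the Microsoft compiler predefines for its multithreaded and DLL runtime options, and runtimeFlavor and printRuntimeFlavor are hypothetical helper names.

```cpp
#include <cstdio>

// Hypothetical helper: reports the C runtime variant selected at compile time.
// cl predefines _MT for /MT and /MD (and their debug variants) and _DLL for /MD.
// With any other compiler neither macro is normally defined, so the last
// branch is taken.
const char* runtimeFlavor()
{
#if defined(_MT) && defined(_DLL)
    return "multithreaded DLL runtime (/MD)";
#elif defined(_MT)
    return "multithreaded static runtime (/MT)";
#else
    return "single-threaded runtime (/ML), or not built with cl";
#endif
}

// Print the variant once at startup so that every timing log records it.
void printRuntimeFlavor()
{
    std::printf("C runtime: %s\n", runtimeFlavor());
}
```

Calling printRuntimeFlavor() at the top of main() in both builds should make a /ML versus /MT difference visible in the output before any timings are compared.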

Brian
 
Carl Daniel [VC++ MVP]

Brian said:
It's all in the switches.... My tests, as it turns out, were not fair
tests. I had inadvertently left the compiler settings for use with
single-threaded runtime (/ML) in the release options for VC6 and
multithreaded runtime (/MT) in the release options for VC7.

When I set them both to /MT, the results were what I'd expect them to
be.

The difference in performance probably comes from the use of rand() in your
code. rand() maintains per-thread state when the multithreaded runtime
library is used, while its state is simply a global static variable when the
single-threaded library is used.

-cd
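
To compare the code the two compilers generate rather than the two runtime libraries, the random numbers can come from a generator whose state is an ordinary local object. The following is a minimal sketch under that assumption, not the original test: LocalRand and runCalculations are hypothetical names, and the multiplier and increment are just one common choice of linear congruential constants.

```cpp
#include <cmath>

// Hypothetical stand-in for rand(): a small linear congruential generator.
// Its state lives in the object itself, so no runtime-library state
// (global or per-thread) is touched when a number is drawn.
struct LocalRand
{
    unsigned int state;

    explicit LocalRand(unsigned int seed) : state(seed) {}

    // Returns a value in the range 0..32767.
    int next()
    {
        state = state * 214013u + 2531011u; // unsigned arithmetic wraps around
        return static_cast<int>((state >> 16) & 0x7fffu);
    }
};

// Runs the same arithmetic as the test loop for a fixed number of passes.
// The results are summed and returned so the optimizer cannot discard the work.
double runCalculations(unsigned int seed, unsigned long passes)
{
    LocalRand rng(seed);
    double sum = 0.0;

    for (unsigned long n = 0; n < passes; ++n)
    {
        // Draw the four values in a fixed order so every compiler
        // consumes the sequence the same way.
        const int r1 = rng.next() % 360;
        const int r2 = rng.next() % 360;
        const int r3 = rng.next() % 6000;
        const int r4 = rng.next() % 3000;

        double a = std::cos(static_cast<double>(r1)) * std::sin(static_cast<double>(r2));
        a += static_cast<double>(r3) / (static_cast<double>(r4) + 1.0);
        sum += a;
    }
    return sum;
}
```

Timing a call such as runCalculations(1u, 100000000UL) in each build keeps the workload fixed and leaves rand() out of the measurement, so any remaining gap is more likely to reflect the compilers themselves.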
 
