On Fri, 13 Aug 2004 11:03:06 +0200, Grumble <(E-Mail Removed)> wrote:
>
>A few weeks ago, AMD published the SPECint2000 score for the FX-53:
>http://www.spec.org/cpu2000/results/...628-03181.html
>
>SPECint2000_peak = 1700
>SPECint2000_base = 1601
>
>I see that they used Intel's compiler on Windows XP Professional. Please
>correct me if I am wrong. Windows XP is a 32-bit OS, thus the benchmarks
>did not use the 8 additional general purpose registers defined in the
>x86-64 instruction set, right?
That is correct.
>I imagine that, even with 8 more registers available, gcc cannot
>outperform Intel's compiler and Microsoft libraries on integer code?
Correct again. The optimizations in GCC are not as good as those in
Intel's compiler, though the difference is generally not huge. Take a
look at the results AMD published for their 'A4800' systems. These
are a bunch of Opteron 144 (1.8GHz) processors running under a variety
of different OSes and using different compilers. The fastest result
they achieved was 1095, using Win2K3 (32-bit OS) + Intel's (32-bit)
compiler. For comparison, with SuSE 8 for AMD64 (64-bit OS) + GCC 3.3
(64-bit) they managed 1045, and with SuSE 8 for x86 (32-bit OS) + GCC
3.3 for x86 (32-bit compiler) they turned in a score of 960.
So, in the end AMD showed an improvement of nearly 9% by going from
32-bit to 64-bit code (both Linux + GCC), but they saw a 14%
improvement going from Linux + GCC (32-bit) to Windows + Intel C
(also 32-bit).
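For anyone who wants to check the arithmetic, here is a quick Python
sketch. The three scores are the ones quoted above; the dictionary keys
and the function name are just my own labels. It works out to roughly
8.85% for the 32-bit to 64-bit GCC step and roughly 14.06% for the
GCC to Intel C step.

```python
# SPECint2000 scores quoted above for the Opteron 144 'A4800' systems.
# The keys are hypothetical labels, not names used by AMD or SPEC.
scores = {
    "linux32_gcc": 960,    # SuSE 8 for x86 + GCC 3.3 (32-bit)
    "linux64_gcc": 1045,   # SuSE 8 for AMD64 + GCC 3.3 (64-bit)
    "win32_icc": 1095,     # Win2K3 + Intel compiler (32-bit)
}


def percent_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Both comparisons use the 32-bit Linux + GCC score as the baseline.
gain_64bit = percent_gain(scores["linux64_gcc"], scores["linux32_gcc"])
gain_icc = percent_gain(scores["win32_icc"], scores["linux32_gcc"])

print(f"32-bit GCC -> 64-bit GCC:     {gain_64bit:.2f}%")
print(f"32-bit GCC -> 32-bit Intel C: {gain_icc:.2f}%")
```

Note that both numbers are relative to the same 960 baseline, so they
are comparable with each other but should not be added together.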
>I also noticed Sun's recent SPECfp2000 submission for the Opteron 150:
>http://www.spec.org/cpu2000/results/...712-03241.html
>
>SPECfp2000_peak = 1787
>SPECfp2000_base = 1637
>
>Sun did use a 64-bit OS, and it seems they compiled most benchmarks as
>64-bit applications. I imagine the compiler (most often PathScale)
>produced SIMD code to use the XMM registers?
Presumably yes, it would use SIMD code, the XMM registers and the
extra 8 integer registers (even with FP code you still need some
integer registers).
>In short, I am wondering how much improvement the 8 additional GPRs and
>8 additional media registers bring...
Usually more than enough to make up for the performance loss you would
expect with 64-bit code. Normally, if all else is equal, 64-bit code
is about 5-10% slower than 32-bit code until you blow your memory
limits, at which point 32-bit code just completely breaks down.
That's why most bi-arch systems still use lots of 32-bit applications
if they can, e.g. Sun's Solaris.
With AMD64 the extra registers have managed to improve the performance
enough that they not only negate this performance loss, but turn it
into a 5-10% performance gain on average. Not bad at all for a fairly
small cost in die space and virtually no changes to the instruction
set. FWIW the reason why AMD only went to 16 registers (still a
pretty low number compared to most modern processors) is that this
is the most that they could squeeze into the x86 instruction set
without making fairly major changes. They did a pretty damn good job
of this; obviously they actually put some thought into how to extend
x86 to 64 bits as naturally as possible.
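To make the 16-register limit a bit more concrete: as I understand the
AMD64 encoding, the existing ModRM byte has 3-bit register fields
(hence 8 registers), and the new REX prefix byte supplies one extra bit
per field, which gives 4 bits, i.e. 16 registers. Here is a rough
Python sketch that pulls the two register numbers out of a single
register-to-register instruction. The function name is my own, and the
example bytes 4D 89 C8 should correspond to "mov r8, r9".

```python
def decode_registers(rex, modrm):
    """Return (reg, rm) register numbers from a REX prefix and ModRM byte.

    rex   -- REX prefix byte, bit layout 0100WRXB
    modrm -- ModRM byte, bit layout mod(2) reg(3) rm(3)

    Only handles the register-direct form (mod == 11); memory operands
    would also involve REX.X and a SIB byte, which this sketch ignores.
    """
    if rex >> 4 != 0b0100:
        raise ValueError("not a REX prefix")
    if modrm >> 6 != 0b11:
        raise ValueError("only register-direct ModRM is handled here")

    rex_r = (rex >> 2) & 1   # fourth bit for the ModRM.reg field
    rex_b = rex & 1          # fourth bit for the ModRM.rm field

    # 1 bit from REX + 3 bits from ModRM = 4 bits = 16 possible registers.
    reg = (rex_r << 3) | ((modrm >> 3) & 0b111)
    rm = (rex_b << 3) | (modrm & 0b111)
    return reg, rm


# REX = 0x4D, opcode 0x89 (not needed for this decode), ModRM = 0xC8.
reg, rm = decode_registers(0x4D, 0xC8)
print(f"reg field -> r{reg}, rm field -> r{rm}")
```

Going past 16 would need a fifth bit per register field, and there is
no room for that in a one-byte prefix that also has to carry the
operand-size bit.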
-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca