VC++ 7.1 optimizer bug

William M. Miller

We've recently encountered a bug in the optimization of
floating point computations inside loops. To summarize, the
optimizer reorders floating point operations in a fashion that
is not permitted by the C++ Standard; even worse, when the
optimizer unrolls a loop, the operations are ordered
differently in different iterations. (This occurs in version
13.10.3077.)

Here's a sample program simplified from production code:

#pragma optimize("g",on)

int main(void) {
    extern double hB[];
    double x[4];
    double f0[4];
    double f1[4];
    double f2[4];

    int i;
    for (int p = 0; p < 1; p++)
    {
        for (i = 0; i < 4; i++) {
            x[i] = f0[i]*hB[0] + f1[i]*hB[1] + f2[i]*hB[2];
        }
        if ( (x[0] != x[2]) || (x[1] != x[3])) {
            return 1;
        }
    }
    return 0;
}

The problem appears in the computation of x. Here's the assembly
code for that loop:

; iteration i = 0
fld QWORD PTR ?hB@@3PANA
xor ecx, ecx
fmul QWORD PTR _f0$[ebp]
fld QWORD PTR ?hB@@3PANA+8
fmul QWORD PTR _f1$[ebp]
faddp ST(1), ST(0)
fld QWORD PTR ?hB@@3PANA+16
fmul QWORD PTR _f2$[ebp]
faddp ST(1), ST(0)

; iteration i = 1
fld QWORD PTR ?hB@@3PANA
fmul QWORD PTR _f0$[ebp+8]
fld QWORD PTR ?hB@@3PANA+8
fmul QWORD PTR _f1$[ebp+8]
faddp ST(1), ST(0)
fld QWORD PTR ?hB@@3PANA+16
fmul QWORD PTR _f2$[ebp+8]
faddp ST(1), ST(0)

; iteration i = 2
fld QWORD PTR ?hB@@3PANA
fmul QWORD PTR _f0$[ebp+16]
fld QWORD PTR ?hB@@3PANA+8
fmul QWORD PTR _f1$[ebp+16]
faddp ST(1), ST(0)
fld QWORD PTR ?hB@@3PANA+16
fmul QWORD PTR _f2$[ebp+16]
faddp ST(1), ST(0)

; iteration i = 3 (operands loaded in the opposite order)
fld QWORD PTR _f2$[ebp+24]
fmul QWORD PTR ?hB@@3PANA+16
fld QWORD PTR _f1$[ebp+24]
fmul QWORD PTR ?hB@@3PANA+8
faddp ST(1), ST(0)
fld QWORD PTR _f0$[ebp+24]
fmul QWORD PTR ?hB@@3PANA
faddp ST(1), ST(0)

Note that for the first three iterations of the unrolled loop,
the sum "f0[i]*hB[0] + f1[i]*hB[1]" is computed first, and then
"f2[i]*hB[2]" is added. In the fourth iteration, however, the
sum "f2[i]*hB[2] + f1[i]*hB[1]" is computed first, and then
"f0[i]*hB[0]" is added.

Because of the effects of rounding, floating point calculations
are sensitive to the order in which operations are performed;
as noted in the C89 Standard, on which the C++ Standard was
based, floating point addition and multiplication are not
associative operations. The C++ Standard (1.9 para 15) says
that "operators can be regrouped according to the usual
mathematical rules only where the operators really are
associative or commutative."

Clause 5 para 4 says, "Except where noted, the order of
evaluation of operands of individual operators and
subexpressions of individual expressions and the order in
which side effects take place, is unspecified." However, that
doesn't apply to something like "a + b + c", as here. Because
additive operators group left-to-right (5.7 para 1), the meaning
of "a + b + c" is "(a + b) + c", so "b + c" is _not_ a
subexpression of "a + b + c" -- no reordering of the "+"
operators is permitted. The violation of this constraint is
especially pernicious when it's inconsistently applied across
unrolled loop iterations.

Interestingly, if the implicit grouping of the operators is made
explicit by adding parentheses, the optimizer does not reorder
the expression, even though there is no semantic difference
between "a + b + c" and "(a + b) + c".

-- William M. Miller
The MathWorks, Inc.
 
William M. Miller

William M. Miller said:
We've recently encountered a bug in the optimization of
floating point computations inside loops. To summarize, the
optimizer reorders floating point operations in a fashion that
is not permitted by the C++ Standard; even worse, when the
optimizer unrolls a loop, the operations are ordered
differently in different iterations. (This occurs in version
13.10.3077.)

So, is this a known bug? Is it still present in the Whidbey
alpha release? If so, could one of the MVPs report it to
Microsoft? Or should I do so myself?

Thanks.

-- William M. Miller
The MathWorks
 
