"Martin Brown" <|||newspam|||@nezumi.demon.co.uk> wrote:

> provided that the compiler doesn't get too clever
__Of course__, all of my comments assume that compile-time optimizations are avoided.

That goes without saying, if our intent is to test the FDIV defect or the effect of different precision modes (_PC_64, _PC_53, _PC_24).

(Although it was useful to Lynn that you discovered that his compiler is optimizing the code.)

"Martin Brown" <|||newspam|||@nezumi.demon.co.uk> wrote:

> On 05/04/2012 21:06, joeu2004 wrote [in response to Lynn]:
>> Your original test design was incorrect;
>> it worked only by coincidence.
>
> Strictly it worked OK in a standard Fortran REAL*8
> aka 53bit mantissa environment [...]. The store to
> an 8 byte real is harmless there.
>
> However, everything breaks if the arithmetic is 64bit
> mantissa but only 53 bits are stored and the result ends up
> different in the least significant bit.
When I said that Lynn's original chptst implementation was incorrect, I was referring to the rounding of the __intermediate__ result (only the division) to 64-bit representation (53-bit mantissa).

For the FDIV defect, if x is 4195835 and y is 3145727, then (x/y)*y-x is exactly zero whether all calculation is done __consistently__ using 64-bit or using 80-bit representation (the latter with a 64-bit mantissa). Of course, the final result is stored with 64-bit representation.

(And again, __of course__ I am only talking about the case when compile-time optimizations are avoided.)

But instead of a single expression, Lynn[*] implemented this as div = x/y, then chptst = div*y-x.

That worked (i.e. chptst is exactly zero) for him in the past only because apparently he was using 64-bit representation __consistently__.

It failed to work (i.e. chptst is not exactly zero) when Excel VBA used 80-bit representation, but only because Lynn was not using 80-bit representation __consistently__.

Indeed, it is the store into div, with only 53-bit precision, that causes the problem. It is not "harmless" in this particular case.

That is what I meant when I said it worked (in the past) only by coincidence. The "coincidence" was that he was using __consistent__ precision in the past, namely 64-bit representation (53-bit precision).

"Martin Brown" <|||newspam|||@nezumi.demon.co.uk> wrote:

> This should not matter at all unless your algorithm

> is extremely sensitive to roundoff - you still have

> almost 16 good decimal digits.
It is not clear to me what your point is.

But in a previous response, I alluded to the fact that __in general__, we cannot expect (x/y)*y-x to be exactly zero, even when we use __consistent__ precision for the calculation.

(And again, __of course__ I am only talking about the case when compile-time optimizations are avoided.)

Here are some random examples, all using x and y with the same absolute and relative magnitudes as Nicely's example for the FDIV defect. By that, I mean: x and y are 7-digit numbers; and 1 < x/y < 2.

80bit   64bit   mixMode   x         y
 =0      =0      <>0      4195835   3145727   (FDIV bug)
 =0      =0      =0       1300666   1233646
 =0      <>0     <>0      1695563   1538366
 =0      <>0     =0       none
 <>0     =0      <>0      1923832   1204810
 <>0     =0      =0       none
 <>0     <>0     <>0      1867447   1462980
 <>0     <>0     =0       none

-----
[*] I don't know if Lynn invented the div/chptst implementation or copied it from some reference. When I asked him for a reference, he pointed me only to the wiki FDIV bug page, http://en.wikipedia.org/wiki/Pentium_FDIV_bug.

I do not see Lynn's div/chptst implementation on that page or on any referenced page.

On the contrary, I do find Nicely's original pentbug.c file, which contains a __consistent__ implementation using 64-bit representation. Excerpted:

    double lfNum1=4195835.0, lfNum2=4195835.0, lfDenom1=3145727.0,
           lfDenom2=3145727.0, lfQuot, lfProd, lfZero;

    /* The duplicate variables are used in an effort to foil compiler
       optimization and compile-time evaluation of numeric literals. */

    lfQuot=lfNum1/lfDenom1;
    lfProd=lfDenom2*lfQuot;
    lfZero=lfNum2 - lfProd;

Aside #1.... Note that Nicely went out of his way to avoid compile-time optimizations. Or so he hoped ;-).

If Lynn had followed Nicely's implementation correctly, he wouldn't have had any problem with the FDIV constants despite the change in precision mode between Watcom languages and Excel VBA, and despite the Watcom compile-time optimizations.

Aside #2.... IMHO, Nicely was wrong to use 64-bit representation to demonstrate the FDIV bug. Luckily, it worked, but only because the FDIV error was so large, about 8E-5. He might have overlooked more subtle errors due to rounding the intermediate 80-bit results to 64-bit representation.