What does Excel's RAND function really do

G

Guest

I am trying to match a clients spreadsheet where they are using RAND() to
generate values for the cumulative inverse standard normal distribution.
According to Excel's built in documentation RAND returns values greater than
or equal to zero and less than one. If this was true I believe I should
eventually see a random number equal to zero be returned. If that happened
then the function NORMSINV(0) would return a value of #NUM!. This has never
happened after many millions of trials. This leads me to believe that RAND is
actually returning values between 0 and 1. Has anyone seen an Excel
documentation that confirms this is the case?

Thanks
Dave

J

JE McGimpsey

Even a few billion trials wouldn't produce more than the tiniest
fraction of the possible values that *XL* can produce between 1E-307 and
0.999999999999999 (plus zero), so I'm not sure that extrapolating from
experience is a reliable guide...

That doesn't mean that the RAND() function can produce all of those
values, of course. I've never seen any documentation of the exact
algorithm (and ever were I to see it, I tend to doubt that I could prove
whether it was inclusive of zero or not).

B

Bernie Deitrick

Using 10,000 cells with RAND() fucntion calls and calcing my worksheet 10,000 times (100,000,000
function calls), the largest and smallest values I recorded were:

0.999999990967390
0.00000000796789922929975

HTH,
Bernie
MS Excel MVP

J

joeu2004

According to Excel's built in documentation RAND returns values greater than
or equal to zero and less than one. If this was true I believe I should
eventually see a random number equal to zero be returned.

Googling for "Excel RAND algorithm" (without quotes) found the
following explanation: http://support.microsoft.com/kb/828795 .

Arguably, the MS KB description is not dispositive because there is a
material error -- at least I presume so. It states that the algorithm
returns a random number "on [0,1]" (sic). Of course, the required
result is on [0,1), not [0,1].

On the other hand, if you look at the Fortran(!) implementation, we
see that the result is mod 1.0. That should not return 1.0. So
presumably the KB notation is simply a typo -- or written by someone
who is unfamiliar with the correct mathematical syntax.

Note that the KB article states that the RAND algorithm is capable of
returning "more than 10^13" pseudo-random numbers before repeating.
That is far fewer than the number of double floating-point values in
the range [0,1). I have not analyzed the algorithm to see if it can
return any number (or at least a large number) of sets of "more than
10^13" PRNs. Assuming that it can, as to whether or not zero is
included in any one set of 10^13 PRNs, that might depend, at least, on
how the PRNG is seeded.

This KB article does not explain that. I think there is another KB
article that does; but I am not taking the time now to track it down,
if indeed it exists. Typically, software-based PRNGs are seeded by
some manipulation of the time of day measured to the resolution
supported by the operating system.

HTH.

G

Guest

The original fortran code is provided at
http://lib.stat.cmu.edu/apstat/183
In the original article, Wichmann & Hill state "The algorithm produces
numbers rectangularly distributed between 0 and 1, excluding the end points
but, on some machines, rounding errors might very occasionally produce a
value of 0 precisely." Mathematically it is equivalent to a multiplicative
generator with modulus 27,817,185,604,309 (hence 0 is excluded).

The underlying mechanism produces an integer sequence and then scales it to
U(0,1), so such rounding errors (if present) would only impact the single
value, not the sequence. Wichmann and Hill claim that the integer sequence
has a cycle length of nearly 7E12, so observing a 0 (if the Excel
implementation has that problem, which I do not know) would be extremely rare.

There was a bug in Excel 2003's original implementation that would
frequently return negative numbers, so be sure to use the latest service
patch.

Diehard is no longer the standard battery of tests for random number
generation. Newer tests do find non-random patterns from this generator,
http://docs.python.org/lib/module-random.html
but it is certainly a marked improvement over the random number generator in
previous versions of Excel.

Jerry

joeu2004 said:
According to Excel's built in documentation RAND returns values greater than
or equal to zero and less than one. If this was true I believe I should
eventually see a random number equal to zero be returned.

Googling for "Excel RAND algorithm" (without quotes) found the
following explanation: http://support.microsoft.com/kb/828795 .

Arguably, the MS KB description is not dispositive because there is a
material error -- at least I presume so. It states that the algorithm
returns a random number "on [0,1]" (sic). Of course, the required
result is on [0,1), not [0,1].

On the other hand, if you look at the Fortran(!) implementation, we
see that the result is mod 1.0. That should not return 1.0. So
presumably the KB notation is simply a typo -- or written by someone
who is unfamiliar with the correct mathematical syntax.

Note that the KB article states that the RAND algorithm is capable of
returning "more than 10^13" pseudo-random numbers before repeating.
That is far fewer than the number of double floating-point values in
the range [0,1). I have not analyzed the algorithm to see if it can
return any number (or at least a large number) of sets of "more than
10^13" PRNs. Assuming that it can, as to whether or not zero is
included in any one set of 10^13 PRNs, that might depend, at least, on
how the PRNG is seeded.

This KB article does not explain that. I think there is another KB
article that does; but I am not taking the time now to track it down,
if indeed it exists. Typically, software-based PRNGs are seeded by
some manipulation of the time of day measured to the resolution
supported by the operating system.

HTH.