Adding Random Numbers

G

Gary''s Student

I need to add up 23 independent random numbers. My first try was:

=RAND()*23

This proved to be insufficient since it really only added the same random
value 23 times. I can get a correct result with a helper column or with a
formula like:

=RAND()+RAND()+RAND()+RAND()+RAND()+RAND()………….

Is there a more compact array formula, or some form of SUMPRODUCT, that can
get the same result?
 
T

T. Valko

Here's a long shot:

Array entered:

=SUM(RAND()*(ROW(INDIRECT("1:23"))/100))

The RAND's really aren't rand since you're manipulating them as such:
*(ROW(INDIRECT("1:23"))/100)

Like I said, long shot!
 
G

Gary''s Student

Thanks for the try, Biff. The output is still not quite right. I played
with the divisor and can change the distribution of results, but it still
remains flat.

I do appreciate the effort, however.
 
H

Harlan Grove

Gary''s Student said:
I need to add up 23 independent random numbers. My first try was:

=RAND()*23

This proved to be insufficient since it really only added the same
random value 23 times. . . .

There's no built-in way to do this. You could try using Longre's
MOREFUNC.XLL add-in, which provides a function named MRAND which you
could use in formulas like

=SUM(MRAND(23,0,65535))/65535
 
J

joeu2004

I need to add up 23 independent random numbers.

You are not very specific. From your example, it appears that you
want 23 random values between 0 and 1.

Is there a more compact array formula, or some form of SUMPRODUCT, that can
get the same result?

Not exactly the same, but I wonder if the following would satisfy your
needs.

=SUMPRODUCT(RANDBETWEEN(0,ROW(A65512:A65534))/ROW(A65513:A65535))

With Excel 2007 (I have Excel 2003), you should be able to substitute
65535 (and its derivatives; e.g., 65513 is 65535 - 22) with a larger
number. That should improve the random characteristics somewhat.


----- original posting -----
 
J

joeu2004

You could try using Longre's MOREFUNC.XLL add-in, which
provides a function named MRAND which you could use in
formulas like
=SUM(MRAND(23,0,65535))/65535

Interesting. I'm not familiar with this function, but I wonder if it
is the same idea as the formula that I suggested using RANDBETWEEN.
Talk about reinventing the wheel!
 
G

Gary''s Student

Thanks once again Harlan. I will try your suggestion. In the mean time I am
using:

=RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()

I will continue to look for a "better" formula.

b.t.w:

I don’t know many people who value quality over expediency. You,
fortunately, are one of them.
 
H

Harlan Grove

Gary''s Student said:
. . . In the mean time I am using:

=RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()
+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()
+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()

I will continue to look for a "better" formula.
....

Actually this isn't such a bad formula. And it's not all that
difficult to produce.

Type the following formula WITHOUT PRESSING ENTER.

=SUBSTITUTE(REPT("+RAND()",23),"+","=",1)

Press [F9] then press [Enter].
 
T

T. Valko

Excel evaluates RAND() only once.

Yeah, that's why I prefaced with:
Here's a long shot

This is pretty slick:
=SUBSTITUTE(REPT("+RAND()",23),"+","=",1)
Press [F9] then press [Enter].

I briefly thought about something along those lines but didn't actually try
it.
 
G

Gary''s Student

I tried your formula. The values are random, but the distribution is
uniform. That is, they are uniformly spread between 0 and 23. About the same
number of values occur between 3 and 4 as occur between 7 and 8, etc.

Adding a bunch of RAND() together seems to give a normal distribution.
 
J

joeu2004

I tried your formula. The values are random, but the distribution is uniform.

A fact that violates the Central Limit Theorem. But my observation
agrees with yours. Based on that, I would conclude that the 23 values
in the SUMPRODUCT argument are not independent and/or not random.

Aha! The Excel 2003 Evaluate Formula feature under Formula Auditing
confirms my suspicion: RANDBETWEEN seems to be computed only once
because the argument ROW($A$65512:$A$65534) is returning a single
integer (65512), not the array {65512;65512;...;65534} that I
expected.

(BTW, note that the relative references in my previous response should
be absolute references to allow us to copy the formula into many cells
-- I used 1000 -- to analyze the distribution.)

Sigh, I often have trouble coercing Excel to generate arrays when I
expect them. I also have trouble getting array formulas to work
consistently, i.e. when I expect them to work. For example, entering
my original formula as an array formula does not work (for me), even
if I change SUMPRODUCT to SUM.

It would be great if some Excel guru could explain why my formula is
not behaving as I expect -- and more importantly, if there is
reasonable way to make the formula behave the way that I expect.

I do not expect it to behave as well as summing 23 RAND() calls
anyway. But you said you were looking for something more compact, and
you did not say if/how concerned you are about the distribution.
(Other than the fact that it should work, of course. I take that
granted.)

I'm glad you checked the distribution. I had intended to do that
myself, but I forgot.


----- original posting -----
 
G

Gary''s Student

In any case, thanks for your support on this issue. At least I have a
formula that works (even though it isn't very pretty). I will take a look
at BINOMDIST() or one of its relatives. Maybe simulating the sum of RANDs
with a formula that generates a value from a Normal or Binomial distribution
is "good enough".

Thanks again!
--
Gary''s Student - gsnu200791


joeu2004 said:
I tried your formula. The values are random, but the distribution is uniform.

A fact that violates the Central Limit Theorem. But my observation
agrees with yours. Based on that, I would conclude that the 23 values
in the SUMPRODUCT argument are not independent and/or not random.

Aha! The Excel 2003 Evaluate Formula feature under Formula Auditing
confirms my suspicion: RANDBETWEEN seems to be computed only once
because the argument ROW($A$65512:$A$65534) is returning a single
integer (65512), not the array {65512;65512;...;65534} that I
expected.

(BTW, note that the relative references in my previous response should
be absolute references to allow us to copy the formula into many cells
-- I used 1000 -- to analyze the distribution.)

Sigh, I often have trouble coercing Excel to generate arrays when I
expect them. I also have trouble getting array formulas to work
consistently, i.e. when I expect them to work. For example, entering
my original formula as an array formula does not work (for me), even
if I change SUMPRODUCT to SUM.

It would be great if some Excel guru could explain why my formula is
not behaving as I expect -- and more importantly, if there is
reasonable way to make the formula behave the way that I expect.

I do not expect it to behave as well as summing 23 RAND() calls
anyway. But you said you were looking for something more compact, and
you did not say if/how concerned you are about the distribution.
(Other than the fact that it should work, of course. I take that
granted.)

I'm glad you checked the distribution. I had intended to do that
myself, but I forgot.


----- original posting -----

I tried your formula. The values are random, but the distribution is
uniform. That is, they are uniformly spread between 0 and 23. About the same
number of values occur between 3 and 4 as occur between 7 and 8, etc.

Adding a bunch of RAND() together seems to give a normal distribution.
--
Gary''s Student - gsnu200791



joeu2004 said:
On Jun 13, 5:57 am, Gary''s Student
I need to add up 23 independent random numbers.
You are not very specific. From your example, it appears that you
want 23 random values between 0 and 1.
Is there a more compact array formula, or some form of SUMPRODUCT, that can
get the same result?
Not exactly the same, but I wonder if the following would satisfy your
needs.

With Excel 2007 (I have Excel 2003), you should be able to substitute
65535 (and its derivatives; e.g., 65513 is 65535 - 22) with a larger
number. That should improve the random characteristics somewhat.
----- original posting -----
On Jun 13, 5:57 am, Gary''s Student
I need to add up 23 independent random numbers. My first try was:

This proved to be insufficient since it really only added the same random
value 23 times. I can get a correct result with a helper column or with a
formula like:

Is there a more compact array formula, or some form of SUMPRODUCT, that can
get the same result?
 
J

joeu2004

Maybe simulating the sum of RANDs with a formula that
generates a value from a Normal or Binomial distribution
is "good enough".

The Central Limit Theory says that the sum of independent random
variables (all with the same distribution) is a normal distribution.

You never said why you are interested in the sum of 23 random values
between 0 and 1. For example, if it were to demonstrate the CLT, it
would not be appropriate to use the CLT. Otherwise, the following is
a convenient way to generate one sum of 23 random values between 0 and
1:

=norminv(rand(), 23/2, sqrt(23/12))

The theoretical mean is 23/2 (11.5), and the theoretical std dev is
sqrt(23/12) (about 1.38).

Indeed, when I generate 1000 samples using the rand()+...+rand()
formula (thanks to Harlan for providing an easy generating formula),
the mean is about 11.39 and the std dev is about 1.40.

And when I generate 1000 samples using the norminv(rand(),...)
formula, the mean is about 11.54 and the std dev is 1.37.

A graph of each 1000-sample set shows a "nearly normal" distribution.

Of course, YMMV. But it should be too different.


----- original posting -----
 
T

T. Valko

generate arrays when I expect them.
explain why my formula is not behaving as I expect --
and more importantly, if there is reasonable way to
make the formula behave the way that I expect.

ROW(...) is generating an array *but* RANDBETWEEN won't accept the array and
only takes the first element of the array for the argument.

So, you're getting:

RANDBETWEEN(0,65512)

As Harlan noted in his reply to me, RAND, and in this case RANDBETWEEN,
don't return an array. All we've done is take a single random number and
manipulate it 23 times resulting in 23 values that follow a linear trend.


--
Biff
Microsoft Excel MVP


I tried your formula. The values are random, but the distribution is
uniform.

A fact that violates the Central Limit Theorem. But my observation
agrees with yours. Based on that, I would conclude that the 23 values
in the SUMPRODUCT argument are not independent and/or not random.

Aha! The Excel 2003 Evaluate Formula feature under Formula Auditing
confirms my suspicion: RANDBETWEEN seems to be computed only once
because the argument ROW($A$65512:$A$65534) is returning a single
integer (65512), not the array {65512;65512;...;65534} that I
expected.

(BTW, note that the relative references in my previous response should
be absolute references to allow us to copy the formula into many cells
-- I used 1000 -- to analyze the distribution.)

Sigh, I often have trouble coercing Excel to generate arrays when I
expect them. I also have trouble getting array formulas to work
consistently, i.e. when I expect them to work. For example, entering
my original formula as an array formula does not work (for me), even
if I change SUMPRODUCT to SUM.

It would be great if some Excel guru could explain why my formula is
not behaving as I expect -- and more importantly, if there is
reasonable way to make the formula behave the way that I expect.

I do not expect it to behave as well as summing 23 RAND() calls
anyway. But you said you were looking for something more compact, and
you did not say if/how concerned you are about the distribution.
(Other than the fact that it should work, of course. I take that
granted.)

I'm glad you checked the distribution. I had intended to do that
myself, but I forgot.


----- original posting -----
 
J

joeu2004

ROW(...) is generating an array *but* RANDBETWEEN won't
accept the array and only takes the first element of the
array for the argument.

Oh, I think I understand. I thought array formulas caused Excel to
iterate over the array. That is, {RANDBETWEEN(0,ROW(A1:A4))} would
turn into
{RANDBETWEEN(0,1);RANDBETWEEN(0,2);RANDBETWEEN(0,3);RANDBETWEEN(0,4)}.
But you seem to be saying it translates into RANDBETWEEN(0,
{1;2;3;4}). Ergo, an array formula can be used only by functions that
accept arrays. Makes sense.

And some array formulas involve functions that return an array
result. Isn't that what's happening with FREQUENCY(A1:A10,B1:B10)?
That seems to be what you are alluding to below. But that was never
my expectation in my RANDBETWEEN usage.

As Harlan noted in his reply to me, RAND, and in this
case RANDBETWEEN, don't return an array.

The difference was that I passed a array to RANDBETWEEN, whereas you
simply called RAND(). You seemed to be expecting RAND()*ROW(A1:A3) to
translate into {RAND()*1;RAND()*2;RAND()*3}. I would have expected
RAND()*{1;2;3}. So Harlan's comment was no surprise to me: RAND() is
computed only once.

Oh well, water under the bridge. Thanks for the explanation. It
should help me in the future.
 
B

Bernd P

Hello,

IMHO with Excel's current precision limit of 15 significant digits we
can safely identify the sum of 23 rand()'s with your norminv approach.

We can create a series of random numbers, then maybe transform it to
the desired mean and stdev:
http://www.sulprobil.com/html/norminv.html

[Just discard the series if min < 0 or max > 23]

Regards,
Bernd
 
T

T. Valko

an array formula can be used only by functions that accept arrays.

In this application, yes, that's correct.
And some array formulas involve functions that return an array result.
Isn't that what's happening with FREQUENCY(A1:A10,B1:B10)?

Yes, that's correct. ROW() is another function that returns an array. Even
if it's simply ROW(A1) it's still a single element array.
The difference was that I passed a array to RANDBETWEEN,
whereas yousimply called RAND().

Yes, but as I noted, RANDBETWEEN won't take arrays as arguments so it
operates on the first element of the array.
You seemed to be expecting RAND()*ROW(A1:A3) to
translate into {RAND()*1;RAND()*2;RAND()*3}.

No, I expected RAND()*{1;2;3}.


--
Biff
Microsoft Excel MVP


ROW(...) is generating an array *but* RANDBETWEEN won't
accept the array and only takes the first element of the
array for the argument.

Oh, I think I understand. I thought array formulas caused Excel to
iterate over the array. That is, {RANDBETWEEN(0,ROW(A1:A4))} would
turn into
{RANDBETWEEN(0,1);RANDBETWEEN(0,2);RANDBETWEEN(0,3);RANDBETWEEN(0,4)}.
But you seem to be saying it translates into RANDBETWEEN(0,
{1;2;3;4}). Ergo, an array formula can be used only by functions that
accept arrays. Makes sense.

And some array formulas involve functions that return an array
result. Isn't that what's happening with FREQUENCY(A1:A10,B1:B10)?
That seems to be what you are alluding to below. But that was never
my expectation in my RANDBETWEEN usage.

As Harlan noted in his reply to me, RAND, and in this
case RANDBETWEEN, don't return an array.

The difference was that I passed a array to RANDBETWEEN, whereas you
simply called RAND(). You seemed to be expecting RAND()*ROW(A1:A3) to
translate into {RAND()*1;RAND()*2;RAND()*3}. I would have expected
RAND()*{1;2;3}. So Harlan's comment was no surprise to me: RAND() is
computed only once.

Oh well, water under the bridge. Thanks for the explanation. It
should help me in the future.
 
J

joeu2004

The Central Limit Theory says that the sum of independent random
variables (all with the same distribution) is a normal distribution.
[....]
the following is a convenient way to generate one sum of 23 random
values between 0 and 1:

=norminv(rand(), 23/2, sqrt(23/12))

In that formula, RAND() is used simply to generate a random
probability for the NORMINV() function. The theory behind the formula
is as follows....

According to the CLT, the sum is a normal distribution regardless of
the distribution of the individual random variables being summed. The
distribution of the sum has a mean of N*m and a std dev of sqrt(N)*s,
where N is the number of random variables summed, and "m" and "s" are
the mean and std dev of each of the N random variables.

The OP indicated that the random variables are RAND(). RAND() should
have a uniform distribution (U) between 0 and 1. The mean of U(a,b)
is (b+a)/2, and the std dev is (b-a)/sqrt(12). For RAND(), these
simplify to 1/2 and 1/sqrt(12).

So the sum of 23 RAND() calls has a mean of 23*(1/2) and a std dev of
sqrt(23)*(1/sqrt(12)), which simplify to 23/2 and sqrt(23/12).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top