Bias in rand for excel 07

Y

yttrias

I'm testing random numbers for games using excel 07 generator, but there is
too much bias away from statistical expectation. For example, when two
integer numbers are generated between 1 and 6, and subsequently added, the
results differ from the easily predictable numbers. Sevens are consistently
in short supply by about 1.5%. That's a lot! Any suggestions for better
randomization in excel would be very much appreciated.
 
J

Joel

Here are my results from excel 2003

1 1652
2 1715
3 1689
4 1668
5 1661
6 1615



Sub countrand()

Dim numbers(6)

For i = 1 To 6
numbers(i) = 0
Next i

For i = 1 To 10000
myRand = Int(6 * Rnd()) + 1
numbers(myRand) = numbers(myRand) + 1
Next i

For i = 1 To 6
Range("A" & i) = i
Range("B" & i) = numbers(i)
Next i

End Sub
 
S

SteveM

And what was your sample size?

The flaw in the MS random number generator has been known for some
time. A description is included in this MS info sheet:

http://support.microsoft.com/kb/829208

I use a simulation package that has it's own generator. But there are
probably others you can download via a web search.

You may want to run statistical tests on ones that you are considering
for validation.

SteveM
 
M

Mike Middleton

yttrias -

Which "excel 07 generator" are you using?

RAND() worksheet function, or

RND function in VBA, or

Excel 2007, Data ribbon, Data Analysis, Random Number Generation tool ?

- Mike Middleton
http://www.DecisionToolworks.com
Decision Analysis Add-ins for Excel
 
Y

yttrias

I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000
consecutive values at 20 different locations throughout the range of 1M. The
average of those 20 windows in the 1M was 1587, where the expected average
should be 1667.
 
J

Jerry W. Lewis

I don't know about the OP, but Joel's posted code used the VBA Rnd()
function. Prior to 2007, there were 3 distinct random number generators, the
worksheet function RAND(), VBA Rnd(), and the ATP random number generator.
None were very good before 2003, but their problems would arise as high order
autocorrelations between "random" numbers not as the kind of problems the
Joel was testing for. It is unlikely to show up in the process that the OP
seems to describe. Therefore I would want to know how the OP generated
"integer numbers between 1 and 6" and what the OP was using as expected
values.

A decent, but not great algorithm was implemented for worksheet RAND()
(requires SP1 patch to work properly) and carried into 2007. I have seen no
evidence that VBA Rnd() has ever been upgraded. RANDBETWEEN() became a
worksheet function in 2007, and therefore probably switched from using ATP
random number generator to using the worksheet RAND() function, though I have
seen no confirmation of this.

Joel's simulation shows departures from expectation that are NOT
statistically significant. Generating 10000 sums of 2 values using worksheet
RAND() and expected values of (6-ABS(7-tot2))/36, I tried several reps in
2003 with no statistically significant departure from expectation. The
following VBA code does 10 reps of 10000 using VBA Rnd(), and only one rep
was statistically different from expectation (again in 2003). Therefore, I
am somewhat skeptical of the OP's claims until more information is provided.

Jerry

Sub tryit()
Application.ScreenUpdating = False ' to speed execution
For j = 1 To 10
For i = 1 To 10000
x = Fix(Rnd() * 6) + Fix(Rnd() * 6) + 2 ' sum of two random
integers between 1 and 6
Cells(x, j) = Cells(x, j) + 1
Next i
Next j
Application.ScreenUpdating = True
End Sub
 
J

Jerry W. Lewis

You are still providing too little information for us to understand what you
are doing.

There is no RAND() function in VBA. In VBA, you either used the VBA Rnd()
function, or the worksheet RAND() function via Evaluate("RAND()").
As noted in my earlier post, in 2003 the worksheet RAND() function was far
superior to the VBA Rnd(). I have seen no evidence that either function
changed between 2003 and 2007.

Your original post discussed the sum of two (presumably independent) random
integers between 1 and 6 (such as the total from rolling two fair dice). The
expected value of a single such sum would be 7. It is unclear how that
relates to either your observed average of 1587 or expected average of 1667
for an "average of 20 windows of 40,000 consecutive values.

It sounds as though your VBA would involve only a few lines of code. If you
paste that code into your reply, we can see exactly what you did.

Jerry
 
R

Rick Rothstein \(MVP - VB\)

I ran 1 million samples using RAND in VBA in excel 07. Then I sampled
40,000
consecutive values at 20 different locations throughout the range of 1M.
The
average of those 20 windows in the 1M was 1587, where the expected average
should be 1667.

How are you using th Randomize statement? If you are executing it more than
one time, that could be skewing your results. Consider this posting I have
posted in the past...

The Randomize statement should not be executed more than once for the
lifespan of the object that is executing your code. Doing so more often
actually makes the data less random than if Randomize is only run once. For
so few colors, and the probable use the OP wants to use the Rnd function
for, it will more than likely not matter here, but it is the concept that is
important to know. I can't demo it within Excel because I am not aware of an
available object where I can set the color of individual pixels (like a
PictureBox control in the compiled version of VB for those who have worked
with that language); however, I do have a VB program that demonstrates this
fact visually.

For those of you having access to the compiled versions of VB5 or VB6, here
is posting I have offered over in the compiled VB newsgroups in the past
that demonstrates this fact...

Running Randomize multiple times ends up producing a less random set of
numbers. To see the problem visually, use this code (which is a modification
of a routine Bob Butler once posted). Start a new project and put two
PictureBox'es on your Form (use the default names for everything and
placement of the PictureBox'es is not important). Paste the following code
into the Form's code window. The PictureBox on the left results from using
Randomize only once, the one on the right uses it repeatedly. Both
PictureBox displays are produced from the same looping code with the only
difference being the use of the Randomize statement Ignoring the pronounce
vertical areas (not sure what that is, probably some kind of boundary
rounding problem), for which one does the distribution of colors look more
"random"?

Rick

Const SCALESIZE = 3

Private Sub Form_Load()
Picture1.ScaleMode = 3
Picture2.ScaleMode = 3
Randomize
Picture1.Move 0, 0, _
128 * Screen.TwipsPerPixelX * SCALESIZE, _
128 * Screen.TwipsPerPixelY * SCALESIZE
Picture2.Move Picture1.Width, 0, _
128 * Screen.TwipsPerPixelX * SCALESIZE, _
128 * Screen.TwipsPerPixelY * SCALESIZE
Me.Width = 2.02 * Picture1.Width
Me.Height = 1.1 * Picture1.Height
End Sub

Private Sub Picture1_Paint()
Dim i As Long
Dim j As Long
Dim colr As Long
Dim bitmask As Long
For i = 0 To Picture1.ScaleHeight Step SCALESIZE
For j = 0 To Picture1.ScaleWidth Step SCALESIZE
colr = Rnd * 16711680
Picture1.Line (j, i)-Step(SCALESIZE, _
SCALESIZE), colr, BF
Next j
Next i
For i = 0 To Picture2.ScaleHeight Step SCALESIZE
For j = 0 To Picture2.ScaleWidth Step SCALESIZE
Randomize
colr = Rnd * 16711680
Picture2.Line (j, i)-Step(SCALESIZE, _
SCALESIZE), colr, BF
Next j
Next i
End Sub
 
P

Peter T

"Rick Rothstein (MVP - VB)" wrote in message
For those of you having access to the compiled versions of VB5 or VB6, here
is posting I have offered over in the compiled VB newsgroups in the past
that demonstrates this fact...
Running Randomize multiple times ends up producing a less random set of
numbers. To see the problem visually, use this code (which is a modification
of a routine Bob Butler once posted). Start a new project and put two
PictureBox'es on your Form (use the default names for everything and
placement of the PictureBox'es is not important). Paste the following code
into the Form's code window. The PictureBox on the left results from using
Randomize only once, the one on the right uses it repeatedly. Both
PictureBox displays are produced from the same looping code with the only
difference being the use of the Randomize statement Ignoring the pronounce
vertical areas (not sure what that is, probably some kind of boundary
rounding problem), for which one does the distribution of colors look more
"random"?

Rick

In particular this bit -
Ignoring the pronounced vertical areas (not sure what that is,
probably some kind of boundary rounding problem)

Is it some rounding problem or could it be related to the non random nature
of random, even with the single Randomize.

Actually I didn't notice those vertical areas you refer to at first in the
left box, however I amended a couple of values in your code, in particular
changed colr = Rnd * 16711680 to colr = Rnd * 16777215, also increased the
sizes by 1 pixel to exaggerate.

Now I get very distinct not entirely random vertical areas in the left box.

For the second box I changed Randomize in each loop to Randomize 1, just for
fun.

Amended code -

Const SCALESIZE = 3

Private Sub Form_Load()
Const Z As Long = 129 ' original 128
Me.Left = 10

Picture1.ScaleMode = 3
Picture2.ScaleMode = 3
Randomize
Picture1.Move 0, 0, _
Z * Screen.TwipsPerPixelX * SCALESIZE, _
Z * Screen.TwipsPerPixelY * SCALESIZE
Picture2.Move Picture1.Width, 0, _
Z * Screen.TwipsPerPixelX * SCALESIZE, _
Z * Screen.TwipsPerPixelY * SCALESIZE
Me.Width = 2.02 * Picture1.Width
Me.Height = 1.1 * Picture1.Height
End Sub

Private Sub Picture1_Paint()
Dim i As Long
Dim j As Long
Dim colr As Long
Dim bitmask As Long

Const C As Long = 16777215 ' original 16711680

For i = 0 To Picture1.ScaleHeight Step SCALESIZE
For j = 0 To Picture1.ScaleWidth Step SCALESIZE
colr = Rnd * C
Picture1.Line (j, i)-Step(SCALESIZE, _
SCALESIZE), colr, BF
Next j
Next i

For i = 0 To Picture2.ScaleHeight Step SCALESIZE
For j = 0 To Picture2.ScaleWidth Step SCALESIZE
Randomize ' original
Randomize 1
colr = Rnd * C
Picture2.Line (j, i)-Step(SCALESIZE, _
SCALESIZE), colr, BF
Next j
Next i
End Sub

Regards,
Peter T
 
P

Peter T

A similar demo for a VBA Userform -

Option Explicit
Private Declare Function FindWindow Lib "user32" _
Alias "FindWindowA" _
(ByVal lpClassName As String, _
ByVal lpWindowName As String) As Long

Private Declare Function SetPixel Lib "gdi32" _
(ByVal hdc As Long, ByVal x As Long, _
ByVal y As Long, ByVal crColor As Long) As Long

Private Declare Function GetPixel Lib "gdi32" _
(ByVal hdc As Long, ByVal x As Long, ByVal y As Long) As Long

Private Declare Function GetDC Lib "user32" ( _
ByVal hwnd As Long) As Long
Private Declare Function ReleaseDC Lib "user32" ( _
ByVal hwnd As Long, ByVal hdc As Long) As Long

Private Sub PaintPixels()
Dim hwnd As Long, hdc As Long
Dim tp As Long, lt As Long
Dim x As Long, y As Long
Dim colr As Long

Const Z As Long = 128 * 2 - 1
Const C As Long = 16777215

Me.Left = 10: Me.Top = 10
Me.Width = (Z * 2) * 0.75 + 45: Me.Height = Z * 0.75 + 60
' if form is too small, change 0.75 to 1 or 1.25

hwnd = FindWindow("ThunderDFrame", Me.Caption)

hdc = GetDC(hwnd)

tp = Me.Top + 15
lt = Me.Left + 10

Randomize
For y = tp To tp + Z
For x = lt To lt + Z
colr = Rnd * C
SetPixel hdc, x, y, colr
Next
Next

lt = lt + Z + 15
For y = tp To tp + Z
For x = lt To lt + Z
Randomize 1
colr = Rnd * C
SetPixel hdc, x, y, colr
Next
Next

ReleaseDC hwnd, hdc
End Sub

Private Sub UserForm_Activate()

Me.Caption = "Click me to (re-) PaintPixels"
PaintPixels
End Sub

Private Sub UserForm_Click()
' Me.Repaint
PaintPixels

End Sub

Peter T
 
R

Rick Rothstein \(MVP - VB\)

Actually I didn't notice those vertical areas you refer to at first in the
left box, however I amended a couple of values in your code, in particular
changed colr = Rnd * 16711680 to colr = Rnd * 16777215, also increased the
sizes by 1 pixel to exaggerate.

It was quite some time ago when I developed that particular demonstration,
so I forgot some of the original details. Actually, now that I think about
it, I believe I adjusted the numbers to minimize the vertical streak I
mentioned and didn't think about it when I grabbed the text from a different
message I had posted about the original (pre-modified) code. The constant
multiplier for the Rnd function was the key, as I remember. Nice thought
about the 1 pixel adjustment.

Rick
 
P

Peter T

"Rick Rothstein (MVP - VB)" wrote in message
It was quite some time ago when I developed that particular demonstration,
so I forgot some of the original details. Actually, now that I think about
it, I believe I adjusted the numbers to minimize the vertical streak I
mentioned and didn't think about it when I grabbed the text from a different
message I had posted about the original (pre-modified) code. The constant
multiplier for the Rnd function was the key, as I remember. Nice thought
about the 1 pixel adjustment.

Rick

This bit -
The constant
multiplier for the Rnd function was the key, as I remember.

The constant multiplier, depending on what it is, helps to highlight there
is little if anything random about vb/vba's Rnd. Instead it seems to
generate a sequence, albeit not an immediately obvious one. I guess this is
all fully documented and for most purposes of little consequence.

I examined the colours generated in the vba-userform demo (posted nearby).
In visually obvious bands adjacent RGB's in a series might be something like
(say) Red & Green are identical but Blue increments uniformly, then repeat
with a small increment to the G attribute. Or there might be an obvious
chequerboard pattern of colours in a section, with every other colour almost
identical.

Regards,
Peter T
 
R

Rick Rothstein \(MVP - VB\)

Ah, so that is how you draw into a UserForm... ThunderDFrame... I'll have to
remember that.

Thanks for performing the conversion. First, I removed the "1" off of the
Randomize statement in the second loop so that both Randomize statements
draw from the Timer for their seed values. Yes, this scatters the bar effect
on the second picture, but I think it is a fairer display of what is going
on. Second, I think if you change the Const C assignment statement in the
PaintPixel procedure to this...

Const C As Long = 16711680

you get a much starker (and more amazing) contrast between the two displays,
especially with the the Randomizer seed change mention above, especially
when you repeatedly click the UserForm. Yes, this removes the sharp vertical
lines that the seed value of 1 produced, but it still shows (and more fairly
I think) the sharp difference in randomization that occurs if you use the
Randomize statement only once. The modified code implementing the above is
shown below my signature.

Again, thanks for performing the compiled VB to Excel VBA conversion... you
did a nice job with it. I will be using your coded translation (with
appropriate acknowledgement to you, of course) should a similar question
rise in the future.

Rick

Private Declare Function FindWindow Lib "user32" _
Alias "FindWindowA" _
(ByVal lpClassName As String, _
ByVal lpWindowName As String) As Long

Private Declare Function SetPixel Lib "gdi32" _
(ByVal hDc As Long, _
ByVal x As Long, _
ByVal y As Long, _
ByVal crColor As Long) As Long

Private Declare Function GetPixel Lib "gdi32" _
(ByVal hDc As Long, _
ByVal x As Long, _
ByVal y As Long) As Long

Private Declare Function GetDC Lib "user32" ( _
ByVal hWnd As Long) As Long

Private Declare Function ReleaseDC Lib "user32" ( _
ByVal hWnd As Long, ByVal hDc As Long) As Long

Private Sub PaintPixels()
Dim hWnd As Long, hDc As Long
Dim tp As Long, lt As Long
Dim x As Long, y As Long
Dim colr As Long

Const Z As Long = 128 * 2 - 1
Const C As Long = 16711680

Me.Left = 10: Me.Top = 10
Me.Width = (Z * 2) * 0.75 + 45: Me.Height = Z * 0.75 + 60
' if form is too small, change 0.75 to 1 or 1.25

hWnd = FindWindow("ThunderDFrame", Me.Caption)
hDc = GetDC(hWnd)

tp = Me.Top + 15
lt = Me.Left + 10

Randomize
For y = tp To tp + Z
For x = lt To lt + Z
colr = Rnd * C
SetPixel hDc, x, y, colr
Next
Next

lt = lt + Z + 15
For y = tp To tp + Z
For x = lt To lt + Z
Randomize
colr = Rnd * C
SetPixel hDc, x, y, colr
Next
Next

ReleaseDC hWnd, hDc
End Sub

Private Sub UserForm_Activate()
Me.Caption = "Click me to (re-) PaintPixels"
PaintPixels
End Sub

Private Sub UserForm_Click()
' Me.Repaint
PaintPixels
End Sub
 
Y

yttrias

I am grateful to those of you who have responded so quickly to my OP. I
hope you will be able to point out the errors of my ways. The following is
the code sequence for the generation of two sets of 1-million integers
between 1 and 6, and their sum. Then, the subsequent code counts the number
of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal
representation to be compared with ideal ratio of 1/6. I ran the latter code
20 times at arbitrary locations, and averaged those results.

Sub OneMillionEvents()
Range("A1:C1000000").Select
Selection.ClearContents
Selection.NumberFormat = "0"
Range("A1").Select
ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)"
Range("B1").Select
ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)"
Range("C1").Select
ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]"
Range("A1:C1000000").Select
Selection.FillDown
Calculate
End Sub

Sub CountSevens()
Dim i, n As Integer
Range("A1").Select
ActiveCell(1, 5) = 40000
ActiveCell(1, 6) = 0
n = 0
For i = 825001 to 865000
If ActiveCell(i, 3) = 7 Then
n = n + 1
End If
Next i
ActiveCell(1, 6) = n / ActiveCell(1, 5)
End Sub

Regards,

Yttrias

--
yttrias


Jerry W. Lewis said:
You are still providing too little information for us to understand what you
are doing.

There is no RAND() function in VBA. In VBA, you either used the VBA Rnd()
function, or the worksheet RAND() function via Evaluate("RAND()").
As noted in my earlier post, in 2003 the worksheet RAND() function was far
superior to the VBA Rnd(). I have seen no evidence that either function
changed between 2003 and 2007.

Your original post discussed the sum of two (presumably independent) random
integers between 1 and 6 (such as the total from rolling two fair dice). The
expected value of a single such sum would be 7. It is unclear how that
relates to either your observed average of 1587 or expected average of 1667
for an "average of 20 windows of 40,000 consecutive values.

It sounds as though your VBA would involve only a few lines of code. If you
paste that code into your reply, we can see exactly what you did.

Jerry

yttrias
 
J

Jerry W. Lewis

The code was very helpful. I agree that the expcted value in F1 should be
1/6=0.666... and that 0.1587 is a sigificant departure from expectation (the
missing decimal points hindered my understanding).

I am surprised that CountSevens() worked at all, since i is declared an
integer, but takes values that are outside the valid range for an
integer--that may have been the problem.

There is a slight bias built into using either the INT() or TRUNC()
functions. For example
=((2-2^-52)-2)
is <0, so =INT(2-2^-52) should be 1, but MS, in trying to hide the binary
underpinnings, returns 2. Therefore there will be a slight reduction in the
number of 6's and a slight increase in the number of 1's in your pseudo
random integers, but these slight biases should not be statistically
detectable in an experiment of your size. If you are concerned, you can
avoid them by using an intermediate cell to hold the =RAND() values which you
would then transform with the slightly more complicated formula
=IF(((x*6)-INT(x*6))<0,INT(x*6)-1,INT(x*6))

Some recalculations of the random integers occur during the execution of
CountSevens(). It is not clear to me whether these might be occurring during
the execution of the For/Next loop; regardless, I don't see how it could
cause an undercount of 7's.

Here is an alternate VBA code, that simplifies and combines the work of both
OneMillionEvents() and CountSevens(). It also replaces the static value in
F1 with a dynamic formula that will update whenever the RAND() functions
recalculate. I tried about a dozen reps in Excel 2007, and got only one rep
that was statistically different from expectation.

Sub SimplerVersion()
With Range("A1:C1000000")
.ClearContents
.NumberFormat = "0"
End With
Range("A1:B1000000").FormulaR1C1 = "=1+INT(RAND()*6)"
Range("C1:C1000000").FormulaR1C1 = "=RC[-2]+RC[-1]"
Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)"
End Sub

Jerry

yttrias said:
I am grateful to those of you who have responded so quickly to my OP. I
hope you will be able to point out the errors of my ways. The following is
the code sequence for the generation of two sets of 1-million integers
between 1 and 6, and their sum. Then, the subsequent code counts the number
of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal
representation to be compared with ideal ratio of 1/6. I ran the latter code
20 times at arbitrary locations, and averaged those results.

Sub OneMillionEvents()
Range("A1:C1000000").Select
Selection.ClearContents
Selection.NumberFormat = "0"
Range("A1").Select
ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)"
Range("B1").Select
ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)"
Range("C1").Select
ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]"
Range("A1:C1000000").Select
Selection.FillDown
Calculate
End Sub

Sub CountSevens()
Dim i, n As Integer
Range("A1").Select
ActiveCell(1, 5) = 40000
ActiveCell(1, 6) = 0
n = 0
For i = 825001 to 865000
If ActiveCell(i, 3) = 7 Then
n = n + 1
End If
Next i
ActiveCell(1, 6) = n / ActiveCell(1, 5)
End Sub

Regards,

Yttrias

--
yttrias


Jerry W. Lewis said:
You are still providing too little information for us to understand what you
are doing.

There is no RAND() function in VBA. In VBA, you either used the VBA Rnd()
function, or the worksheet RAND() function via Evaluate("RAND()").
As noted in my earlier post, in 2003 the worksheet RAND() function was far
superior to the VBA Rnd(). I have seen no evidence that either function
changed between 2003 and 2007.

Your original post discussed the sum of two (presumably independent) random
integers between 1 and 6 (such as the total from rolling two fair dice). The
expected value of a single such sum would be 7. It is unclear how that
relates to either your observed average of 1587 or expected average of 1667
for an "average of 20 windows of 40,000 consecutive values.

It sounds as though your VBA would involve only a few lines of code. If you
paste that code into your reply, we can see exactly what you did.

Jerry

yttrias
 
P

Peter T

"Rick Rothstein (MVP - VB)" wrote in message
First, I removed the "1" off of the
Randomize statement in the second loop so that both Randomize statements
draw from the Timer for their seed values.

I had intended to post it that way, somehow forgot!
Second, I think if you change the Const C assignment statement in the
PaintPixel procedure to this...

Const C As Long = 16711680

you get a much starker (and more amazing) contrast between the two
displays

Hmm, for me
Const C As Long = 16777215
gives a very much better visual indication, not so much of the contrast
between the two displays but very obvious vertical bands, even the first
display (not random pixels at all).

I wonder why, maybe Rnd() seeds or calculates slightly differently in our
systems. I had been testing in W98SE and have now repeated in Vista. The
first display visually looks identical in both systems, but the second is
somewhat different; though in both systems the vertical banding is more
obvious in both displays with the 16777215 multiplier. Maybe for some users
neither value will produce anything obvious.
Ah, so that is how you draw into a UserForm... ThunderDFrame... I'll have to
remember that.

and just in case need to cater for Excel97 the userform's classname is
ThunderXFrame
I will be using your coded translation (with
appropriate acknowledgement to you, of course)

It's all pretty generic stuff so no acknowledgement necessary, but
appreciate the thought. The code was the bare minimum for the demo, better
to get system's pointsPerPixel -

Private Declare Function GetDeviceCaps Lib "gdi32" ( _
ByVal hdc As Long, ByVal nIndex As Long) As Long

Private Const POINTS_PER_INCH As Long = 72
Private Const LOGPIXELSX As Long = 88
Dim mdPointsPerPixel As Double

Private Sub GetPPI()
Dim dpi As Long
dpi = GetDeviceCaps(GetDC(0), LOGPIXELSX)
mdPointsPerPixel = POINTS_PER_INCH / dpi
ReleaseDC 0, hdc
End Sub

In the demos replace the 0.75 with mdPointsPerPixel. But somehow I think you
already know all that. <g>

Perhaps use frm.Width - .Insidewidth & .Height -.Insideheight to get the
inside coordinates, net of borders & caption instead of the approximations I
used.

There are of course other, possibly better/faster, methods to draw on a
userform or any window than SetPixel.

Regards,
Peter T
 
Y

yttrias

Thank you very much. With the simplified code you provided, I am able to
average the entire 1 million events in one run, which I couldn't do with the
code I posted. (My system limitation.) With your code, tests of groups of 1
million repeatedly gave avg to within 0.12% of expected. There is more
deviation, of course, when the samples are reduced to 40,000 points, but with
enough repeated samples, I'm getting averages of samples to be pretty close
to the expected value.

This has been helpful to me.
--
yttrias


Jerry W. Lewis said:
The code was very helpful. I agree that the expcted value in F1 should be
1/6=0.666... and that 0.1587 is a sigificant departure from expectation (the
missing decimal points hindered my understanding).

I am surprised that CountSevens() worked at all, since i is declared an
integer, but takes values that are outside the valid range for an
integer--that may have been the problem.

There is a slight bias built into using either the INT() or TRUNC()
functions. For example
=((2-2^-52)-2)
is <0, so =INT(2-2^-52) should be 1, but MS, in trying to hide the binary
underpinnings, returns 2. Therefore there will be a slight reduction in the
number of 6's and a slight increase in the number of 1's in your pseudo
random integers, but these slight biases should not be statistically
detectable in an experiment of your size. If you are concerned, you can
avoid them by using an intermediate cell to hold the =RAND() values which you
would then transform with the slightly more complicated formula
=IF(((x*6)-INT(x*6))<0,INT(x*6)-1,INT(x*6))

Some recalculations of the random integers occur during the execution of
CountSevens(). It is not clear to me whether these might be occurring during
the execution of the For/Next loop; regardless, I don't see how it could
cause an undercount of 7's.

Here is an alternate VBA code, that simplifies and combines the work of both
OneMillionEvents() and CountSevens(). It also replaces the static value in
F1 with a dynamic formula that will update whenever the RAND() functions
recalculate. I tried about a dozen reps in Excel 2007, and got only one rep
that was statistically different from expectation.

Sub SimplerVersion()
With Range("A1:C1000000")
.ClearContents
.NumberFormat = "0"
End With
Range("A1:B1000000").FormulaR1C1 = "=1+INT(RAND()*6)"
Range("C1:C1000000").FormulaR1C1 = "=RC[-2]+RC[-1]"
Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)"
End Sub

Jerry

yttrias said:
I am grateful to those of you who have responded so quickly to my OP. I
hope you will be able to point out the errors of my ways. The following is
the code sequence for the generation of two sets of 1-million integers
between 1 and 6, and their sum. Then, the subsequent code counts the number
of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal
representation to be compared with ideal ratio of 1/6. I ran the latter code
20 times at arbitrary locations, and averaged those results.

Sub OneMillionEvents()
Range("A1:C1000000").Select
Selection.ClearContents
Selection.NumberFormat = "0"
Range("A1").Select
ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)"
Range("B1").Select
ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)"
Range("C1").Select
ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]"
Range("A1:C1000000").Select
Selection.FillDown
Calculate
End Sub

Sub CountSevens()
Dim i, n As Integer
Range("A1").Select
ActiveCell(1, 5) = 40000
ActiveCell(1, 6) = 0
n = 0
For i = 825001 to 865000
If ActiveCell(i, 3) = 7 Then
n = n + 1
End If
Next i
ActiveCell(1, 6) = n / ActiveCell(1, 5)
End Sub

Regards,

Yttrias

--
yttrias


Jerry W. Lewis said:
You are still providing too little information for us to understand what you
are doing.

There is no RAND() function in VBA. In VBA, you either used the VBA Rnd()
function, or the worksheet RAND() function via Evaluate("RAND()").
As noted in my earlier post, in 2003 the worksheet RAND() function was far
superior to the VBA Rnd(). I have seen no evidence that either function
changed between 2003 and 2007.

Your original post discussed the sum of two (presumably independent) random
integers between 1 and 6 (such as the total from rolling two fair dice). The
expected value of a single such sum would be 7. It is unclear how that
relates to either your observed average of 1587 or expected average of 1667
for an "average of 20 windows of 40,000 consecutive values.

It sounds as though your VBA would involve only a few lines of code. If you
paste that code into your reply, we can see exactly what you did.

Jerry

:

I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000
consecutive values at 20 different locations throughout the range of 1M. The
average of those 20 windows in the 1M was 1587, where the expected average
should be 1667.
--
yttrias


:

And what was your sample size?
--
Bernard V Liengme
Microsoft Excel MVP
www.stfx.ca/people/bliengme
remove caps from email

I'm testing random numbers for games using excel 07 generator, but there
is
too much bias away from statistical expectation. For example, when two
integer numbers are generated between 1 and 6, and subsequently added, the
results differ from the easily predictable numbers. Sevens are
consistently
in short supply by about 1.5%. That's a lot! Any suggestions for better
randomization in excel would be very much appreciated.

yttrias
 
Y

yttrias

Jerry,

Thanks again for your code suggestion. Although the array formula works
fine, I don't understand the syntax. How does the "*1" function? Can I
substitute a variable for the "7" delineator? (When I try to do that, I get
error.)

Do you know of any published reference to this, and other, helpful array
formula expressions that are (for me) beyond the ordinary?
--
yttrias


Jerry W. Lewis said:
The code was very helpful. I agree that the expcted value in F1 should be
1/6=0.666... and that 0.1587 is a sigificant departure from expectation (the
missing decimal points hindered my understanding).

I am surprised that CountSevens() worked at all, since i is declared an
integer, but takes values that are outside the valid range for an
integer--that may have been the problem.

There is a slight bias built into using either the INT() or TRUNC()
functions. For example
=((2-2^-52)-2)
is <0, so =INT(2-2^-52) should be 1, but MS, in trying to hide the binary
underpinnings, returns 2. Therefore there will be a slight reduction in the
number of 6's and a slight increase in the number of 1's in your pseudo
random integers, but these slight biases should not be statistically
detectable in an experiment of your size. If you are concerned, you can
avoid them by using an intermediate cell to hold the =RAND() values which you
would then transform with the slightly more complicated formula
=IF(((x*6)-INT(x*6))<0,INT(x*6)-1,INT(x*6))

Some recalculations of the random integers occur during the execution of
CountSevens(). It is not clear to me whether these might be occurring during
the execution of the For/Next loop; regardless, I don't see how it could
cause an undercount of 7's.

Here is an alternate VBA code, that simplifies and combines the work of both
OneMillionEvents() and CountSevens(). It also replaces the static value in
F1 with a dynamic formula that will update whenever the RAND() functions
recalculate. I tried about a dozen reps in Excel 2007, and got only one rep
that was statistically different from expectation.

Sub SimplerVersion()
With Range("A1:C1000000")
.ClearContents
.NumberFormat = "0"
End With
Range("A1:B1000000").FormulaR1C1 = "=1+INT(RAND()*6)"
Range("C1:C1000000").FormulaR1C1 = "=RC[-2]+RC[-1]"
Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)"
End Sub

Jerry

yttrias said:
I am grateful to those of you who have responded so quickly to my OP. I
hope you will be able to point out the errors of my ways. The following is
the code sequence for the generation of two sets of 1-million integers
between 1 and 6, and their sum. Then, the subsequent code counts the number
of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal
representation to be compared with ideal ratio of 1/6. I ran the latter code
20 times at arbitrary locations, and averaged those results.

Sub OneMillionEvents()
Range("A1:C1000000").Select
Selection.ClearContents
Selection.NumberFormat = "0"
Range("A1").Select
ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)"
Range("B1").Select
ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)"
Range("C1").Select
ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]"
Range("A1:C1000000").Select
Selection.FillDown
Calculate
End Sub

Sub CountSevens()
Dim i, n As Integer
Range("A1").Select
ActiveCell(1, 5) = 40000
ActiveCell(1, 6) = 0
n = 0
For i = 825001 to 865000
If ActiveCell(i, 3) = 7 Then
n = n + 1
End If
Next i
ActiveCell(1, 6) = n / ActiveCell(1, 5)
End Sub

Regards,

Yttrias

--
yttrias


Jerry W. Lewis said:
You are still providing too little information for us to understand what you
are doing.

There is no RAND() function in VBA. In VBA, you either used the VBA Rnd()
function, or the worksheet RAND() function via Evaluate("RAND()").
As noted in my earlier post, in 2003 the worksheet RAND() function was far
superior to the VBA Rnd(). I have seen no evidence that either function
changed between 2003 and 2007.

Your original post discussed the sum of two (presumably independent) random
integers between 1 and 6 (such as the total from rolling two fair dice). The
expected value of a single such sum would be 7. It is unclear how that
relates to either your observed average of 1587 or expected average of 1667
for an "average of 20 windows of 40,000 consecutive values.

It sounds as though your VBA would involve only a few lines of code. If you
paste that code into your reply, we can see exactly what you did.

Jerry

:

I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000
consecutive values at 20 different locations throughout the range of 1M. The
average of those 20 windows in the 1M was 1587, where the expected average
should be 1667.
--
yttrias


:

And what was your sample size?
--
Bernard V Liengme
Microsoft Excel MVP
www.stfx.ca/people/bliengme
remove caps from email

I'm testing random numbers for games using excel 07 generator, but there
is
too much bias away from statistical expectation. For example, when two
integer numbers are generated between 1 and 6, and subsequently added, the
results differ from the easily predictable numbers. Sevens are
consistently
in short supply by about 1.5%. That's a lot! Any suggestions for better
randomization in excel would be very much appreciated.

yttrias
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top