Proportionally weighted random generator?

G

Guest

I have a set of data that contains page numbers from a report that I run.
These page numbers can appear multiple times in any given report based on
activity on that page. I run a frequency of the number of times a given page
appears. I would like to take a random sampling of this report for audit
purposes, but I would like the random sample to be propotionally weighted
based on the frequency a page appears on my reports. For example if ten page
numbers appear on my report, but page number 1 appears four times. I want
page number 1 to be weighted in such a way that it is four times as likely to
appear on my random sampling. Is there anyway to set a up a generator to
make this happen? Thanks in advance.
 
G

Guest

make a copy of your list with each entry separate
four separate listings of page 1 from your example
next to this column enter = rand() copy and paste in the cells next to each
page listing
select both columns and sort based on the rand() column
If you are willing to check page 1 more than once just take the top or
bottom number of cells to check

if you just want the selection to have mulitple chances and you only want to
test page1 oncejsust select the top unique numbers in your list. there are
many ways ot do this.
 
J

joeu2004

I would like to take a random sampling of this report for audit
purposes, but I would like the random sample to be propotionally weighted
based on the frequency a page appears on my reports.

Suppose you currently have two columns of data, the page numbers
(B1:B100) and their frequencies (C1:C100).

Add a third column to the left (which may be hidden) for the
cumulative frequency. It starts at zero; subsequent cells add the
frequency of the previous row (e.g. A2 is =A1+C1).

The random sample can be selected by:

=vlookup(sum(C1:C100)*rand(), A1:C100, 2)

Of course, you could compute SUM(C1:C100) once in a cell (e.g. A101)
and refer to that cell in the VLOOKUP() function.

Note: I do not like the fact that RAND() is recomputed every time the
spreadsheet is modified -- anywhere(!), and I do not like to disable
auto calculation. So I create a UDF (myrand) to perform the Rnd()
computation. You could use ctrl-alt-F9 to force the random selection
to be recalculated. Or you could pass a cell or range reference to
the UDF solely for the purpose of recalculating the random selection
whenever a value in that cell or range is changed.


----- complete previous posting -----
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Top