PC Review


Reply
Thread Tools Rate Thread

density dependent random draw without replacement

 
 
Niels
Guest
Posts: n/a
 
      16th May 2007
Wow, this works indeed as I described. And thank you for introducing arrays
(-; The only thing is that I did not completely describe my problem as I had
in mind. I should have included density dependent. If I again refer to the
list below, I want to randomly draw 5 unique values but the chance of
drawing 4 should be 3 times as high as drawing 1 etc. Something like a
density dependent random draw of unique values. I hope you or anybody else
can help me out, best,

Niels

> identifier
> 1
> 2
> 2
> 3
> 4
> 4
> 4
> 5
> 5
> 6
> 7
> 7
> 8
> 9
> 9
> 10


"RagDyeR" <(E-Mail Removed)> wrote in message
news:%(E-Mail Removed)...
>I would think that you'll first need to make a unique list of your points,
> and then draw your random samplings from that unique list.
>
> With your list in say A1 to a20,
> And the Rand() function in Column B,
> Enter this in C1:
>
> =A1
>
> Enter this *array* formula in C2:
>
> =IF(ISERROR(MATCH(0,COUNTIF(C$1:C1,$A$1:$A$20&""),0)),"",INDEX(IF(ISBLANK($A$1:$A$20),"",$A$1:$A$20),MATCH(0,COUNTIF(C$1:C1,$A$1:$A$20&""),0)))
>
> --
> Array formulas are entered using CSE, <Ctrl> <Shift> <Enter>, instead of
> the
> regular <Enter>, which will *automatically* enclose the formula in curly
> brackets, which *cannot* be done manually. Also, CSE *must* be used when
> revising the formula.
>
> *After* the CSE entry, copy down 'til you get blanks.
>
> This is now the datalist to draw from.
>
> Enter this formula in D1:
>
> =INDEX(C:C,RANK(B1,$B$1:INDEX(B:B,COUNT(C:C))))
>
> And copy down as needed.
>
> Now, each hit of <F9> will return a new random list.
> --
>
> HTH,
>
> RD
> =====================================================
> Please keep all correspondence within the Group, so all may benefit!
> =====================================================
>
>
>
> "Niels" <(E-Mail Removed)> wrote in message
> news:464858ad$0$59086$(E-Mail Removed)...
> Hello,
>
> I have the following problem. I managed to make a random draw without
> replacement from
> http://www.staff.city.ac.uk/r.j.gerr...q.html#sample:
>
> "If the population is in Column A, and Column B is filled with =RAND(),
> then
> enter the formula =INDEX(A:A,RANK(B1,B:B)) in C1 and copy down to Row n,
> assuming a sample of size n is required."
>
> Now, I have a set of point localities in geographic space each with an
> unique identifier indication a location in geographic space. At certain
> localities I have multiple points. Now I want to select 5 random points
> with
> no duplication from the identifier column
>
> identifier
> 1
> 2
> 2
> 3
> 4
> 4
> 4
> 5
> 5
> 6
> 7
> 7
> 8
> 9
> 9
> 10
>
> So, how can I automatically randomly draw 5 unique values from this list.
> Thanks
>
> Niels



 
Reply With Quote
 
 
 
 
Bernd
Guest
Posts: n/a
 
      16th May 2007
Hello,

Your example can be done with:
=INT(redw(1,2,1,3,2,1,2,1,2,1)*10+1)

The user-defined-function redw you can find at
http://www.sulprobil.com/html/redw.html

Regards,
Bernd

 
Reply With Quote
 
Niels
Guest
Posts: n/a
 
      16th May 2007
Yes, could if this hypothetical distribution was all I was using. In reality
I am dealing with over 100 000 records (still have to solve that one in
Excel). But, thanks anyway,

Niels

"Bernd" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Hello,
>
> Your example can be done with:
> =INT(redw(1,2,1,3,2,1,2,1,2,1)*10+1)
>
> The user-defined-function redw you can find at
> http://www.sulprobil.com/html/redw.html
>
> Regards,
> Bernd
>



 
Reply With Quote
 
Bernd
Guest
Posts: n/a
 
      16th May 2007
Hello Niels,

Why don't you take my UDF lfreq from
http://www.sulprobil.com/html/listfreq.html
and apply a modified version of redw to its output?

Regards,
Bernd

 
Reply With Quote
 
Niels
Guest
Posts: n/a
 
      16th May 2007
Hello Bernd,

I would love to be able to understand the scripts you are referring to, but
at this stage I don't unfortunately. Let me clarify the problem.

I have 1837 gridcells in environmental space. Each of these cells contain
plant collections. Some many, others only few. What I need is a random draw
from these cells, where cells with many collections having a proportionately
higher chance of being selected. However, each cell can only be represented
once. I use this to contruct a null distribution. Meaning I need 999x5
cells, 999x6 cells etc. up to 999x100 cells. I do now how to write a macro
to repeat the process 999 times, but making this density dependent random
draw without replacement not. If there is an easy solution would be great,

Regards,

Niels

"Bernd" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Hello Niels,
>
> Why don't you take my UDF lfreq from
> http://www.sulprobil.com/html/listfreq.html
> and apply a modified version of redw to its output?
>
> Regards,
> Bernd
>



 
Reply With Quote
 
Bernie Deitrick
Guest
Posts: n/a
 
      16th May 2007
Niels,

Here is a unique-value, density-dependent solution that will work for up to about 65000 samples -
more if you upgrade to XL2007.

Assume that your values are in column A. In column B, use the formula

=RAND()

and copy to match your values in column A.

In cell C2 (I assume that you have headers in row 1), array-enter the formula (enter using
Ctrl-Shift-Enter)

=IF(B2=MAX(IF($A$2:$A$XXXXX=A2,$B$2:$B$XXXXX)),B2,"")

(Replace the XXXXXs with the row number of your last formula in column B)

This formula will highlight the highest random number for the family of values in column A. (The
more members of the family, the higher the probability that one of those members will have a higher
random number, thus a better chance of being drawn...)

In Cell D2, enter the formula

=IF(C2<>"",RANK(C2,C:C),"")

Starting in cell E2, enter the number series 1,2,3....X down column E, where X is the number of
unique values you need.

Then in F2, enter the formula

=INDEX(A:A,MATCH(E2,D,FALSE))

and copy down to match your numbers in column E.

This will produce a random draw of values from column A, with those values appearing more often
having a better chance of being drawn.

You can experiment by recalculating and seeing which values are drawn - but of course, don't be
surprised if a low frequency value is drawn - that is randomness....

IF you need to work with more than 65000 or so values, use a randomized selection routine to select
65000 values from your larger population - certainly a valid technique, if not ideal.

HTH,
Bernie
MS Excel MVP


"Niels" <(E-Mail Removed)> wrote in message news:464ac089$0$5479$(E-Mail Removed)...
> Wow, this works indeed as I described. And thank you for introducing arrays
> (-; The only thing is that I did not completely describe my problem as I had
> in mind. I should have included density dependent. If I again refer to the
> list below, I want to randomly draw 5 unique values but the chance of
> drawing 4 should be 3 times as high as drawing 1 etc. Something like a
> density dependent random draw of unique values. I hope you or anybody else
> can help me out, best,
>
> Niels
>
>> identifier
>> 1
>> 2
>> 2
>> 3
>> 4
>> 4
>> 4
>> 5
>> 5
>> 6
>> 7
>> 7
>> 8
>> 9
>> 9
>> 10

>
> "RagDyeR" <(E-Mail Removed)> wrote in message
> news:%(E-Mail Removed)...
>>I would think that you'll first need to make a unique list of your points,
>> and then draw your random samplings from that unique list.
>>
>> With your list in say A1 to a20,
>> And the Rand() function in Column B,
>> Enter this in C1:
>>
>> =A1
>>
>> Enter this *array* formula in C2:
>>
>> =IF(ISERROR(MATCH(0,COUNTIF(C$1:C1,$A$1:$A$20&""),0)),"",INDEX(IF(ISBLANK($A$1:$A$20),"",$A$1:$A$20),MATCH(0,COUNTIF(C$1:C1,$A$1:$A$20&""),0)))
>>
>> --
>> Array formulas are entered using CSE, <Ctrl> <Shift> <Enter>, instead of
>> the
>> regular <Enter>, which will *automatically* enclose the formula in curly
>> brackets, which *cannot* be done manually. Also, CSE *must* be used when
>> revising the formula.
>>
>> *After* the CSE entry, copy down 'til you get blanks.
>>
>> This is now the datalist to draw from.
>>
>> Enter this formula in D1:
>>
>> =INDEX(C:C,RANK(B1,$B$1:INDEX(B:B,COUNT(C:C))))
>>
>> And copy down as needed.
>>
>> Now, each hit of <F9> will return a new random list.
>> --
>>
>> HTH,
>>
>> RD
>> =====================================================
>> Please keep all correspondence within the Group, so all may benefit!
>> =====================================================
>>
>>
>>
>> "Niels" <(E-Mail Removed)> wrote in message
>> news:464858ad$0$59086$(E-Mail Removed)...
>> Hello,
>>
>> I have the following problem. I managed to make a random draw without
>> replacement from
>> http://www.staff.city.ac.uk/r.j.gerr...q.html#sample:
>>
>> "If the population is in Column A, and Column B is filled with =RAND(),
>> then
>> enter the formula =INDEX(A:A,RANK(B1,B:B)) in C1 and copy down to Row n,
>> assuming a sample of size n is required."
>>
>> Now, I have a set of point localities in geographic space each with an
>> unique identifier indication a location in geographic space. At certain
>> localities I have multiple points. Now I want to select 5 random points
>> with
>> no duplication from the identifier column
>>
>> identifier
>> 1
>> 2
>> 2
>> 3
>> 4
>> 4
>> 4
>> 5
>> 5
>> 6
>> 7
>> 7
>> 8
>> 9
>> 9
>> 10
>>
>> So, how can I automatically randomly draw 5 unique values from this list.
>> Thanks
>>
>> Niels

>
>



 
Reply With Quote
 
Bernd
Guest
Posts: n/a
 
      16th May 2007
Hello Niels,

Would it be a solution to simplify the drawing process like this:
1. Put the number of plant collections for each gridcells into
A1:A1837. If for example gridcell number 17 has 47 plant collections
then enter 47 into A17.
2. Then use a modified redw version which uses an input range (and not
a vararray):
B1:
=redw(A1:A1837)
B2 (showing the gridcell):
=INT(1+1837*B1))
B3 (showing the plant number within the gridcell):
=INT(1+OFFSET(A1,B2-1,0)*RAND())

If this approach would help and you need further help to modify my UDF
tell me, pls.

Regards,
Bernd

 
Reply With Quote
 
 
 
Reply

Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Random Draw ZZ Microsoft Excel Discussion 4 26th Oct 2010 02:26 PM
What is the difference between High-Density and Low-Density DDR-RAM modules ? Michael Hertz Storage Devices 4 20th Sep 2010 09:39 AM
How can I draw and Normal Density Function, i.e., a Bell Curve, i. =?Utf-8?B?Vi4gSm9zZXBoIEhvdHo=?= Microsoft Powerpoint 2 30th Jul 2010 02:43 PM
What is the difference between High-Density and Low-Density DDR-RAM modules ? Michael Hertz Windows XP Hardware 0 12th Jan 2008 02:28 PM
random draw without replacement from a distribution Niels Microsoft Excel Discussion 3 16th May 2007 04:35 PM


Features
 

Advertising
 

Newsgroups
 


All times are GMT +1. The time now is 11:24 PM.