How to randomly split a whole dataset into two sub-datasets?

  • Thread starter: zencaroline

zencaroline

Hi,

If anyone has a moment, could you please answer my question? Thank you
very much.

How can I "RANDOMLY" split a whole data set (n=2000) into two
sub-datasets (n=1000 each) in SPSS or Excel?

Thank you very much.

Please take care

Caroline
 
zencaroline said:
How can I "RANDOMLY" split a whole data set (n=2000) into two
sub-datasets (n=1000 each) in SPSS or Excel?

Create a new variable whose values are random numbers (for example,
uniformly distributed). Sort the data on this variable. Take the
first 1000 cases as the first sub-dataset and the last 1000 cases
as the second sub-dataset.
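
For readers working outside SPSS or Excel, the same idea can be sketched in Python. This is only an illustration of the sort-on-a-random-variable approach; the function name `split_by_random_key` and the `seed` parameter are assumptions made for this example, not part of either package.

```python
import random

def split_by_random_key(cases, seed=None):
    """Split cases into two halves by sorting on a random key (hypothetical helper)."""
    rng = random.Random(seed)
    # Attach a uniform random number to each case (the "new variable").
    keyed = [(rng.random(), case) for case in cases]
    # Sort on the random key only, so the cases themselves are never compared.
    keyed.sort(key=lambda pair: pair[0])
    shuffled = [case for _, case in keyed]
    half = len(shuffled) // 2
    # First half is one sub-dataset, the remainder is the other.
    return shuffled[:half], shuffled[half:]

# Placeholder data: case IDs 1..2000 stand in for the real records.
first, second = split_by_random_key(range(1, 2001), seed=42)
```

Fixing the seed makes the split reproducible, which plays the same role as freezing the random numbers before sorting.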
 
Caroline -

In Excel, one method is to use a "helper column."

Arrange the data in a single column.

In an adjacent column, enter =RAND() into the top cell, copy, and paste to
the other cells, so there's a random number associated with each adjacent
data value.

As an optional additional step, select the column with random numbers, copy,
and Paste Special Values (so that the random numbers don't change after the
sort).

Select the entire range of data values (two columns wide and 2000 rows
long).

Choose Data | Sort, "Sort by" the column containing the random numbers, and
click OK.

Use the first 1000 sorted data values for one sub data set, and use the
others for the second sub data set.

- Mike

www.MikeMiddleton.com
 
Hello Caroline,

If your data is in A1:A2000, select B1:B2000 and array-enter
=UniqRandInt(2,1000)
[enter with CTRL + SHIFT + ENTER, not only with ENTER]

You can get my user-defined function UniqRandInt here:
http://www.sulprobil.com/html/uniqrandint.html
[Press ALT + F11, insert a new module, copy my macro text into that
new module, and go back to your worksheet]

You will get exactly 1000 ones and 1000 twos which indicate the
subset.

Regards,
Bernd
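
The result described above (exactly 1000 ones and 1000 twos in random order) can also be sketched in Python. This is not the UniqRandInt macro and makes no claim about how it works internally; `balanced_labels` is a hypothetical name, and the approach here is simply to build the label list and shuffle it.

```python
import random

def balanced_labels(n_groups, per_group, seed=None):
    """Return group labels 1..n_groups, each appearing per_group times, in random order."""
    rng = random.Random(seed)
    # Build the labels in order first: 1, 1, ..., 2, 2, ...
    labels = [g for g in range(1, n_groups + 1) for _ in range(per_group)]
    # Shuffle in place so each case gets a random label while counts stay exact.
    rng.shuffle(labels)
    return labels

# Placeholder data standing in for the values in A1:A2000.
data = list(range(2000))
labels = balanced_labels(2, 1000, seed=0)

# Each label marks which subset the adjacent data value belongs to.
subset_1 = [x for x, lab in zip(data, labels) if lab == 1]
subset_2 = [x for x, lab in zip(data, labels) if lab == 2]
```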
 
* create a random variable.
compute ranorder = rv.uniform(1, 2E9).
* sort the cases in a random order.
sort cases by ranorder.
* assign to groups by odd and even casenum.
compute group = mod($casenum, 2).
value labels group 0 'even' 1 'odd'.
* check that each group has the same number of cases.
frequencies vars=group.


Art Kendall
Social Research Consultants
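
As a rough Python analogue of the syntax above, the sketch below puts the cases in random order and then assigns them to groups by odd and even position. It uses `random.shuffle` in place of sorting on a random variable, and `split_by_parity` is a hypothetical name chosen for this example.

```python
import random

def split_by_parity(cases, seed=None):
    """Randomly order cases, then group them by even/odd 1-based position."""
    rng = random.Random(seed)
    shuffled = list(cases)
    # Random order (swapped in here for "sort cases by ranorder").
    rng.shuffle(shuffled)
    groups = {0: [], 1: []}
    # Count from 1 so positions mirror 1-based case numbers.
    for casenum, case in enumerate(shuffled, start=1):
        groups[casenum % 2].append(case)
    # Group 0 holds even positions, group 1 holds odd positions.
    return groups[0], groups[1]

# Placeholder data standing in for the 2000 cases.
even, odd = split_by_parity(range(2000), seed=1)
```

With an even number of cases the two groups are exactly the same size; with an odd number, the "odd" group gets one extra case.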
 