How do I run a regression on data that is not numerical?

Guest

I am using Microsoft Excel 2003. I have been running regressions on
numerical data and am curious to know how to run one if part of my data is
non-numerical such as gender or race.
 
Where only two values are possible (as with gender), you can use a single
variable coded +1 for one gender and -1 for the other. Extending this coding
to more than two values is possible, but non-trivial.

Alternatively, if you have Excel 2003 or later, you can create an indicator
variable (0 or 1) for each possible non-numeric value. This approach
directly permits more than two possible values.

Jerry
 
Can you explain the Excel 2003 or later indicator variables a little more? I
have four non-numerical values for race.

Thanks for the information on gender. I was using 1 for men and 2 for
females.
 
NG -

For four levels of a categorical variable, e.g., A, B, C, or D, use three
indicator variables. Select one level as the base case, e.g., A. Each
indicator variable (B, C, D) then takes the value 1 or 0 to show whether an
observation is B or not B, C or not C, etc. For an observation with level A,
all three indicator variables are zero. The regression coefficients measure
how different B, C, and D are from the base case A, on average.
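
To make the base-case idea concrete, here is a minimal sketch in Python rather than Excel. The data, the helper names (indicator_columns, ols), and the choice of solving the normal equations directly are all assumptions made for illustration; they are not part of the thread or of any Excel feature.

```python
def indicator_columns(levels, base, categories):
    """Return one 0/1 column per non-base category, in the order given."""
    others = [c for c in categories if c != base]
    rows = [[1.0 if lv == c else 0.0 for c in others] for lv in levels]
    return others, rows


def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up observations: a four-level category and a numeric response.
race = ["A", "A", "B", "B", "C", "C", "D", "D"]
y = [10.0, 12.0, 14.0, 16.0, 9.0, 11.0, 20.0, 22.0]

names, dummies = indicator_columns(race, base="A",
                                   categories=["A", "B", "C", "D"])
X = [[1.0] + row for row in dummies]  # leading 1.0 is the intercept column
coef = ols(X, y)

print("intercept (mean of base case A):", round(coef[0], 6))
for name, b in zip(names, coef[1:]):
    print(f"{name} minus A:", round(b, 6))
```

In this made-up data the group means are 11, 15, 10, and 21, so the intercept should come out as the mean of A and each coefficient as that group's mean minus the mean of A. In Excel the same layout would be three extra 0/1 columns next to the response.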

I use the same approach for gender, e.g., 0 for male and 1 for female, in
which case the regression coefficient for the gender indicator shows how
females differ from males, on average.
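
The codings mentioned in this thread (0/1, 1/2, and -1/+1) can be compared with a small sketch. This is Python rather than Excel, the numbers are made up, and simple_fit is a hypothetical helper written for this example only.

```python
def simple_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Made-up data: three males and three females with a numeric response.
gender = ["M", "M", "M", "F", "F", "F"]
y = [50.0, 52.0, 54.0, 58.0, 60.0, 62.0]

# Three ways of turning the two labels into numbers.
codings = {
    "0/1": {"M": 0.0, "F": 1.0},
    "1/2": {"M": 1.0, "F": 2.0},
    "-1/+1": {"M": -1.0, "F": 1.0},
}

results = {}
for label, code in codings.items():
    x = [code[g] for g in gender]
    results[label] = simple_fit(x, y)
    a, b = results[label]
    print(f"{label:>5}: intercept={a:.2f}, slope={b:.2f}")
```

All three codings describe the same two group means, so the fitted values are identical. With 0/1 or 1/2 the slope is the female mean minus the male mean; with -1/+1 the slope is half that difference and the intercept is the midpoint of the two group means. Only with 0/1 is the intercept the mean of the group coded 0.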

- Mike
http://www.mikemiddleton.com
 
