LINEST Issue - forced crossing zero R2 is wrong?




I am having real trouble with some simple linear regressions in
Excel. I would be excellent if someone could help.

I have various sets of data (one independent variable) that I am
trying to calculate R-squared for, with the Y intercept forced to 0
and without. When i dont have the intercept forced through zero all
my results seem to be acceptable. When I force crossing zero all my R-
squared values are very near 1. I would expect them to all get
slightly worse (most of the data is positively linearly correlated.)

For example for one set of data I get 0.522 for my R-Squared from
Linest with const as true (not forcing zero.) When I change the const
to zero (force through zero) I get 0.984. This is clearly wrong. I
think it should be about 0.49.

Has anyone else seen this? I am aware that the regression stats on
graphs in excel are incorrect but I though LINEST was ok. I am using
Excel 2003.





Jerry W. Lewis

To clarify, in Excel 2003 with intercept forced to zero

LINEST is correct

ATP regression tool is correct except for adjusted R2, which should be
where p is the number of parameters in the model (1 for simple linear
regression with no intercept). ATP incorrectly uses 1-(1-R2)*(n-1)/(n-1-p)

The chart trendline R2 is wrong.




Jerry W. Lewis

Your intuition is slightly off in the case of whether regression with or
without an intercept. The more nested terms you add to the model, the larger
R2 gets, as you expect. But in going from no intercept to having an
intercept, the models are not nested.

With an intercept, R2 gives the proportion of variability explained by the
model beyond what is explained by fitting a simple mean to all the data.

Without an intercept, R2 gives the proportion of variability explained by
the model beyond what is explained by forcing that mean to be zero.

A forced mean of zero will usually explain much less than a fitted mean, so
R2 with a forced zero intercept will usually be larger than an unrestricted
R2. For example,
The data (n=3) is
y x
3 4
4 5
4 6
with a NIST certified correct R2 of 0.9933... with no intercept, vs. 0.75
with an intercept.

In Excel 2003,
correctly returns the NIST certified value, where earlier versions of Excel
did not.


Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question