Confidence / prediction intervals in multiple linear regression

Jon · May 18, 2007

I am trying to find some vba code / method of calculating the
confidence intervals for the mean of the dependent y and the
prediction interval for the dependent y in multiple linear regression
models. I want to calculate these in excel rather than a stats
package. Any code, ideas or suggestions for any online groups that
are more stats based would be welcome. Thanks.

Guest · May 18, 2007

Excel does not natively provide all the information necessary in a convenient
form. Therefore how much YOU understand of statistics is more to the point.
Absent information about the specific multiple regression model you want to
fit, I will have to assume that you understand basic linear algebra as well.
Sorry in advance if this is too heavy to slog through.

You will need to calculate the variance of an estimated point on the
multiple regression. Suppose X is the "design matrix" (the array that you
would pass to LINEST, augmented with a column of ones [unless you are not
fitting a constant term]). If b is the corresponding vector of coefficient
estimates (a column, and in reverse order to the LINEST output), then
=MMULT(X,b) gives the estimated multiple regression at your data points, i.e.
the same output as =TREND(known_y's,known_x's,,const). The predicted value
at a given point on the multiple regression would be =MMULT(v,b) where v is
the row of X corresponding to the point (if it is in the data set), or is
constructed similarly (if it is not in the data set). You will need D which
is calculated as
=MMULT(MMULT(v,MINVERSE(TRANSPOSE(X),X),TRANSPOSE(v))
The variance of an estimated point on the multiple regression is then
=D*MSE, where MSE is =sey^2 and sey is one of the quantities output by
=LINEST(known_y's,known_x's,const,TRUE). Similarly, the variance of a
predicted future point that follows the same multiple regression is
=(1+D)*MSE.

A 95% 2-sided confidence interval for a point on the multiple regression
line is then
=MMULT(v,b) +/- SQRT(D*MSE)*TINV(0.05,df)
A 95% 2-sided prediction interval for a new point that follows the multiple
regression is
=MMULT(v,b) +/- SQRT((1+D)*MSE)*TINV(0.05,df)

Some simplification may be possible given knowledge of the particular
regression model that you want. For instance, with simple linear regression:
MSE reduces to STEYX(known_y's,known_x's)
=MMULT(v,b) reduces to
=FORECAST(x,known_y's,known_x's)
or equivalently to
=INTERCEPT(known_y's,known_x's) +x*SLOPE(known_y's,known_x's)
and D reduces to
=1/COUNT(known_x's) +(x-AVERAGE(known_x's))^2/DEVSQ(known_x's)

Jerry

COnfidence Intervals in Multiple Regression	4	Sep 3, 2009
Weighted Linear Regression	2	Jul 28, 2008
Multiple Regression in Analysis Tool Pack - Confidence Intervals	1	Sep 3, 2009
Multiple (as in many) regressions	3	Dec 22, 2011
Having trouble using LINEST in Excel 2007 VBA	1	Feb 23, 2010
How to print LinEst result to a spreadsheet using VBA	3	Mar 3, 2009
Linear Regression?	5	Mar 30, 2004
Custom function for max residual from linear regression	7	Jul 26, 2006

Confidence / prediction intervals in multiple linear regression

Jon

Guest

Ask a Question

Similar Threads