# Regression Leverage Formula (Jerry W. Lewis or Mike Middleton)already have DFITS formula

P

#### PJ

I have been trying to figure out the leverage formula for a set of X Y
values
X Y
75 16
83 20
85 25
85 27
92 32
97 48
99 48

regression gives me Predicted Y, Residuals and Standard Residuals, If
my data was in A2:B8 (X and Y) and C2 housed this formula =SLOPE(\$B
\$2:\$B\$8,\$A\$2:\$A\$8)*A2+INTERCEPT(\$B\$2:\$B\$8,\$A\$2:\$A\$8) which gives me my
predicted y and D2 housed B2-C2 which gives me my residuals then E2
housed =D2*D2 which gives me my residuals squared. What I am looking
for in cell F2 is HI1(the weight of the leverage value (y value) for
the ith residual). I have the values for the Leverage but do not know
how they were derived.

F2:F8
0.53588
0.200997
0.163787
0.163787
0.180066
0.331229
0.424252

I have the DFITS formula in G2:
=D2*SQRT((7-2-1)/(SUM(\$E\$2:\$E\$8)*(1-F2)-E2))*SQRT(F2/(1-F2))

But do not know how they derived the Leverage Formula?
Can you please give me some guidance?

M

#### Mike Middleton

PJ -

Based on a Google search for "regression leverage" (without the quotes) and
the first item "Leverage and Influential points" at
http://www.stat.rutgers.edu/~saral/pdf/563/Leverages-influential.pdf, the
desired values in your column F are equal to the h values shown on pages 17
and 18 of that document.

(An alternative equivalent formula for h is shown on page 12 as h = (1/n) +
p, where p is defined on page 2 of the document.)

For your data set, the variance of x is VARP(A2:A8) =61.42857.

Sum of squared deviations of x is n*VARP(A2:A8) = 430.

Xbar = 88, and for the first observation, x = 75, the deviation is -13 and
the squared deviation is 169.

The intermediate value p = squared deviation divided by sum of squared
deviations, so, for the first observation, p = 169/430 = 0.39302.

For the first observation, h = (1/n) + p = (1/7) + 0.39302 = 0.14286 +
0.39302 = 0.53588.

- Mike
http://www.MikeMiddleton.com

P

#### PJ

PJ  -

Based on a Google search for "regression leverage" (without the quotes) and
the first item "Leverage and Influential points" athttp://www.stat.rutgers.edu/~saral/pdf/563/Leverages-influential.pdf, the
desired values in your column F are equal to the h values shown on pages 17
and 18 of that document.

(An alternative equivalent formula for h is shown on page 12 as h = (1/n) +
p, where p is defined on page 2 of the document.)

For your data set, the variance of x is VARP(A2:A8) =61.42857.

Sum of squared deviations of x is n*VARP(A2:A8) = 430.

Xbar = 88, and for the first observation, x = 75, the deviation is -13 and
the squared deviation is 169.

The intermediate value p = squared deviation divided by sum of squared
deviations, so, for the first observation, p = 169/430 = 0.39302.

For the first observation, h = (1/n) + p = (1/7) + 0.39302 = 0.14286 +
0.39302 = 0.53588.

-  Mikehttp://www.MikeMiddleton.com

- Show quoted text -

Mike,

Thank you so much for your guidance and the link. Much appreciated.