Outlier analysis in excel - trimming means


S

Shane Lindsay

I would like to use excel for outlier analysis removal - to calculate a
mean of a range but not include or change values that are 2 or 3
standard deviations above or below the mean. Excel has a trimmedmean
function but outlier removal based on the above criteria is considered
better for normally distributed data.

I found these formula posted before on this forum. They don't include
values 2 SD above or below the mean. And they work!

=AVERAGE(IF((ABS(rng-AVERAGE(rng)))>(2*STDEV(rng)), "", rng))

=TRIMMEAN(Rng,COUNTIF(Rng,">"&(AVERAGE(Rng)+2*STDEVP(Rng)))/COUNT(Rng))

I also need to know how many data points were "removed", and then
report that as a percentage, and couldn't figure out to that (bearing
in mind they could be missing data points).

An alternative method of outlier analysis is to:
a. replace outliers with the 2 standard deviations above or below value
b. replace outliers with the mean

Any suggestions on how to do these things would be gratefully received.
 
Ad

Advertisements

L

Lori

If you are assuming normally distributed data - generally a good
approximation in large samples - you could use the ZTEST function to
give the two tailed P-value of the datapoints. Assuming data is down a
column you could fill down:

=ZTEST(Rng,Rng)

and filter out anything below 2.5% or above 97.5%, this should give
equivalent results to the methods above (actually this is 1.96 sigma,
NORMSDIST(2)=0.977, NORMSDIST(3)=0.999).
 
Ad

Advertisements

L

Lori

Correction: instead of ZTEST use:

=NORMDIST(Rng,AVERAGE(Rng),STDEV(Rng),1)

(I just noticed ZTEST divides the standard deviation by the square root
of the number of observaions which is used for testing sample means not
observations)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top