Outlier analysis in excel - trimming means

  • Thread starter Thread starter Shane Lindsay
  • Start date Start date
S

Shane Lindsay

I would like to use excel for outlier analysis removal - to calculate a
mean of a range but not include or change values that are 2 or 3
standard deviations above or below the mean. Excel has a trimmedmean
function but outlier removal based on the above criteria is considered
better for normally distributed data.

I found these formula posted before on this forum. They don't include
values 2 SD above or below the mean. And they work!

=AVERAGE(IF((ABS(rng-AVERAGE(rng)))>(2*STDEV(rng)), "", rng))

=TRIMMEAN(Rng,COUNTIF(Rng,">"&(AVERAGE(Rng)+2*STDEVP(Rng)))/COUNT(Rng))

I also need to know how many data points were "removed", and then
report that as a percentage, and couldn't figure out to that (bearing
in mind they could be missing data points).

An alternative method of outlier analysis is to:
a. replace outliers with the 2 standard deviations above or below value
b. replace outliers with the mean

Any suggestions on how to do these things would be gratefully received.
 
If you are assuming normally distributed data - generally a good
approximation in large samples - you could use the ZTEST function to
give the two tailed P-value of the datapoints. Assuming data is down a
column you could fill down:

=ZTEST(Rng,Rng)

and filter out anything below 2.5% or above 97.5%, this should give
equivalent results to the methods above (actually this is 1.96 sigma,
NORMSDIST(2)=0.977, NORMSDIST(3)=0.999).
 
Correction: instead of ZTEST use:

=NORMDIST(Rng,AVERAGE(Rng),STDEV(Rng),1)

(I just noticed ZTEST divides the standard deviation by the square root
of the number of observaions which is used for testing sample means not
observations)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top