how do I identify statistical outliers with excel?

J

joeu2004

Jeff said:
Subject: how do I identify statistical outliers with excel?

You start by putting your inquiry in the __body__ of your posting, not
just in the subject line. Some news readers -- notably
http://www.microsoft.com/communities/newsgroups/en-us/default.aspx --
do not show the entire subject line, and they do not permit posting
responses when the body of the posting is empty.

I will offer some answers to your question if you promise not to use
this information to blindly and/or automatically exclude outliers from
your data. There are many reasons for deprecating automatic exclusion;
some people even deprecate excluding outliers manually. The primary
reason for excluding outliers is if you believe they represent data
entry errors. But some people argue that even that is not a good
reason, pointing to some unusual historical anecdotes.

Two numerical methods are commonly used to identify outliers: (1)
relying on the interquartile range (IQR); and (2) relying on the
standard deviation (SD). I prefer #1; #2 seems valid only when the
distribution is known to be "normal". In either case, it is best to
graph the data to confirm that the data are truly outliers in a
subjective sense. Often, "outliers" are merely indicative of extreme
skew in the distribution.

One way to graph the data in Excel is to select the data, then click on
the Chart Wizard on the toolbar and select the XY (Scatter) chart type.
Some writers suggest using histograms for this purpose. I disagree
because the shape of a histogram is greatly affected by the width of
the bars with respect to the data distribution, which is an art and
error-prone.

One definition of an outlier is that it is outside"x" times the
interquartile range from first or third quartile. "X" is typically 1.5
(mild) or 3 (extreme). I prefer 3 because it is more conservative. In
Excel, the following compute the first and third quartiles in A1 and A2
and the interquartile range (IQR) in B1 for the data in D1:D100, for
example:

A1: =QUARTILE(D1:D100,1)
A2: =QUARTILE(D1:D100,3)
B1: =A2 - A1

Then the following determines if D1 is an "extreme" outlier:

=if(or(D1 < A1 - 3*B1, D1 > A2 + 3*B1), "outlier?", "")

Another definition of an outlier is that it is outside "x" standard
deviations (SD) from the average. "X" is typically 3 or 4. I would
prefer 4 because it is more consistent with the definition of "extreme
outlier" above. In Excel, if A1 contains =AVERAGE(D1:D100) and B1
contains =STDEV(D1:D100), for example, the following determines if D1
is an outlier:

=if(or(D1 < A1 - 4*B1, D1 > A1 + 4*B1), "outlier?", "")
 
J

joeu2004

Don said:
And you should reply to the post instead of starting a new one.

And exactly how would you propose that I accomplish that when "some
news readers" -- you might imagine that means the one(s) that I use --
"do not permit posting responses when the body of the posting is
empty"? Klunk!

For the reading-impaired, let me expand on my explanation. I found the
original posting only by going to
http://www.microsoft.com/communities/newsgroups/en-us/default.aspx .
It does not appear in Google Groups. (That is a common problem, not
limited to "empty" postings.) But the MS web site does not permit
posting a response to "empty" postings.

So my choice was not to respond at all or post a new thread. I chose
not only to help the OP, but also to educate readers about why they
might not see responses to their "empty" postings.

PS: If I duplicated the title exactly, most news readers with
"threading" capability will make my posting look like a response, not a
new thread, if they have the original posting. However, I must confess
that I do not know if I duplicated the title exactly because the MS web
site does not show the entire title. The most that I can see is "how
do I identify statistical outliers with e...". In fact, viewing the
original posting another way, I just discovered that the title is at
least "how do I identify statistical outliers with excel in". I
suspect there is more after the word "in".
 
J

Jon Peltier

For the newsreader-impaired, the MS web interface is not the best newsreader
you can find (as you've clearly documented), even though it's how MS funnels
new users to the forums via Help and via their web site. Even Outlook
Express is better, and it allows you to view and reply to any post. Here is
Chip Pearson's article about linking to the newsgroups using an NNTP
newsreader (such as OE, Netscape, Agent, and numerous others):

http://cpearson.com/excel/DirectConnect.htm

- Jon
 
J

joeu2004

Jon said:
For the newsreader-impaired, the MS web interface is not the best newsreader

You get no argument from me. I use it only as a last resort and only
because I have not set up any other news reader on my computer.

PS: I said:
If I duplicated the title exactly, most news readers with
"threading" capability will make my posting look like a response [....].
[.... V]iewing the original posting another way, I just discovered that
the title is at least "how do I identify statistical outliers with excel in". I
suspect there is more after the word "in".

My mistake. In my rush to get off the computer earlier, I misread the
UI. The word "in" is not part of the original title. However, the
original title does not have "?" after the word "excel". My bad!
 
D

Don Guillett

In Outlook Express
tools>accounts>news>add>news>msnews.microsoft.com

--
Don Guillett
SalesAid Software
(e-mail address removed)
Jon said:
For the newsreader-impaired, the MS web interface is not the best
newsreader

You get no argument from me. I use it only as a last resort and only
because I have not set up any other news reader on my computer.

PS: I said:
If I duplicated the title exactly, most news readers with
"threading" capability will make my posting look like a response [....].
[.... V]iewing the original posting another way, I just discovered that
the title is at least "how do I identify statistical outliers with excel
in". I
suspect there is more after the word "in".

My mistake. In my rush to get off the computer earlier, I misread the
UI. The word "in" is not part of the original title. However, the
original title does not have "?" after the word "excel". My bad!
 
J

joeu2004

Don said:
You get no argument from me. I use it only as a last resort and only
because I have not set up any other news reader on my computer.
[....]
In Outlook Express
tools>accounts>news>add>news>msnews.microsoft.com

I did not say I do not know how to set it up. I simply said I did not
set it up. I do not like the OE news reader, nor the others mentioned.
I have my reasons for that.
 
D

Don Guillett

OK

--
Don Guillett
SalesAid Software
(e-mail address removed)
Don said:
Jon Peltier wrote:
For the newsreader-impaired, the MS web interface is not the best
newsreader

You get no argument from me. I use it only as a last resort and only
because I have not set up any other news reader on my computer.
[....]
In Outlook Express
tools>accounts>news>add>news>msnews.microsoft.com

I did not say I do not know how to set it up. I simply said I did not
set it up. I do not like the OE news reader, nor the others mentioned.
I have my reasons for that.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top