histograms with normal distributions

P

P.-S. ROSS

I am trying to make nice histograms in MS Excel, without
using any plug-ins. I want a normal probability curve
(having the same mean and standard deviation than my raw
data) superimposed on the histogram, like normal
statistical software do automatically.

First I compute the absolute frequencies of the classes
(from the raw data) using the FREQUENCY function. The
relative frequencies are calculated by dividing the
absolute frequencies by the number of data (n). A bar
chart of the relative frequency vs. class middle point
yields the histogram. I know there is a "histogram"
function in the analysis tool that does that, but
nevermind, this is not my problem.

So now I need a normal probability curve. First I
calculate the mean and st. dev. of my raw data using the
appropriate functions (AVERAGE, STDEV). Then I use the
function NORMDIST(x,mean,standard_dev,cumulative), where x
is an arbitrary value for which I want the normal
probability, and "cumulative" is a logical operator set
to "FALSE" to return the probability mass function rather
than the cumulative function. I use enough x values in the
relevant range to get a smooth probability curve, and this
makes a nice graph, with an horizontal axis directly
comparable to that of the histogram. The vertical axis of
this second graph, however, is not directly comparable to
that of the histogram. So my problem is how to adjust the
results of the function NORMDIST so that the values are
directly comparable to the relative frequencies in my
histogram.

And, idealy, how to merge the histogram and the normal
probability curve in a single diagram, using only MS Excel
(Office 2000 version).
 
A

acw

Hi

Have you tried putting the normal curve on the second axis
on the same graph. Excel will then correct for the
differences in the vertical axis.


Tony
 
P

P.-S. ROSS

-----Original Message-----
"Have you tried putting the normal curve on the second
axis on the same graph. Excel will then correct for the
differences in the vertical axis."

-----Reply-----
yes, but then the horizontal axis is a category axis
rather than the numerical axis for the normal distribution
(bell curve). Since the histogram has 4-5 categories,
whereas the bell curve has many more (they are not
categories, but are plotted that way), it does not work.

My main problem is to generate a normal curve with a
correct VERTICAL axis, equivalent to relative frequencies
of the histogram, and having the same mean and st. dev.
than the my raw data. I can then combine the two graphs
outside Excel.
 
M

Michael R Middleton

P.-S. ROSS -
... The vertical axis of this second graph, however, is not directly
comparable to that of the histogram. <

Use a secondary axis for the second series.
And, idealy, how to merge the histogram and the normal probability curve
in a single diagram, using only MS Excel (Office 2000 version). <

Use a combination chart (Column chart type for vertical bars and XY
(Scatter) chart type for the curve).

- Mike Middleton, www.usfca.edu/~middleton
 
G

GB

P.-S. ROSS said:
-----Original Message-----
"Have you tried putting the normal curve on the second
axis on the same graph. Excel will then correct for the
differences in the vertical axis."

-----Reply-----
yes, but then the horizontal axis is a category axis
rather than the numerical axis for the normal distribution
(bell curve). Since the histogram has 4-5 categories,
whereas the bell curve has many more (they are not
categories, but are plotted that way), it does not work.

My main problem is to generate a normal curve with a
correct VERTICAL axis, equivalent to relative frequencies
of the histogram, and having the same mean and st. dev.
than the my raw data. I can then combine the two graphs
outside Excel.

I think that you are asking a statistical question, rather than an Excel
question. Your question is not about how to plot the bell curve, but what
factor to multiply the probability mass function figures by. Is that right?

The problem is that the frequency of the bars in the histogram depends on
the number of categories that you divide the data into. Imagine that you had
lots of data and you plotted it first with five categories in your
histogram. The median category might have a frequency of 0.4 or thereabouts.
Now plot it again with 10 (smaller) categories and the median frequency will
go down to around 0.2. Plot it again with 100 categories and that becomes
0.02.

I think that you should take a practical view and I would suggest that you
multiply the probability mass function figures by
<median frquency>/NORMDIST(average,average,std,false).
average and std are the figures from your data.

There may be better ways of fitting all the data, but this should provide a
reasonable result especially as I get the impression that you don't have a
lot of data points.

Regards

Geoff
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top