Standard deviation calculation error...

Guest · Oct 16, 2007

Hi,

i was reviewing my preset worksheet with the standard deviation function and
noticed that the out put result contains error.

Formula "=STDEV(K12:K111)" was set to to calculate standard deviation
between 0-100 sets of values in the column, where values can be whole numbers
or fractions, in respective columns. Formula was observed performing
perfectly with whole numbers, however, with fractions (or numbers with
decimal places), the output is invalid (in some sense).

You may test, with 60 sets of values "1" in a column, the STDEV is "0".
However, when applying the same formula to a fraction value "0.05", 60 sets
in a column, the STDEV is "4.9E-17". This is definitely a NO NO answer in
mathematics point of view. If you breakdown the calculation steps of a STDEV
formula "s2 = (âˆ‘(x-m)^2)/N" into multiple columns, it would be identified
that the culprit lies on the "âˆ‘(X-M)^2" formula, where x is the individual
value and m is the mean.

Example below demonstrates the breakdown of the column calculations:
x m N âˆ‘(x-m)2 s2 s
0.05 0.05 1 0 0 0
0.05 0.05 2 0 0 0
0.05 0.05 3 1.44E-34 4.81E-35 6.94E-18
0.05 0.05 4 0 0 0
0.05 0.05 5 0 0 0
0.05 0.05 6 2.89E-34 4.81E-35 6.94E-18
0.05 0.05 7 3.37E-34 4.81E-35 6.94E-18
:
:
0.05 0.05 58 1.01E-31 1.73E-33 4.16E-17
0.05 0.05 59 1.02E-31 1.73E-33 4.16E-17
0.05 0.05 60 1.42E-31 2.36E-33 4.86E-17

*Note that the âˆ‘(x-m)2 is calculated using array formula
"{=SUM(($A$2:A61-B61)^2)}", since mean is constantly changing.

Is this due to limitation of the excel formula (STDEV only works for whole
numbers)? Or is there any patch for this error?

Guest · Oct 16, 2007

Hi,

You may test, with 60 sets of values "1" in a column, the STDEV is "0".
However, when applying the same formula to a fraction value "0.05", 60 sets

i>n a column, the STDEV is "4.9E-17".

On my machine Excel calculates the standard deviation od 60 data points of
0.05 correctly as zero. It only appears as 4.9E-17 if the cell is formatted
as General. Format as number to get the correct answer.

Mike

JE McGimpsey · Oct 16, 2007

No, STDEV "works" for fractional as well as whole numbers, but you are
seeing the limitation of IEEE Double Precision Floating Point math,
which is used by XL and most other spreadsheets. There's no patch,
because XL can't determine whether the small rounding error is real or
an artifact.

Nearly all numbers cannot be exactly represented in a finite number of
binary digits, just as the number 1/3 cannot be represented in a finite
number of decimal digits. So you should expect small errors due to
rounding. You can use ROUND() to filter out these errors, e.g.,

=ROUND(STDEV(A2:A61),9)

See

http://cpearson.com/excel/rounding.aspx

for more details

JE McGimpsey · Oct 16, 2007

Changing format does nothing to the underlying value in the cell. Your
using the Number format simply is using the display engine to round the
value in the display.

It won't change the result, however:

B1: =STDEV(A2:A60)
B2: =B1=0

If you increase the number of displayed decimals, you'd find that you'd
display 0.000000000000000049

Guest · Oct 19, 2007

Hi Mike,

Thanks for the reply, i've also tested the setting. However, agreeable with
McGimpsey's reply (after yours), the result is still not zero. Is ok...accept
McGimpsey's sharing on the limitation...just have to use the rounding feature.

Thanks thought

Guest · Oct 19, 2007

Hi McGimpsey,

Thanks for the information shared, interesting to know one of the the
calculation backbone. Think will just have to use the ROUND function to close
the loop.

Thanks again.

Guest · Oct 21, 2007

While you are correct that this is a direct result of the limitations of IEEE
standard double precision (common to almost all general purpose software),
and the change in 2003 to a 2-pass algorithm for STDEV, VAR, etc is a welcome
numerical improvement, there is still room for improvement in Excel's
algorithms here. If Excel used updating algorithms
http://groups.google.com/group/microsoft.public.excel.misc/msg/4c6ee0c636ad016a
for AVERAGE, STDEV, COVAR, etc, then STDEV would always return zero for
constant data, GEOMEAN would never overflow unless individual observations
overflowed, ...

Jerry

JE McGimpsey · Oct 21, 2007

Thanks for the correction/amplification!

You're right - I should have said that the result is limited by the
combination of the inherent limits of finite precision representation
*and* the quality of implementation of the algorithm(s) involved.

Insert value based on range	3	Oct 29, 2008
Frequency distribution and descriptive statististics	6	Oct 22, 2009
WCG Stats Monday 21 August 2023	3	Aug 21, 2023
WCG Stats Thursday 24 August 2023	3	Aug 24, 2023
Calculating Sortino Ration (Downside Deviation)	9	Aug 26, 2008
Standard deviation of specific cells in a column	1	Apr 4, 2007
#N/A error	2	Nov 20, 2003
Find value, copy, compute, and more	5	Dec 3, 2004

Standard deviation calculation error...

Guest

Guest

JE McGimpsey

JE McGimpsey

Guest

Guest

Guest

JE McGimpsey

Ask a Question

Similar Threads