PC Review


Reply
Thread Tools Rate Thread

Correlation Coefficient Issue

 
 
Diane
Guest
Posts: n/a
 
      15th Dec 2004
I have a formula in cell C7, the formula is:
CORREL(B4:B6,C4:C6)

The cells have the follwing values:
B4: 1.90
B5: 1.90
B6: 1.90
C4: 2.13
C5: 1.83
C6: 3.02

Cell C7 returns: 0.000

Here's where the problem comes in.
The formula in cell C15 is: CORREL(B12:B14,C12:C14)
The cells have the follwing values:
B12: 9.50
B13: 9.50
B14: 9.50
C12: 9.56
C13: 10.26
C13: 10.16

Cell C15 returns the error: #DIV/0!

Why is the formula producing the #DIV/0! error in the
second set and not the first or vice versa?

ANY help would be greatly appreciated.
Thanks, Diane

 
Reply With Quote
 
 
 
 
Jerry W. Lewis
Guest
Posts: n/a
 
      15th Dec 2004
See Help for CORREL worksheet function. The denominator is the product
of standard deviations for x and y. The standard deviation of B12:B14
is zero ...

Format B4:B6 to show 14 decimal places. You will find that they are not
all equal, hence the standard deviation is small, but not zero.

Jerry

Diane wrote:

> I have a formula in cell C7, the formula is:
> CORREL(B4:B6,C4:C6)
>
> The cells have the follwing values:
> B4: 1.90
> B5: 1.90
> B6: 1.90
> C4: 2.13
> C5: 1.83
> C6: 3.02
>
> Cell C7 returns: 0.000
>
> Here's where the problem comes in.
> The formula in cell C15 is: CORREL(B12:B14,C12:C14)
> The cells have the follwing values:
> B12: 9.50
> B13: 9.50
> B14: 9.50
> C12: 9.56
> C13: 10.26
> C13: 10.16
>
> Cell C15 returns the error: #DIV/0!
>
> Why is the formula producing the #DIV/0! error in the
> second set and not the first or vice versa?
>
> ANY help would be greatly appreciated.
> Thanks, Diane


 
Reply With Quote
 
Diane
Guest
Posts: n/a
 
      15th Dec 2004
If I increase B4:B6 to show 14 decimal places, the values
will only be 1.90000000000000 - Why would that make a
difference?
>-----Original Message-----
>See Help for CORREL worksheet function. The denominator

is the product
>of standard deviations for x and y. The standard

deviation of B12:B14
>is zero ...
>
>Format B4:B6 to show 14 decimal places. You will find

that they are not
>all equal, hence the standard deviation is small, but

not zero.
>
>Jerry
>
>Diane wrote:
>
>> I have a formula in cell C7, the formula is:
>> CORREL(B4:B6,C4:C6)
>>
>> The cells have the follwing values:
>> B4: 1.90
>> B5: 1.90
>> B6: 1.90
>> C4: 2.13
>> C5: 1.83
>> C6: 3.02
>>
>> Cell C7 returns: 0.000
>>
>> Here's where the problem comes in.
>> The formula in cell C15 is: CORREL(B12:B14,C12:C14)
>> The cells have the follwing values:
>> B12: 9.50
>> B13: 9.50
>> B14: 9.50
>> C12: 9.56
>> C13: 10.26
>> C13: 10.16
>>
>> Cell C15 returns the error: #DIV/0!
>>
>> Why is the formula producing the #DIV/0! error in the
>> second set and not the first or vice versa?
>>
>> ANY help would be greatly appreciated.
>> Thanks, Diane

>
>.
>

 
Reply With Quote
 
Jerry W. Lewis
Guest
Posts: n/a
 
      16th Dec 2004
I presume then that B4:B6 are calculated results. Because of decimal/
binary conversions, etc. it is possible to have slight differences
between calculated results that cannot be displayed. If you will post
(text only, please, no attachments) the formulas and the contents of
the referenced cells, I will endeavor to explain why these slight
differences are not arithmetic errors. Meanwhile, you will find that
at least one of the following formulae are not zero
=(B4-B5)
=(B4-B6)
=(B5-B6)
The parentheses in these formulas are required to see the differences
when the values are equal to 15 decimal places.

In cases such as this, rounding the results in B4:B6 is the only way to
get the expected result.

Jerry

Diane wrote:

> If I increase B4:B6 to show 14 decimal places, the values
> will only be 1.90000000000000 - Why would that make a
> difference?
>
>>-----Original Message-----
>>See Help for CORREL worksheet function. The denominator
>>

> is the product
>
>>of standard deviations for x and y. The standard
>>

> deviation of B12:B14
>
>>is zero ...
>>
>>Format B4:B6 to show 14 decimal places. You will find
>>

> that they are not
>
>>all equal, hence the standard deviation is small, but
>>

> not zero.
>
>>Jerry
>>
>>Diane wrote:
>>
>>
>>>I have a formula in cell C7, the formula is:
>>>CORREL(B4:B6,C4:C6)
>>>
>>>The cells have the follwing values:
>>>B4: 1.90
>>>B5: 1.90
>>>B6: 1.90
>>>C4: 2.13
>>>C5: 1.83
>>>C6: 3.02
>>>
>>>Cell C7 returns: 0.000
>>>
>>>Here's where the problem comes in.
>>>The formula in cell C15 is: CORREL(B12:B14,C12:C14)
>>>The cells have the follwing values:
>>>B12: 9.50
>>>B13: 9.50
>>>B14: 9.50
>>>C12: 9.56
>>>C13: 10.26
>>>C13: 10.16
>>>
>>>Cell C15 returns the error: #DIV/0!
>>>
>>>Why is the formula producing the #DIV/0! error in the
>>>second set and not the first or vice versa?
>>>
>>>ANY help would be greatly appreciated.
>>>Thanks, Diane
>>>

>>.
>>
>>


 
Reply With Quote
 
Jerry W. Lewis
Guest
Posts: n/a
 
      16th Dec 2004
As a possibly useful preview, enter 1.9 into A15
In A14 enter the formula
=A15+2^-51
In A15 enter the formula
=A15-2^-51
Copy cell A14 and paste over A1:A13
Copy cell A16 and paste over A17:A30
Format A1:A16 as numeric with 15 decimal places

This illustrates that (as documented) Excel will only display 15 digits
(14 decimal places with these numbers), yet there are 23 distinct binary
numbers that all display as 1.90000000000000

In your case your cell formulas resulted in at least two of these 23
distinct values, hence the resulting nonzero correlation.

Jerry

Jerry W. Lewis wrote:

> I presume then that B4:B6 are calculated results. Because of decimal/
> binary conversions, etc. it is possible to have slight differences
> between calculated results that cannot be displayed. If you will post
> (text only, please, no attachments) the formulas and the contents of
> the referenced cells, I will endeavor to explain why these slight
> differences are not arithmetic errors. Meanwhile, you will find that
> at least one of the following formulae are not zero
> =(B4-B5)
> =(B4-B6)
> =(B5-B6)
> The parentheses in these formulas are required to see the differences
> when the values are equal to 15 decimal places.
>
> In cases such as this, rounding the results in B4:B6 is the only way to
> get the expected result.
>
> Jerry
>
> Diane wrote:
>
>> If I increase B4:B6 to show 14 decimal places, the values will only be
>> 1.90000000000000 - Why would that make a difference?
>>
>>> -----Original Message-----
>>> See Help for CORREL worksheet function. The denominator

>>
>> is the product
>>
>>> of standard deviations for x and y. The standard

>>
>> deviation of B12:B14
>>
>>> is zero ...
>>>
>>> Format B4:B6 to show 14 decimal places. You will find

>>
>> that they are not
>>
>>> all equal, hence the standard deviation is small, but

>>
>> not zero.
>>
>>> Jerry
>>>
>>> Diane wrote:
>>>
>>>
>>>> I have a formula in cell C7, the formula is: CORREL(B4:B6,C4:C6)
>>>>
>>>> The cells have the follwing values:
>>>> B4: 1.90
>>>> B5: 1.90
>>>> B6: 1.90
>>>> C4: 2.13
>>>> C5: 1.83
>>>> C6: 3.02
>>>>
>>>> Cell C7 returns: 0.000
>>>>
>>>> Here's where the problem comes in.
>>>> The formula in cell C15 is: CORREL(B12:B14,C12:C14)
>>>> The cells have the follwing values:
>>>> B12: 9.50
>>>> B13: 9.50
>>>> B14: 9.50
>>>> C12: 9.56
>>>> C13: 10.26
>>>> C13: 10.16
>>>>
>>>> Cell C15 returns the error: #DIV/0!
>>>>
>>>> Why is the formula producing the #DIV/0! error in the second set and
>>>> not the first or vice versa?
>>>>
>>>> ANY help would be greatly appreciated.
>>>> Thanks, Diane


 
Reply With Quote
 
Diane
Guest
Posts: n/a
 
      16th Dec 2004
Thanks for your help - Below are the values:

The following are manual entries:
B4: 1.90
B5: 1.90
B6: 1.90

The following are manual entries:
C4: 2.13
C5: 1.83
C6: 3.02

Formula in C7:
=CORREL(B4:B6,C4:C6)
The above returns 0.000

The following are manual entries:
B12: 9.50
B13: 9.50
B14: 9.50

The following are manual entries:
C12: 9.56
C13: 10.26
C14: 10.16

Formula in C15:
=CORREL(B12:B14,C12:C14)
The above returns #DIV/0!

Thank you again for your help


>-----Original Message-----
>I presume then that B4:B6 are calculated results.

Because of decimal/
>binary conversions, etc. it is possible to have slight

differences
>between calculated results that cannot be displayed. If

you will post
>(text only, please, no attachments) the formulas and the

contents of
>the referenced cells, I will endeavor to explain why

these slight
>differences are not arithmetic errors. Meanwhile, you

will find that
>at least one of the following formulae are not zero
> =(B4-B5)
> =(B4-B6)
> =(B5-B6)
>The parentheses in these formulas are required to see

the differences
>when the values are equal to 15 decimal places.
>
>In cases such as this, rounding the results in B4:B6 is

the only way to
>get the expected result.
>
>Jerry
>
>Diane wrote:
>
>> If I increase B4:B6 to show 14 decimal places, the

values
>> will only be 1.90000000000000 - Why would that make a
>> difference?
>>
>>>-----Original Message-----
>>>See Help for CORREL worksheet function. The

denominator
>>>

>> is the product
>>
>>>of standard deviations for x and y. The standard
>>>

>> deviation of B12:B14
>>
>>>is zero ...
>>>
>>>Format B4:B6 to show 14 decimal places. You will find
>>>

>> that they are not
>>
>>>all equal, hence the standard deviation is small, but
>>>

>> not zero.
>>
>>>Jerry
>>>
>>>Diane wrote:
>>>
>>>
>>>>I have a formula in cell C7, the formula is:
>>>>CORREL(B4:B6,C4:C6)
>>>>
>>>>The cells have the follwing values:
>>>>B4: 1.90
>>>>B5: 1.90
>>>>B6: 1.90
>>>>C4: 2.13
>>>>C5: 1.83
>>>>C6: 3.02
>>>>
>>>>Cell C7 returns: 0.000
>>>>
>>>>Here's where the problem comes in.
>>>>The formula in cell C15 is: CORREL(B12:B14,C12:C14)
>>>>The cells have the follwing values:
>>>>B12: 9.50
>>>>B13: 9.50
>>>>B14: 9.50
>>>>C12: 9.56
>>>>C13: 10.26
>>>>C13: 10.16
>>>>
>>>>Cell C15 returns the error: #DIV/0!
>>>>
>>>>Why is the formula producing the #DIV/0! error in the
>>>>second set and not the first or vice versa?
>>>>
>>>>ANY help would be greatly appreciated.
>>>>Thanks, Diane
>>>>
>>>.
>>>
>>>

>
>.
>

 
Reply With Quote
 
hrlngrv - ExcelForums.com
Guest
Posts: n/a
 
      17th Dec 2004
Diane wrote..
>Thanks for your help - Below are the values


>The following are manual entries
>B4: 1.9
>B5: 1.9
>B6: 1.9


>The following are manual entries
>C4: 2.1
>C5: 1.8
>C6: 3.0


>Formula in C7
>=CORREL(B4:B6,C4:C6
>The above returns 0.00

..

This is an error. FWIW, STDEV(B4:B6) returns 2.98023E-08 in XL97SR-2
but that's wrong because the standard deviation of a set of an
number of instances of the same number should be zero

This appears to be nothing more than one of the nastier instances o
Excel's poor numerical implementation in older versions

However, in all instances, if B4:B6 has zero variance, so zer
standard deviation, then it's correlation with any other set o
values is undefined. Putting this another way, there's no such thin
as correlation between constants and random variables or betwee
different constants
--------
www.coffeecozy.co

Use your Bodum and give up cold coffee for good!
 
Reply With Quote
 
Jerry W. Lewis
Guest
Posts: n/a
 
      17th Dec 2004
Per Help for CORREL and DEVSQ,
=CORREL(B4:B6,C4:C6)
is equivalent to
=COVAR(B4:B6,C4:C6) / ( SQRT(DEVSQ(B4:B6)/COUNT(B4:B6)) *
SQRT(DEVSQ(C4:C6)/COUNT(C4:C6)) )

=DEVSQ(B12:B14) is zero but =DEVSQ(B4:B6) is not zero, hence the
difference in CORREL results.

=DEVSQ(B4:B6) is not zero because =(B4-AVERAGE(B4:B6)) is not zero.

That can happen when all of the values are identical, because computer
math is done to finite precision. Imagine that you had a decimal
computer that carried only 4 figures in calculations. In that case,
SUM(0.4444,0.4444,0.4444) would be 1.333 since 1.3332 would require a
5th figure. Thus AVERAGE(0.4444,0.4444,0.4444) would be 0.4443 instead
of 0.4444 and hence (0.4444-AVERAGE(0.4444,0.4444,0.4444)) would not be
zero.

You are probably wondering how this is applicable, when your repeated
number is 1.9 since Excel advertises 15 digit precision. The reason is
that Excel (and almost all other general purpose hardware and software)
does binary math, not decimal math. Most decimal fractions (including
1.9) is a non-terminating binary fraction, just as 1/3 is a
non-terminating decimal fraction. The trailing bit pattern in the
binary approximation to 1.9 is such that there is lost information when
you add 3 copies of 1.9 (crossing 2 powers of 2) just as in my
hypothetical example. If you chase through the details of the IEEE 754
standard, you will find that =(B4-AVERAGE(B4:B6)) should equal
1/4503599627370496 which is 2.220446049250313080847263336181640625E-16.
Excel displays that difference as 2.22044604925031E-16, indicating that
it is performing the arithmetic correctly.

9.5 is exactly representable in binary (an integer plus a power of 2),
hence finite precision arithmetic matches the infinite precision
arithmetic that you would mentally do.

Jerry

Diane wrote:

> Thanks for your help - Below are the values:
>
> The following are manual entries:
> B4: 1.90
> B5: 1.90
> B6: 1.90
>
> The following are manual entries:
> C4: 2.13
> C5: 1.83
> C6: 3.02
>
> Formula in C7:
> =CORREL(B4:B6,C4:C6)
> The above returns 0.000
>
> The following are manual entries:
> B12: 9.50
> B13: 9.50
> B14: 9.50
>
> The following are manual entries:
> C12: 9.56
> C13: 10.26
> C14: 10.16
>
> Formula in C15:
> =CORREL(B12:B14,C12:C14)
> The above returns #DIV/0!
>
> Thank you again for your help
>
>
>
>>-----Original Message-----
>>I presume then that B4:B6 are calculated results.
>>

> Because of decimal/
>
>>binary conversions, etc. it is possible to have slight
>>

> differences
>
>>between calculated results that cannot be displayed. If
>>

> you will post
>
>>(text only, please, no attachments) the formulas and the
>>

> contents of
>
>>the referenced cells, I will endeavor to explain why
>>

> these slight
>
>>differences are not arithmetic errors. Meanwhile, you
>>

> will find that
>
>>at least one of the following formulae are not zero
>> =(B4-B5)
>> =(B4-B6)
>> =(B5-B6)
>>The parentheses in these formulas are required to see
>>

> the differences
>
>>when the values are equal to 15 decimal places.
>>
>>In cases such as this, rounding the results in B4:B6 is
>>

> the only way to
>
>>get the expected result.
>>
>>Jerry
>>
>>Diane wrote:
>>
>>
>>>If I increase B4:B6 to show 14 decimal places, the
>>>

> values
>
>>>will only be 1.90000000000000 - Why would that make a
>>>difference?
>>>
>>>
>>>>-----Original Message-----
>>>>See Help for CORREL worksheet function. The
>>>>

> denominator
>
>>>is the product
>>>
>>>
>>>>of standard deviations for x and y. The standard
>>>>
>>>>
>>>deviation of B12:B14
>>>
>>>
>>>>is zero ...
>>>>
>>>>Format B4:B6 to show 14 decimal places. You will find
>>>>
>>>>
>>>that they are not
>>>
>>>
>>>>all equal, hence the standard deviation is small, but
>>>>
>>>>
>>>not zero.
>>>
>>>
>>>>Jerry
>>>>
>>>>Diane wrote:
>>>>
>>>>
>>>>
>>>>>I have a formula in cell C7, the formula is:
>>>>>CORREL(B4:B6,C4:C6)
>>>>>
>>>>>The cells have the follwing values:
>>>>>B4: 1.90
>>>>>B5: 1.90
>>>>>B6: 1.90
>>>>>C4: 2.13
>>>>>C5: 1.83
>>>>>C6: 3.02
>>>>>
>>>>>Cell C7 returns: 0.000
>>>>>
>>>>>Here's where the problem comes in.
>>>>>The formula in cell C15 is: CORREL(B12:B14,C12:C14)
>>>>>The cells have the follwing values:
>>>>>B12: 9.50
>>>>>B13: 9.50
>>>>>B14: 9.50
>>>>>C12: 9.56
>>>>>C13: 10.26
>>>>>C13: 10.16
>>>>>
>>>>>Cell C15 returns the error: #DIV/0!
>>>>>
>>>>>Why is the formula producing the #DIV/0! error in the
>>>>>second set and not the first or vice versa?
>>>>>
>>>>>ANY help would be greatly appreciated.
>>>>>Thanks, Diane
>>>>>
>>>>>
>>>>.
>>>>
>>>>
>>>>

>>.
>>
>>


 
Reply With Quote
 
Jerry W. Lewis
Guest
Posts: n/a
 
      17th Dec 2004
CORREL() calculates like DEVSQ(), not like pre-2003 STDEV(). As noted
in another post in this thread, =(B4-AVERAGE(B4:B6)) is not zero. That
is why I prefer updating algorithms to two pass alogorithms,
http://groups.google.com/groups?selm...0no_e-mail.com
which would have no truncation error in either case.

Jerry

hrlngrv - ExcelForums.com wrote:

> Diane wrote...
>
>>Thanks for your help - Below are the values:
>>
>>The following are manual entries:
>>B4: 1.90
>>B5: 1.90
>>B6: 1.90
>>
>>The following are manual entries:
>>C4: 2.13
>>C5: 1.83
>>C6: 3.02
>>
>>Formula in C7:
>>=CORREL(B4:B6,C4:C6)
>>The above returns 0.000
>>

> ...
>
> This is an error. FWIW, STDEV(B4:B6) returns 2.98023E-08 in XL97SR-2,
> but that's wrong because the standard deviation of a set of any
> number of instances of the same number should be zero.
>
> This appears to be nothing more than one of the nastier instances of
> Excel's poor numerical implementation in older versions.
>
> However, in all instances, if B4:B6 has zero variance, so zero
> standard deviation, then it's correlation with any other set of
> values is undefined. Putting this another way, there's no such thing
> as correlation between constants and random variables or between
> different constants.


 
Reply With Quote
 
 
 
Reply

Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Correlation-Coefficient daniel chen Microsoft Excel Misc 3 31st Jan 2006 07:59 AM
spearman correlation coefficient =?Utf-8?B?TWFyY2VsIExhYm9udMOp?= Microsoft Excel Misc 1 12th May 2005 04:00 AM
Correlation coefficient (r) Phil C Microsoft Excel Charting 3 14th Aug 2004 04:27 AM
Correlation coefficient for log plot.. Phil C Microsoft Excel Charting 2 6th Aug 2004 08:34 AM
correlation coefficient when I have totals for each data pair =?Utf-8?B?ZGFu?= Microsoft Excel Misc 1 15th Mar 2004 05:07 AM


Features
 

Advertising
 

Newsgroups
 


All times are GMT +1. The time now is 11:34 PM.