25th percentile error

  • Thread starter Thread starter Ank the Tank
  • Start date Start date
Hello,

Which version are you using?

I get 216.75 with Excel 2003.

Which is what you think is correct...

Regards,
Bernd
 
I'm using Excel 2007, and it calculates it out to be 216.645, when it
should be 0.

Bernie, thank you for the link, I will check it out.
 
I don't know if anyone has run into this problem yet or not, but I
took a list with a bunch of 0, and Excel did not calculate the 25th
percentile correctly. Here is the list of numbers, and the result that
excel gave, along with what Google Spreadsheet says:http://spreadsheets.google.com/pub?key=pAx-2J0plOyvU7reZfyFGNA

Any ideas?

I have tried this in Excel 2000 and 2007, Open Office, Gnumeric, and
123.

All gave a result of 216.75

This link:
http://www.itl.nist.gov/div898/handbook/prc/section2/prc252.htm

shows why the answer is 216.75 and not 0, using the percentile
definition used by all the spreadsheets I looked at.

216.75 is the value of the 6th item in the list + 0.25 * (7th - 6th) =
0 + 0.25 * 867
 
Are you certain that the values are as you reported, rather than formatted
values with hidden non-zero decimal values? Alternately, you could try the
array formula
=PERCENTILE(ROUND(data,0),0.25)
entered with Ctrl-Shift-Enter to ensure that you are actually using the
posted data. The discrepancy between your reported value of 216.645 and
Doug's reported value of 216.75 for the reported data in Excel 2007 is easily
within the uncertainty that could have been introduced by formatting. The
basis is not clear for your expectation in the follow-up post that the 25th
percentile should be zero.

AFAIK all versions of Excel (at least since v4) have used the same algorithm
for Percentile, which should give 216.75 for the reported data. Note however
that there is no universally accepted definition of a percentile from sample
data. Hyndman and Fan, 1996, "Sample Quantiles in Statistical Packages", The
American Statistician 50(4):361-365 discuss 9 different definitions and
reference some others. Excel uses Hyndman and Fan's 7th definition, which
considers the min and max to be the 0th and 100th percentiles and equally
spaces the quantiles corresponding to intervening observations. This is a
reasonable description of the sample, but is almost certainly biased as an
estimate of the underlying population quantiles.

Jerry
 
The discrepancy between your reported value of 216.645 and
Doug's reported value of 216.75 for the reported data in Excel 2007 is easily
within the uncertainty that could have been introduced by formatting.
Jerry


Jerry - I don't see how a result of 216.75 displaying as 216.645 could
be a result of formatting. Can you give an example of how that could
happen?
 
Jerry - I don't see how a result of 216.75 displaying as 216.645 could
be a result of formatting. Can you give an example of how that could
happen?

Don't worry - I've worked it out.

If the $867 value in Excel is actually $866.58, formatted to show 0
decimal places, the (correct) 25 percentile will be 216.645.

That is most probably what has happened.
 
Back
Top