Excel 2003 Linest Function Bug

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

The Linest Function has severe errors when curve fitting my data as follows:

X Y
2499.89 -0.240420
4499.83 -0.432740
6499.77 -0.625070
8999.69 -0.865580
11499.83 -1.106090
13499.91 -1.298610
15999.74 -1.539220
18499.93 -1.779800
20500.05 -1.972495
21999.93 -2.116850
2499.89 -0.240400
4499.83 -0.432740
6499.77 -0.625070
8999.69 -0.865535
11499.83 -1.106125
13499.91 -1.298610
15999.74 -1.539235
18499.93 -1.779850
20500.05 -1.972410
21999.93 -2.116830
2499.89 -0.240430
4499.83 -0.432770
6499.77 -0.625150
8999.69 -0.865625
11499.83 -1.106175
13499.91 -1.298680
15999.74 -1.539240
18499.93 -1.779895
20500.05 -1.972440
21999.93 -2.116840

D C B
A
4.384643E-17 -4.849642E-12 -9.613158E-05 -7.173873E-05

If you change the first 'Y' value to -0.240419 (the real value), the Linest
Equation blows up and is wrong, (Value of 'D' is zero). The results are
below:
D C B A
0 -4.83784E-12 -9.61317E-05 8.03804E-05

Is there any fix by Microsoft in the works or is there something I can do to
ensure data sent to the customer is correct?
 
Your solution may be ill-conditioned. Plot your data and it appears nearly
linear.

If you fit using only two parameters (slope & intercept) you get
a = -9.62307E-05
b = 0.000329459



2499.89 -0.24042 -0.240236766 3.35746E-08
4499.83 -0.43274 -0.432692441 2.26186E-09
6499.77 -0.62507 -0.625148116 6.10209E-09
8999.69 -0.86558 -0.865717228 1.88316E-08
11499.83 -1.10609 -1.106307511 4.73112E-08
13499.91 -1.29861 -1.298776658 2.7775E-08
15999.74 -1.53922 -1.53933711 1.37148E-08
18499.93 -1.7798 -1.779932205 1.74781E-08
20500.05 -1.972495 -1.972405201 8.06385E-09
21999.93 -2.11685 -2.11673974 1.21573E-08
2499.89 -0.2404 -0.240236766 2.66453E-08
4499.83 -0.43274 -0.432692441 2.26186E-09
6499.77 -0.62507 -0.625148116 6.10209E-09
8999.69 -0.865535 -0.865717228 3.32071E-08
11499.83 -1.106125 -1.106307511 3.33104E-08
13499.91 -1.29861 -1.298776658 2.7775E-08
15999.74 -1.539235 -1.53933711 1.04265E-08
18499.93 -1.77985 -1.779932205 6.75762E-09
20500.05 -1.97241 -1.972405201 2.30297E-11
21999.93 -2.11683 -2.11673974 8.14689E-09
2499.89 -0.24043 -0.240236766 3.73393E-08
4499.83 -0.43277 -0.432692441 6.01539E-09
6499.77 -0.62515 -0.625148116 3.55003E-12
8999.69 -0.865625 -0.865717228 8.50604E-09
11499.83 -1.106175 -1.106307511 1.75593E-08
13499.91 -1.29868 -1.298776658 9.34286E-09
15999.74 -1.53924 -1.53933711 9.43036E-09
18499.93 -1.779895 -1.779932205 1.38419E-09
20500.05 -1.97244 -1.972405201 1.21097E-09
21999.93 -2.11684 -2.11673974 1.00521E-08
4.4277E-07
Where the third column is the linear model and the fourth column is the
square of the error.

It may not be meaningful to go beyond two parameters.
 
Jim,

How are you using LINEST? LINEST returns m and b, neither of which is greatly affected by the
change from -0.24042 to -0.240419....

I get:

Y = -9.622911E-05 + 0.000351*X

and

Y = -9.622914E-05 + 0.000352*X

as the straight lines returned by LINEST using your data....

HTH,
Bernie
MS Excel MVP
 
I guess I should clarify the accuracy required. These are high precision
Load Cell calibrations which we hope are in the order of 0.005 percent
accurate. While the curve is almost linear, the second order is very
important and the third less so, but we supply the equation to our
customers. It would be nice to know that it is accurate.
 
Hello
I am using Linest as follows:
=LINEST(C2:C31,B2:B31^{1,2,3},,TRUE)

Highlight 4 cells, cut and paste this equation into the window and do a
control-shift-enter to get out the array of the terms A, B, C, & D in the
curve fit equation:

Y = A + Bx +Cx^2 +Dx^3

As mentioned in the other reply, my data is almost a linear fit, but due to
the accuracy required, I need more terms than Y = A + Bx
Cheers
Jim
--
Jim
National Research Council of Canada


Bernie Deitrick said:
Jim,

How are you using LINEST? LINEST returns m and b, neither of which is greatly affected by the
change from -0.24042 to -0.240419....

I get:

Y = -9.622911E-05 + 0.000351*X

and

Y = -9.622914E-05 + 0.000352*X

as the straight lines returned by LINEST using your data....

HTH,
Bernie
MS Excel MVP
 
Jim,

With your posted data set, I get
4.38464E-17 -4.84964E-12 -9.61316E-05 -7.17387E-05

Changing the first value to .240419 from 24042, I get
4.35682E-17 -4.83784E-12 -9.61317E-05 -7.11578E-05

So I cannot replicate your error.

But, based on the posted data, your greatest absolute error in repeatability is
0.01280%
while the greatest absolute error in your curve fit is:
0.0083549%

So I would think that your 0.005% requirement is optimistic, at best. At this point, if I were you,
I would bring in a statistician (we have some PhD's on staff that help us with this stuff, so I'm
not as good at stats as I should be) to find out what your estimated accuracy/repeatability actually
is, based on your data set.

HTH,
Bernie
MS Excel MVP


JimK said:
Hello
I am using Linest as follows:
=LINEST(C2:C31,B2:B31^{1,2,3},,TRUE)

Highlight 4 cells, cut and paste this equation into the window and do a
control-shift-enter to get out the array of the terms A, B, C, & D in the
curve fit equation:

Y = A + Bx +Cx^2 +Dx^3

As mentioned in the other reply, my data is almost a linear fit, but due to
the accuracy required, I need more terms than Y = A + Bx
Cheers
Jim
 
Bernie Deitrick said:
Jim,

With your posted data set, I get
4.38464E-17 -4.84964E-12 -9.61316E-05 -7.17387E-05

Changing the first value to .240419 from 24042, I get
4.35682E-17 -4.83784E-12 -9.61317E-05 -7.11578E-05

So I cannot replicate your error.

Because you are not using Excel 2003.

Jerry
 
Several examples have been published in these newsgroups were LINEST in Excel
2003 can incorrectly return zero for one or more parameters. These issues
appear to have been fixed in Excel 2007 beta.

There is a post SP-2 hotfix that is supposed to correct some problem with
LINEST
http://support.microsoft.com/kb/887964
but the description of that problem is very obscurely written if it is
intended to fix this issue.

All of the zero parameter estimate problems I have seen up until now dealt
with a well conditioned but essentially orthogonal x matrix where columns of
the x-matrix had the same norm. Your example has a very ill-conditioned x
matrix where the zeroed parameter is very small relative to the other
parameters. In both cases, my best guess is that it is an unfortunate
excessive helpfulness along the lines of =A1-A2 returning zero if A1 and A2
are not exactly equal, but are equal when rounded to 15 decimal figures, as
was introduced in Excel 97
http://support.microsoft.com/kb/78113

Jerry
 
I went back through about 40 past calibration data sets and quickly realized
the old curve fit data, which was calculated and stored using Excel 2000, was
now being corrupted using Excel 2003, (with all its patches), (about 50% of
the data curve fits were wrong). The quick solution, which will keep our
customers happy, is to open all of the data files in Excel 2000, save them,
and write a pdf backup file so this can not happen in the future. Thanks for
the help and a big raspberry to Microsoft for taking a perfectly good math
function and making it so it does not work any more on my data sets.
 
Jim -

Just curious: Why are you using a cubic to fit data that are obviously
linear?

- Mike
 
Hello Mike

We calibrate load cells using dead weights with a known mass to 0.002
percent. We have a standard Excel calibration file which the data is
entered. Not all load cells are this linear. The customer's load cells are
then used to calibrate other people's load cells. I agree that for the
example data, the third order is a bit of a waste of time, but I need a
standard procedure, not one that depends whether the load cell is perfectly
linear or not. If you were weighing gold, I suppose it does make a
difference in the long run.....(they are not)

Jim
 
Back
Top