trendline equation doesn't fit data

G

Guest

I fit a trendline to some data I have, shown below.
809920.89 1688.50
811026.35 1685.90
812370.50 1680.80
813553.47 1673.80
814357.32 1670.65

the equation given by excel (polynomial 2nd degree fit) is
y=-4e-7x^2+.6676x-267718

however, if you plug the first x point (809920) into this equation, the
value given is over 10,000.

where is the problem?
the trendline appears to fit, but the equation given doesn't fit at all
 
G

Guest

How did you get the fit ?? Using Solver I get:

y=0x^2 - 0.003024729x + 4136.60627

with fitted data:

809920.89 1686.815012
811026.35 1683.471295
812370.5 1679.405605
813553.47 1675.827441
814357.32 1673.396013
 
G

Guest

i just added a trendline, using a 2nd order polynomial fit to better match
the curve of the data. r^2 = .994.
 
G

Guest

I don't understand how your r-squared can be so good if the variation at a
single point is so large. Using the simple linear fit I posted, here are the
x-values, y-values, fitted y-values and the variation-squared at each point:

809920.89 1688.5 1686.815012 2.839185485
811026.35 1685.9 1683.471295 5.898609287
812370.5 1680.8 1679.405605 1.944336986
813553.47 1673.8 1675.827441 4.110518666
814357.32 1670 1673.396013 11.53290394
 
G

Guest

well, i think that the high variation at individual points is due to an error
in excel, i don't believe that the equation it is giving me is the same one
that it is plotting. It isn't set up so the trendline is too high at the
beginning and too low at the end, its about 9,000 ft too high at both ends.
 
J

joeu2004

I fit a trendline to some data I have, shown below.
809920.89 1688.50
811026.35 1685.90
812370.50 1680.80
813553.47 1673.80
814357.32 1670.65
the equation given by excel (polynomial 2nd degree fit) is
y=-4e-7x^2+.6676x-267718

however, if you plug the first x point (809920) into this equation, the
value given is over 10,000.
where is the problem?

The "problem" is: if you use -4e-7 verbatim, it is rounded off too
much for the magnitude of your "x" numbers, e.g. 809920.89^2.

If you double-click the textbox that shows the equation in the chart,
you can format all of the coefficients, say to Scientific with 14
decimal places. When I use the more exact coefficient
-4.13596749594073E-07 in the equation above, I get values closer to
your data (e.g. 1677.36 for 809920.89). Substituting all 3
coefficients, I get 1688.70 for 809920.89 -- a difference of only
0.20.

Although that should answer your question, there is much more to say
about your dubious usage of the trendline feature. I'll post again
when I have more time.
 
J

joeu2004

I fit a trendline to some data I have, shown below.
809920.89 1688.50
811026.35 1685.90
812370.50 1680.80
813553.47 1673.80
814357.32 1670.65
the equation given by excel (polynomial 2nd degree fit) is
y=-4e-7x^2+.6676x-267718

however, if you plug the first x point (809920) into this equation, the
value given is over 10,000.
where is the problem?

As I wrote previously, your problem was basically due to not
formatting -4e-7 to sufficient precision for the magnitude of "x"
values that you have.

You would not have encountered the problem in the first place if you
had used LINEST() to compute the coefficients in your worksheet
formula. Refer to
http://j-walk.com/ss///excel/tips/tip101.htm .

However, there is a more fundamental question to ask yourself,
namely: why are you using a 2nd order polynomial to fit your data?

True, that might give you the illusion of the best fit. But the
highest RSQ alone does not necessarily determine the best fit.

For example, a 2nd order polynomial describes a parabolic curve. In
your case, the vertex (maximum point on the curve) is at about
807083.88. If you were to use the characteristic formula to
extrapolate beyond that point, is the trendline truly behaving the way
the data would?

When I look at your data, I see either a linear or logarithmic curve
with a downward slope. That really depends on your interpretation of
the model that produced the data. I just wanted to express some
suspicious that a 2nd order polynomial is really the right answer.
(But it might be.)

On the other hand, if you do not intend to extrapolate beyond the "x"
limits in your data, it might make sense to use the 2nd order
polynomial because it presumably does provide the closest fit within
those limits.
 
D

Dana DeLouis

Hi. Here's an interesting reference:

Chart trendline formula is inaccurate in Excel
http://support.microsoft.com/kb/211967

I don't have the original post, but with this data...
809920.89 1688.5
811026.35 1685.9
812370.50 1680.8
813553.47 1673.8
814357.32 1670

I get:

= 5163.22295426061-0.00428863203403701*x
or
=-317931.918702048+0.791376677921689*x-4.8985630175318E-07*x^2

Assuming of course that these are the appropriate equations.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top